-
Notifications
You must be signed in to change notification settings - Fork 125
Add taint-tracking for archive/tar and archive/zip #251
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few thoughts about naming. I'm a bit unsure about some of the taint steps; see inline comments. I'll also kick off an evaluation to see what it does for performance and results.
Thanks for the review, @max-schaefer I updated the naming, and the taint logic. |
@@ -0,0 +1,210 @@ | |||
// WARNING: This file was automatically generated. DO NOT EDIT. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to make these tests a little less verbose?
I'm also growing increasingly fond of tests with inline annotations where the expected output of the test is in some way encoded in the test itself. For example, you are already using source
and sink
functions to model expected sources and sinks. If you give them an additional integer argument, you could establish the convention that taint from source(n)
should flow to sink(n, ...)
. The test query would then simply check that there is such flow and no other flow from source(n)
to an unrelated sink(n', ...)
.
This has the advantage that you can see the expected test output directly from the test itself and don't have to refer back and forth between the .go
file and the .expected
file. In fact, the .expected
file would be completely empty.
You can see an example of this style in our call graph tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to make these tests a little less verbose?
Comments can be removed.
This has the advantage that you can see the expected test output directly from the test itself and don't have to refer back and forth between the .go file and the .expected file. In fact, the .expected file would be completely empty.
Does that mean that tests would only have the results that don't have the flow from a source(n)
to a sink(n,...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's very cool. I'll implement that. I'm very fond of automation that removes manual steps.
Evaluation showed no changes in performance or results. |
@@ -0,0 +1,105 @@ | |||
// WARNING: This file was automatically generated. DO NOT EDIT. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generated Go code should have a comment that matches ^// Code generated .* DO NOT EDIT\.$
as documented here: https://golang.org/pkg/cmd/go/internal/generate/; could you change the comment accordingly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also perhaps mention in the comment how they were autogenerated, so I know where to go if I want to alter them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll mention the repo. Then, I think the README file of the repo should be enough to reproduce.
For the taint-tracking of the packages in this PR, these are the commands:
codebox --out-dir=./generated/latest --pkg=/usr/local/go/src/archive/tar
# and
codebox --out-dir=./generated/latest --pkg=/usr/local/go/src/archive/zip
If you want to edit the taint logic, add --http
and then go to http://127.0.0.1:8080/
codebox --out-dir=./generated/latest --pkg=/usr/local/go/src/archive/tar --http
# and
codebox --out-dir=./generated/latest --pkg=/usr/local/go/src/archive/zip --http
source := newSource(0) | ||
out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source) | ||
sink(0, out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make the tests cleaner to just inline these functions; is there a reason you haven't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean, exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you have separate functions for each step? It seems like you could just write the step itself into these blocks, like so:
{
source = newSource(0)
var intoWriter881 tar.Writer
intoWriter881.WriteHeader(source)
sink(intoWriter881)
}
It would be much more compact and probably more informative when a test fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's definitely possible, but I prefer named functions; they keep things clean and ordered, and I can easily understand where and for what each function was generated (see the naming suffix).
Also, less visual fatigue in case of a review, not to mention a lot less scrolling in case of a 100+ test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll draft an implentation of that tomorrow, and share the results of what it looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you still looking into this? I'd be fine with leaving the tests as is (though I don't necessarily agree with your reasoning; the generated function names do not look very pretty, and I find a list of 100+ functions at least as tiresome to review as a list of 100+ code blocks).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative suggestion:
source := newSource(0) | |
out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source) | |
sink(0, out) | |
sink(0, TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(newSource(0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might then even be able to rename newSource
to source
, which would be nicely parallel to sink
.
Re: the CI failure, note we now have autoformatting for Go as well as QL. After a rebase, try |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Low-confidence review: lgtm
The issue was a go file I edited from the browser on GH. Fixed and pushed, but github.com seems to be having system issues. |
Yeah, @gagliardetto you can get a notification when it's fixed from https://www.githubstatus.com/ |
3a7a5a5
to
9cd86f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically LGTM, modulo an optional suggestion.
source := newSource(0) | ||
out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source) | ||
sink(0, out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you still looking into this? I'd be fine with leaving the tests as is (though I don't necessarily agree with your reasoning; the generated function names do not look very pretty, and I find a list of 100+ functions at least as tiresome to review as a list of 100+ code blocks).
source := newSource(0) | ||
out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source) | ||
sink(0, out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative suggestion:
source := newSource(0) | |
out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source) | |
sink(0, out) | |
sink(0, TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(newSource(0)) |
source := newSource(0) | ||
out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source) | ||
sink(0, out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might then even be able to rename newSource
to source
, which would be nicely parallel to sink
.
I'm short on time to implement the modified code generation. I suggest moving on with the PRs, and then, at a certain point, I'll open a PR with the updated golang test code for the previous merged PRs. |
OK, in that case I'll merge. As always, thanks very much for your contribution! |
Part of #167