Add taint-tracking for archive/tar and archive/zip #251

gagliardetto · 2020-07-08T09:28:26Z

Part of #167

max-schaefer

A few thoughts about naming. I'm a bit unsure about some of the taint steps; see inline comments. I'll also kick off an evaluation to see what it does for performance and results.

ql/src/semmle/go/frameworks/stdlib/ArchiveZipTaintTracking.qll

ql/src/semmle/go/frameworks/Stdlib.qll

ql/src/semmle/go/frameworks/stdlib/ArchiveTarTaintTracking.qll

gagliardetto · 2020-07-10T10:39:35Z

Thanks for the review, @max-schaefer

I updated the naming, and the taint logic.

max-schaefer · 2020-07-10T14:49:02Z

ql/test/library-tests/semmle/go/frameworks/StdlibTaintFlow/ArchiveTar.go

@@ -0,0 +1,210 @@
+// WARNING: This file was automatically generated. DO NOT EDIT.


Is there any way to make these tests a little less verbose?

I'm also growing increasingly fond of tests with inline annotations where the expected output of the test is in some way encoded in the test itself. For example, you are already using source and sink functions to model expected sources and sinks. If you give them an additional integer argument, you could establish the convention that taint from source(n) should flow to sink(n, ...). The test query would then simply check that there is such flow and no other flow from source(n) to an unrelated sink(n', ...).

This has the advantage that you can see the expected test output directly from the test itself and don't have to refer back and forth between the .go file and the .expected file. In fact, the .expected file would be completely empty.

You can see an example of this style in our call graph tests.

> Is there any way to make these tests a little less verbose?

Comments can be removed.

> This has the advantage that you can see the expected test output directly from the test itself and don't have to refer back and forth between the .go file and the .expected file. In fact, the .expected file would be completely empty.

Does that mean the .expected file would only list results that are not a flow from source(n) to the matching sink(n, ...)?

That's very cool. I'll implement that. I'm very fond of automation that removes manual steps.
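The numbered source/sink convention discussed in this thread can be sketched in plain Go. This is only an illustration of the check the test query would perform; the real check would be written in QL, and the `flow` type and `checkFlows` helper are hypothetical names introduced here, not part of the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// flow is a hypothetical record of one observed taint flow:
// taint from source(Source) reached sink(Sink, ...).
type flow struct {
	Source, Sink int
}

// checkFlows returns one message per violation of the convention:
// every source(n) must reach sink(n, ...), and no source(n) may
// reach a sink(m, ...) with m != n.
func checkFlows(sources []int, observed []flow) []string {
	seen := make(map[flow]bool)
	var problems []string
	for _, f := range observed {
		seen[f] = true
		if f.Source != f.Sink {
			problems = append(problems,
				fmt.Sprintf("unexpected flow: source(%d) -> sink(%d)", f.Source, f.Sink))
		}
	}
	for _, n := range sources {
		if !seen[flow{n, n}] {
			problems = append(problems,
				fmt.Sprintf("missing flow: source(%d) -> sink(%d)", n, n))
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	sources := []int{0, 1, 2}
	observed := []flow{{0, 0}, {1, 1}, {1, 2}}
	for _, p := range checkFlows(sources, observed) {
		fmt.Println(p)
	}
}
```

When every source(n) reaches exactly its own sink(n, ...), `checkFlows` returns nothing, which corresponds to the empty .expected file mentioned in the review.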

max-schaefer · 2020-07-10T14:49:29Z

Evaluation showed no changes in performance or results.

sauyon · 2020-07-13T14:40:13Z

ql/test/library-tests/semmle/go/frameworks/StdlibTaintFlow/ArchiveTar.go

@@ -0,0 +1,105 @@
+// WARNING: This file was automatically generated. DO NOT EDIT.


Generated Go code should have a comment that matches ^// Code generated .* DO NOT EDIT\.$ as documented here: https://golang.org/pkg/cmd/go/internal/generate/; could you change the comment accordingly?

Also perhaps mention in the comment how they were autogenerated, so I know where to go if I want to alter them?

That's a great idea.

I'll mention the repo. Then, I think the README file of the repo should be enough to reproduce.

For the taint-tracking of the packages in this PR, these are the commands:

codebox --out-dir=./generated/latest --pkg=/usr/local/go/src/archive/tar
# and
codebox --out-dir=./generated/latest --pkg=/usr/local/go/src/archive/zip

If you want to edit the taint logic, add --http and then go to http://127.0.0.1:8080/

codebox --out-dir=./generated/latest --pkg=/usr/local/go/src/archive/tar --http
# and
codebox --out-dir=./generated/latest --pkg=/usr/local/go/src/archive/zip --http
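The generated-code marker requested in this thread can be checked mechanically. A minimal sketch: the regular expression is the one quoted in the review comment, while the helper name `isGeneratedHeader` and the sample "by codebox" header line are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// generatedRe is the pattern quoted in the review: a line comment that
// starts with "// Code generated " and ends with " DO NOT EDIT.".
var generatedRe = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGeneratedHeader (hypothetical helper) reports whether a single line
// matches the generated-code convention.
func isGeneratedHeader(line string) bool {
	return generatedRe.MatchString(line)
}

func main() {
	lines := []string{
		// Header used in the first version of the PR.
		"// WARNING: This file was automatically generated. DO NOT EDIT.",
		// Illustrative header in the requested format (wording assumed).
		"// Code generated by codebox. DO NOT EDIT.",
	}
	for _, l := range lines {
		fmt.Printf("%v\t%s\n", isGeneratedHeader(l), l)
	}
}
```

The original WARNING line does not match the pattern, which is why the reviewer asked for the comment to be changed.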

sauyon · 2020-07-13T14:40:58Z

ql/test/library-tests/semmle/go/frameworks/StdlibTaintFlow/ArchiveTar.go

+		source := newSource(0)
+		out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source)
+		sink(0, out)


I think it would make the tests cleaner to just inline these functions; is there a reason you haven't?

What do you mean, exactly?

Why do you have separate functions for each step? It seems like you could just write the step itself into these blocks, like so:

	{
		source = newSource(0)
		var intoWriter881 tar.Writer
		intoWriter881.WriteHeader(source)
		sink(intoWriter881)
	}

It would be much more compact and probably more informative when a test fails.

That's definitely possible, but I prefer named functions; they keep things clean and ordered, and I can easily understand where and for what each function was generated (see the naming suffix).

Also, there is less visual fatigue during a review, not to mention a lot less scrolling with 100+ test cases.

I'll draft an implementation of that tomorrow and share what the results look like.

smowton · 2020-07-15T15:15:23Z

Re: the CI failure, note we now have autoformatting for Go as well as QL. After a rebase, try make autoformat in the root of the repo

smowton

Low-confidence review: lgtm

gagliardetto · 2020-07-15T16:18:47Z

> Re: the CI failure, note we now have autoformatting for Go as well as QL. After a rebase, try make autoformat in the root of the repo

The issue was a Go file I edited from the browser on GitHub.

Fixed and pushed, but github.com seems to be having system issues.

smowton · 2020-07-15T16:43:36Z

Yeah, @gagliardetto you can get a notification when it's fixed from https://www.githubstatus.com/

max-schaefer

Basically LGTM, modulo an optional suggestion.

max-schaefer · 2020-07-27T13:43:23Z

ql/test/library-tests/semmle/go/frameworks/StdlibTaintFlow/ArchiveTar.go

+		source := newSource(0)
+		out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source)
+		sink(0, out)


Are you still looking into this? I'd be fine with leaving the tests as is (though I don't necessarily agree with your reasoning; the generated function names do not look very pretty, and I find a list of 100+ functions at least as tiresome to review as a list of 100+ code blocks).

max-schaefer · 2020-07-27T13:44:25Z

ql/test/library-tests/semmle/go/frameworks/StdlibTaintFlow/ArchiveTar.go

+		source := newSource(0)
+		out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source)
+		sink(0, out)


Alternative suggestion:

Suggested change

-		source := newSource(0)
-		out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source)
-		sink(0, out)
+		sink(0, TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(newSource(0)))

max-schaefer · 2020-07-27T13:46:34Z

ql/test/library-tests/semmle/go/frameworks/StdlibTaintFlow/ArchiveTar.go

+		source := newSource(0)
+		out := TaintStepTest_ArchiveTarFileInfoHeader_B0I0O0(source)
+		sink(0, out)


You might then even be able to rename newSource to source, which would be nicely parallel to sink.

gagliardetto · 2020-07-28T10:44:49Z

> Are you still looking into this? I'd be fine with leaving the tests as is (though I don't necessarily agree with your reasoning; the generated function names do not look very pretty, and I find a list of 100+ functions at least as tiresome to review as a list of 100+ code blocks).

I'm short on time to implement the modified code generation. I suggest moving on with the PRs; at some point I'll open a PR with the updated Go test code for the previously merged PRs.

max-schaefer · 2020-07-28T13:26:57Z

OK, in that case I'll merge. As always, thanks very much for your contribution!

gagliardetto mentioned this pull request Jul 8, 2020

Expand Taint-Tracking to include 67 std-lib packages (with tests) #167

Closed

max-schaefer suggested changes Jul 9, 2020

max-schaefer reviewed Jul 10, 2020

sauyon reviewed Jul 13, 2020

smowton previously approved these changes Jul 15, 2020

gagliardetto added 7 commits July 15, 2020 19:05

Add taint-tracking for archive/tar and archive/zip

19287fb

Add StdlibTaintFlow.expected

5b63228

Implement code review feedback

1591ed3

Simplify tests

19348d2

Update main.go

f7a03c0

Generated Go files: add what they were generated with

9cd86f9

Fix go autoformat

437f4b7

gagliardetto dismissed smowton’s stale review via 437f4b7 July 15, 2020 16:40

gagliardetto force-pushed the standard-lib-pt-1 branch from 3a7a5a5 to 9cd86f9 Compare July 15, 2020 16:49

max-schaefer approved these changes Jul 27, 2020

max-schaefer merged commit e9ae697 into github:master Jul 28, 2020

owen-mc added a commit to owen-mc/codeql-go that referenced this pull request Aug 26, 2020

Add change note for github#251

ad6c94e

gagliardetto mentioned this pull request Sep 2, 2020

Add taint-tracking for reflect package #317

Merged
