-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Rust: Lift content reads as taint steps #20879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
b5453a5 to
5635afc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR enhances Rust taint tracking by lifting content reads as taint steps, enabling automatic taint propagation when reading from tainted values (e.g., foo.bar inherits taint from foo). The implementation filters out small primitive types (numerics, booleans, characters) to avoid spurious results in injection queries.
Key changes:
- Added logic to propagate taint through
readContentStepoperations with type-based filtering - Simplified actix-web models by removing redundant field-specific taint summaries
- Updated test expectations to reflect newly detected taint flows
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
rust/ql/lib/codeql/rust/dataflow/internal/TaintTrackingImpl.qll |
Implements taint propagation through read operations with primitive type filtering |
rust/ql/lib/codeql/rust/frameworks/actix-web.model.yml |
Removes redundant field-specific taint models that are now handled automatically |
rust/ql/test/library-tests/dataflow/sources/web_frameworks/test.rs |
Updates test annotations to reflect newly detected taint flows in web framework handlers |
rust/ql/test/library-tests/dataflow/sources/web_frameworks/InlineFlow.expected |
Updates expected flow analysis results with new taint propagation edges |
rust/ql/test/library-tests/dataflow/sources/net/test.rs |
Updates network test annotations for newly detected flows |
rust/ql/test/library-tests/dataflow/sources/net/InlineFlow.expected |
Updates expected network flow results |
rust/ql/test/library-tests/dataflow/sources/file/InlineFlow.expected |
Updates expected file I/O flow results |
rust/ql/test/library-tests/dataflow/sources/env/test.rs |
Updates environment variable test annotations |
rust/ql/test/library-tests/dataflow/sources/env/InlineFlow.expected |
Updates expected environment flow results |
rust/ql/test/library-tests/dataflow/sources/database/test.rs |
Updates database test annotations |
rust/ql/test/library-tests/dataflow/sources/database/InlineFlow.expected |
Updates expected database flow results |
rust/ql/test/query-tests/security/CWE-825/AccessAfterLifetime.expected |
Updates lifetime analysis test expectations |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
hvitved
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance doesn't look good on rust.
e66a251 to
28a2b60
Compare
Is that because we get unwanted results, or because we get better performance without considering these? |
The idea is that it should achieve both and be appropriate for most queries. We want to avoid cases where some value I think this can always be tweaked on a per query basis by using |
28a2b60 to
89d736d
Compare
We might want to add barriers to the two pointer queries, perhaps excluding flow through non-pointer/reference types or something to that effect. But I haven't finished looking into the result changes yet. |
89d736d to
957a5fd
Compare
957a5fd to
5d6e52a
Compare
geoffw0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its great that we're getting a good number of new results here in both the tests and DCA projects. Test changes look fantastic. I've only analyzed a handful of the new DCA results (because its quite time consuming to do so in detail), but I didn't spot any issues with the new flow from this PR in those either. I would appreciate if someone would look through a few more of them.
QL changes LGTM aside from a few nits (below).
How is performance now?
Definitely needs a change note.
| * Holds if the field `field` should, by default, be excluded from taint steps. | ||
| * The syntax used to denote the field is the same as for `Field` in | ||
| * models-as-data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be slightly more specific about what we mean here, perhaps something like:
| * Holds if the field `field` should, by default, be excluded from taint steps. | |
| * The syntax used to denote the field is the same as for `Field` in | |
| * models-as-data. | |
| * Holds if the field `field` should, by default, be excluded from taint steps | |
| * from the containing type to reads of the field. The models-as-data syntax | |
| * used to denote the field is the same as for `Field[]` access path elements. |
| // is tainted and an operation reads from `foo` (e.g., `foo.bar`) then | ||
| // taint is propagated. We limit this to not apply if the type of the | ||
| // operation is a small primitive type as these are often uninteresting | ||
| // (for instance in the case of an injection query). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not completely convinced we should apply this limitation here, rather than adding barriers for these types to all injection queries. I haven't seen the kinds of results you're talking about though, other than in the pseudocode example you gave on this PR discussion. Perhaps we could plan to look into this as follow-up, since doing this would mostly add even more flow to what we're already doing here.
| s instanceof Builtins::Bool or | ||
| s instanceof Builtins::Char | ||
| ) | ||
| ) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The forex means that we will only get a result if type inference succeeds on succ. Is this on purpose?
| ) | ||
| ) and | ||
| not excludedTaintStepContent(c) and | ||
| not TypeInference::inferType(succ.asExpr()).(Type::EnumType).getEnum().isFieldless() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could move this line into the forex and reduce it to:
and
not t.(Type::EnumType).getEnum().isFieldless()
| ) | ||
| or | ||
| // Let all read steps (including those from flow summaries and those that | ||
| // result in small primitive types) give rise to taint steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've lost some context from the original comment here, I think.
Let read steps give rise to taint steps. This has the effect that if
foois tainted and an operation reads fromfoo(e.g.,foo.bar) then taint is propagated.We limit this to not apply if the type of the operation is a small primitive type as these are often uninteresting (for instance in the case of an injection query).
This PR lifts
readContentStepinstead ofreadStep. The latter subsumes the former and additionally includes reads from flow summaries. Doing the type based restriction for these wasn't completely trivial and including them without such a restriction caused spurious results in some of the tests. If anyone has an idea on how to do the type restriction for those, then we can do that in follow up work.