Rust: Lift content reads as taint steps #20879

paldepind · 2025-11-20T16:13:55Z

Let read steps give rise to taint steps. As a result, if foo is tainted and an operation reads from foo (e.g., foo.bar), taint propagates to the result of the read.

We limit this to not apply if the type of the operation is a small primitive type as these are often uninteresting (for instance in the case of an injection query).
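The rule described above can be sketched as a small toy model in Rust. This is only an illustration, not the QL implementation in this PR: the `Ty` enum and both function names are hypothetical, and the set of "small primitive" types is assumed to be numerics, booleans and characters.

```rust
/// Hypothetical, simplified stand-in for the type of a read operation.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    /// Any numeric type, e.g. `usize`, `i32`, `f64`.
    Numeric,
    Bool,
    Char,
    /// Anything else, e.g. `String`, `Vec<u8>`, or a user-defined struct.
    Other(String),
}

/// Assumed definition of "small primitive": numerics, booleans and characters.
pub fn is_small_primitive(ty: &Ty) -> bool {
    matches!(ty, Ty::Numeric | Ty::Bool | Ty::Char)
}

/// A content read such as `foo.bar` is treated as a taint step only when
/// the base is tainted and the type of the read is not a small primitive.
pub fn read_is_tainted(base_tainted: bool, result_ty: &Ty) -> bool {
    base_tainted && !is_small_primitive(result_ty)
}
```

In this model, reading a `String` field from a tainted value yields a tainted result, while reading a `usize` field from the same value does not.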

This PR lifts readContentStep instead of readStep. The latter subsumes the former and additionally includes reads from flow summaries. Doing the type-based restriction for these wasn't completely trivial, and including them without such a restriction caused spurious results in some of the tests. If anyone has an idea for how to do the type restriction for those, we can do that in follow-up work.

Copilot

Pull request overview

This PR enhances Rust taint tracking by lifting content reads as taint steps, enabling automatic taint propagation when reading from tainted values (e.g., foo.bar inherits taint from foo). The implementation filters out small primitive types (numerics, booleans, characters) to avoid spurious results in injection queries.

Key changes:

- Added logic to propagate taint through readContentStep operations with type-based filtering
- Simplified actix-web models by removing redundant field-specific taint summaries
- Updated test expectations to reflect newly detected taint flows

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated no comments.


| File | Description |
| --- | --- |
| `rust/ql/lib/codeql/rust/dataflow/internal/TaintTrackingImpl.qll` | Implements taint propagation through read operations with primitive type filtering |
| `rust/ql/lib/codeql/rust/frameworks/actix-web.model.yml` | Removes redundant field-specific taint models that are now handled automatically |
| `rust/ql/test/library-tests/dataflow/sources/web_frameworks/test.rs` | Updates test annotations to reflect newly detected taint flows in web framework handlers |
| `rust/ql/test/library-tests/dataflow/sources/web_frameworks/InlineFlow.expected` | Updates expected flow analysis results with new taint propagation edges |
| `rust/ql/test/library-tests/dataflow/sources/net/test.rs` | Updates network test annotations for newly detected flows |
| `rust/ql/test/library-tests/dataflow/sources/net/InlineFlow.expected` | Updates expected network flow results |
| `rust/ql/test/library-tests/dataflow/sources/file/InlineFlow.expected` | Updates expected file I/O flow results |
| `rust/ql/test/library-tests/dataflow/sources/env/test.rs` | Updates environment variable test annotations |
| `rust/ql/test/library-tests/dataflow/sources/env/InlineFlow.expected` | Updates expected environment flow results |
| `rust/ql/test/library-tests/dataflow/sources/database/test.rs` | Updates database test annotations |
| `rust/ql/test/library-tests/dataflow/sources/database/InlineFlow.expected` | Updates expected database flow results |
| `rust/ql/test/query-tests/security/CWE-825/AccessAfterLifetime.expected` | Updates lifetime analysis test expectations |


hvitved

Performance doesn't look good on Rust.

geoffw0 · 2025-11-25T17:37:16Z

> We limit this to not apply if the type of the operation is a small primitive type as these are often uninteresting (for instance in the case of an injection query).

Is that because we get unwanted results, or because we get better performance without considering these?

paldepind · 2025-11-26T08:44:23Z

> Is that because we get unwanted results, or because we get better performance without considering these?

The idea is that it should achieve both and be appropriate for most queries.

We want to avoid cases where some value foo of type Foo { size: usize, dangerous_content: T } is tainted, which leads to foo.size being tainted, so that we end up producing results in queries where that makes no sense.

I think this can always be tweaked on a per query basis by using isAdditionalFlowStep and barriers. So it's actually more about finding a sensible default for most queries.
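To make the Foo example concrete, here is a minimal Rust sketch of the kind of code being analyzed. The `tainted_foo` and `read_fields` functions are hypothetical and exist only for illustration; the field name is spelled `dangerous_content` here, and the comments describe the behavior intended by the restriction rather than verified analysis output.

```rust
/// The struct from the example, instantiated below with `T = String`.
pub struct Foo<T> {
    pub size: usize,
    pub dangerous_content: T,
}

/// Hypothetical taint source: pretend the returned value is attacker-controlled.
pub fn tainted_foo() -> Foo<String> {
    let content = String::from("'; DROP TABLE users; --");
    Foo {
        size: content.len(),
        dangerous_content: content,
    }
}

/// Reads both fields of a (conceptually tainted) `Foo`.
pub fn read_fields(foo: &Foo<String>) -> (usize, &str) {
    // `foo.size` has type `usize`, a small primitive, so under the
    // restriction this read is not meant to carry taint.
    let size = foo.size;
    // `foo.dangerous_content` is a `String`, so this read is the kind
    // that the PR intends to treat as a taint step.
    let content = foo.dangerous_content.as_str();
    (size, content)
}
```

An injection query would then be expected to flag uses of `content` but not uses of `size`, which is the default this comment argues for.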

geoffw0 · 2025-11-26T13:58:23Z

> I think this can always be tweaked on a per query basis by using isAdditionalFlowStep and barriers.

We might want to add barriers to the two pointer queries, perhaps excluding flow through non-pointer/reference types or something to that effect. But I haven't finished looking into the result changes yet.

…enum types

geoffw0

It's great that we're getting a good number of new results here in both the tests and DCA projects. Test changes look fantastic. I've only analyzed a handful of the new DCA results (because it's quite time-consuming to do so in detail), but I didn't spot any issues with the new flow from this PR in those either. I would appreciate it if someone would look through a few more of them.

QL changes LGTM aside from a few nits (below).

How is performance now?

Definitely needs a change note.

geoffw0 · 2025-11-27T15:45:10Z