@paldepind
Contributor

@paldepind paldepind commented Nov 20, 2025

Let read steps give rise to taint steps. This has the effect that if `foo` is tainted and an operation reads from `foo` (e.g., `foo.bar`), then taint is propagated.

We limit this to not apply if the type of the operation is a small primitive type as these are often uninteresting (for instance in the case of an injection query).

This PR lifts `readContentStep` instead of `readStep`. The latter subsumes the former and additionally includes reads from flow summaries. Doing the type-based restriction for these wasn't completely trivial, and including them without such a restriction caused spurious results in some of the tests. If anyone has an idea on how to do the type restriction for those, then we can do that in follow-up work.

@github-actions github-actions bot added the Rust Pull requests that update Rust code label Nov 20, 2025
@paldepind paldepind force-pushed the rust/reads-as-taint branch 5 times, most recently from b5453a5 to 5635afc Compare November 21, 2025 14:08
@paldepind paldepind marked this pull request as ready for review November 21, 2025 14:10
@paldepind paldepind requested a review from a team as a code owner November 21, 2025 14:10
Copilot AI review requested due to automatic review settings November 21, 2025 14:10
Copilot finished reviewing on behalf of paldepind November 21, 2025 14:12
Contributor

Copilot AI left a comment

Pull request overview

This PR enhances Rust taint tracking by lifting content reads as taint steps, enabling automatic taint propagation when reading from tainted values (e.g., foo.bar inherits taint from foo). The implementation filters out small primitive types (numerics, booleans, characters) to avoid spurious results in injection queries.

Key changes:

  • Added logic to propagate taint through readContentStep operations with type-based filtering
  • Simplified actix-web models by removing redundant field-specific taint summaries
  • Updated test expectations to reflect newly detected taint flows

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `rust/ql/lib/codeql/rust/dataflow/internal/TaintTrackingImpl.qll` | Implements taint propagation through read operations with primitive type filtering |
| `rust/ql/lib/codeql/rust/frameworks/actix-web.model.yml` | Removes redundant field-specific taint models that are now handled automatically |
| `rust/ql/test/library-tests/dataflow/sources/web_frameworks/test.rs` | Updates test annotations to reflect newly detected taint flows in web framework handlers |
| `rust/ql/test/library-tests/dataflow/sources/web_frameworks/InlineFlow.expected` | Updates expected flow analysis results with new taint propagation edges |
| `rust/ql/test/library-tests/dataflow/sources/net/test.rs` | Updates network test annotations for newly detected flows |
| `rust/ql/test/library-tests/dataflow/sources/net/InlineFlow.expected` | Updates expected network flow results |
| `rust/ql/test/library-tests/dataflow/sources/file/InlineFlow.expected` | Updates expected file I/O flow results |
| `rust/ql/test/library-tests/dataflow/sources/env/test.rs` | Updates environment variable test annotations |
| `rust/ql/test/library-tests/dataflow/sources/env/InlineFlow.expected` | Updates expected environment flow results |
| `rust/ql/test/library-tests/dataflow/sources/database/test.rs` | Updates database test annotations |
| `rust/ql/test/library-tests/dataflow/sources/database/InlineFlow.expected` | Updates expected database flow results |
| `rust/ql/test/query-tests/security/CWE-825/AccessAfterLifetime.expected` | Updates lifetime analysis test expectations |

Contributor

@hvitved hvitved left a comment

Performance doesn't look good on rust.

@geoffw0
Contributor

geoffw0 commented Nov 25, 2025

> We limit this to not apply if the type of the operation is a small primitive type as these are often uninteresting (for instance in the case of an injection query).

Is that because we get unwanted results, or because we get better performance without considering these?

@paldepind
Contributor Author

> Is that because we get unwanted results, or because we get better performance without considering these?

The idea is that it should achieve both and be appropriate for most queries.

We want to avoid cases where some value `foo` of type `Foo { size: usize, dangerous_content: T }` is tainted, which leads to `foo.size` being tainted, and we end up producing results in queries where it makes no sense.

I think this can always be tweaked on a per query basis by using isAdditionalFlowStep and barriers. So it's actually more about finding a sensible default for most queries.
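
As a rough illustration of the case described above, here is a minimal Rust sketch. The names `Foo`, `untrusted_input`, and `read_fields` are hypothetical and exist only for this example, and the comments describe the intended default behaviour as stated in this PR rather than verified analysis output.

```rust
/// Hypothetical struct mirroring the example in the comment above.
struct Foo<T> {
    size: usize,
    dangerous_content: T,
}

/// Hypothetical stand-in for a taint source (e.g., data from a request).
fn untrusted_input() -> Foo<String> {
    let content = String::from("attacker-controlled text");
    Foo {
        size: content.len(),
        dangerous_content: content,
    }
}

/// Reads both fields of a value that is assumed to be tainted as a whole.
fn read_fields() -> (usize, String) {
    let foo = untrusted_input(); // assume `foo` is tainted

    // Read producing a `usize`: a small primitive type, so by default the
    // read is not intended to give rise to a taint step.
    let size = foo.size;

    // Read producing a `String`: not a small primitive type, so the read
    // is intended to propagate taint from `foo` to `content`.
    let content = foo.dangerous_content;

    (size, content)
}
```

With the default described in this PR, an injection query would be expected to report flows into uses of `content` but not into uses of `size`.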

@geoffw0
Contributor

geoffw0 commented Nov 26, 2025

> I think this can always be tweaked on a per query basis by using isAdditionalFlowStep and barriers.

We might want to add barriers to the two pointer queries, perhaps excluding flow through non-pointer/reference types or something to that effect. But I haven't finished looking into the result changes yet.

Contributor

@geoffw0 geoffw0 left a comment

It's great that we're getting a good number of new results here in both the tests and DCA projects. Test changes look fantastic. I've only analyzed a handful of the new DCA results (because it's quite time-consuming to do so in detail), but I didn't spot any issues with the new flow from this PR in those either. I would appreciate it if someone would look through a few more of them.

QL changes LGTM aside from a few nits (below).

How is performance now?

Definitely needs a change note.

Comment on lines +15 to +17
* Holds if the field `field` should, by default, be excluded from taint steps.
* The syntax used to denote the field is the same as for `Field` in
* models-as-data.
Contributor

I think we should be slightly more specific about what we mean here, perhaps something like:

Suggested change
- * Holds if the field `field` should, by default, be excluded from taint steps.
- * The syntax used to denote the field is the same as for `Field` in
- * models-as-data.
+ * Holds if the field `field` should, by default, be excluded from taint steps
+ * from the containing type to reads of the field. The models-as-data syntax
+ * used to denote the field is the same as for `Field[]` access path elements.

// is tainted and an operation reads from `foo` (e.g., `foo.bar`) then
// taint is propagated. We limit this to not apply if the type of the
// operation is a small primitive type as these are often uninteresting
// (for instance in the case of an injection query).
Contributor

I'm not completely convinced we should apply this limitation here, rather than adding barriers for these types to all injection queries. I haven't seen the kinds of results you're talking about though, other than in the pseudocode example you gave on this PR discussion. Perhaps we could plan to look into this as follow-up, since doing this would mostly add even more flow to what we're already doing here.

s instanceof Builtins::Bool or
s instanceof Builtins::Char
)
) and
Contributor

The `forex` means that we will only get a result if type inference succeeds on `succ`. Is this on purpose?
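
For readers less familiar with QL: `forex` combines `forall` and `exists`, so it holds only if at least one value satisfies the range and every such value satisfies the condition. The following toy model in Rust sketches that behaviour for the type filter. `InferredType` and `read_gives_taint_step` are hypothetical names invented for this illustration and are not part of the CodeQL library.

```rust
/// Toy stand-in for an inferred type; not CodeQL's actual `Type` class.
#[derive(Clone, Copy, Debug, PartialEq)]
enum InferredType {
    Numeric,
    Bool,
    Char,
    Other,
}

impl InferredType {
    /// The "small primitive" types that the filter excludes.
    fn is_small_primitive(self) -> bool {
        matches!(
            self,
            InferredType::Numeric | InferredType::Bool | InferredType::Char
        )
    }
}

/// Models `forex` semantics: at least one type must have been inferred
/// (the `exists` part), and every inferred type must pass the filter
/// (the `forall` part).
fn read_gives_taint_step(inferred: &[InferredType]) -> bool {
    !inferred.is_empty() && inferred.iter().all(|t| !t.is_small_primitive())
}
```

In this model an empty slice stands for failed type inference, and it yields no taint step, which is the behaviour the question above is about. A plain `forall` would instead hold vacuously in that case.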

)
) and
not excludedTaintStepContent(c) and
not TypeInference::inferType(succ.asExpr()).(Type::EnumType).getEnum().isFieldless()
Contributor

You could move this line into the `forex` and reduce it to:

and
not t.(Type::EnumType).getEnum().isFieldless()

)
or
// Let all read steps (including those from flow summaries and those that
// result in small primitive types) give rise to taint steps.
Contributor

You've lost some context from the original comment here, I think.
