Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: Modernize py/mixed-tuple-returns #19136

Merged
merged 5 commits into from
Apr 1, 2025

Conversation

tausbn
Copy link
Contributor

@tausbn tausbn commented Mar 27, 2025

Removes the dependence on points-to in favour of an approach based on (local) data-flow.

I first tried a version that used type tracking, as this more accurately mimics the behaviour of the old query. However, I soon discovered that there were many false positives in this setup. The main bad pattern I saw was a helper function somewhere deep inside the code that both receives and returns an argument that can be tuples with different sizes and origins. In this case, global flow produces something akin to a cartesian product of "n-tuples that flow into the function" and "m-tuples that flow into the function" where m < n.

To combat this, I decided to instead focus on only flow within a given function (and so local data-flow was sufficient).

Additionally, another class of false positives I saw was cases where the return type actually witnessed that the function in question could return tuples of varying sizes. In this case it seems reasonable to not flag these instances, since they are already (presumably) being checked by a type checker.

More generally, if you've annotated the return type of the function with anything (not just Tuple[...]), then there's probably little need to flag it.

tausbn added 3 commits March 27, 2025 15:27
Removes the dependence on points-to in favour of an approach based on
(local) data-flow.

I first tried a version that used type tracking, as this more accurately
mimics the behaviour of the old query. However, I soon discovered that
there were _many_ false positives in this setup. The main bad pattern I
saw was a helper function somewhere deep inside the code that both
receives and returns an argument that can be tuples with different sizes
and origins. In this case, global flow produces something akin to a
cartesian product of "n-tuples that flow into the function" and
"m-tuples that flow into the function" where m < n.

To combat this, I decided to instead focus on only flow _within_ a given
function (and so local data-flow was sufficient).

Additionally, another class of false positives I saw was cases where the
return type actually witnessed that the function in question could
return tuples of varying sizes. In this case it seems reasonable to not
flag these instances, since they are already (presumably) being checked
by a type checker.

More generally, if you've annotated the return type of the function with
anything (not just `Tuple[...]`), then there's probably little need to
flag it.
As we're no longer tracking tuples across function boundaries, we lose
the result that related to this setup (which, as the preceding commit
explains, lead to a lot of false positives).
@tausbn tausbn marked this pull request as ready for review March 27, 2025 22:20
@Copilot Copilot bot review requested due to automatic review settings March 27, 2025 22:20
@tausbn tausbn requested a review from a team as a code owner March 27, 2025 22:20
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR modernizes the py/mixed-tuple-returns query by removing the dependency on points-to analysis and shifting to a local data-flow approach to reduce false positives.

  • Removed false positives by no longer flagging tuples passed as function arguments.
  • Enhanced handling of functions with annotated return types to avoid unnecessary warnings.
Files not reviewed (2)
  • python/ql/src/Functions/ReturnConsistentTupleSizes.ql: Language not supported
  • python/ql/test/query-tests/Functions/return_values/ReturnConsistentTupleSizes.expected: Language not supported

Tip: Leave feedback on Copilot's review comments with the 👎 and 👍 buttons to help improve review quality. Learn more

joefarebrother
joefarebrother previously approved these changes Mar 28, 2025
Copy link
Contributor

@joefarebrother joefarebrother left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍
Just one minor comment.

@@ -1,2 +1 @@
| functions_test.py:306:1:306:39 | Function returning_different_tuple_sizes | returning_different_tuple_sizes returns $@ and $@. | functions_test.py:308:16:308:18 | Tuple | tuple of size 2 | functions_test.py:310:16:310:20 | Tuple | tuple of size 3 |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might like to put some comments in the tests explaining the expected results.
(in particular that this case no longer gives an alert)

Copy link
Contributor Author

@tausbn tausbn Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I have done so in 6674288.

Adds a comment explaining why we no longer flag the indirect tuple
example.
Also adds a test case which _would_ be flagged if not for the type
annotation.
@tausbn tausbn merged commit aacdc70 into main Apr 1, 2025
15 checks passed
@tausbn tausbn deleted the tausbn/python-modernise-mixed-tuple-returns-query branch April 1, 2025 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants