-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: Modernize py/mixed-tuple-returns
#19136
Python: Modernize py/mixed-tuple-returns
#19136
Conversation
Removes the dependence on points-to in favour of an approach based on (local) data-flow. I first tried a version that used type tracking, as this more accurately mimics the behaviour of the old query. However, I soon discovered that there were _many_ false positives in this setup. The main bad pattern I saw was a helper function somewhere deep inside the code that both receives and returns an argument that can be tuples with different sizes and origins. In this case, global flow produces something akin to a cartesian product of "n-tuples that flow into the function" and "m-tuples that flow into the function" where m < n. To combat this, I decided to instead focus on only flow _within_ a given function (and so local data-flow was sufficient). Additionally, another class of false positives I saw was cases where the return type actually witnessed that the function in question could return tuples of varying sizes. In this case it seems reasonable to not flag these instances, since they are already (presumably) being checked by a type checker. More generally, if you've annotated the return type of the function with anything (not just `Tuple[...]`), then there's probably little need to flag it.
As we're no longer tracking tuples across function boundaries, we lose the result that related to this setup (which, as the preceding commit explains, lead to a lot of false positives).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR modernizes the py/mixed-tuple-returns query by removing the dependency on points-to analysis and shifting to a local data-flow approach to reduce false positives.
- Removed false positives by no longer flagging tuples passed as function arguments.
- Enhanced handling of functions with annotated return types to avoid unnecessary warnings.
Files not reviewed (2)
- python/ql/src/Functions/ReturnConsistentTupleSizes.ql: Language not supported
- python/ql/test/query-tests/Functions/return_values/ReturnConsistentTupleSizes.expected: Language not supported
Tip: Leave feedback on Copilot's review comments with the 👎 and 👍 buttons to help improve review quality. Learn more
python/ql/src/change-notes/2025-03-27-modernize-mixed-tuple-returns-query.md
Outdated
Show resolved
Hide resolved
python/ql/src/change-notes/2025-03-27-modernize-mixed-tuple-returns-query.md
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍
Just one minor comment.
@@ -1,2 +1 @@ | |||
| functions_test.py:306:1:306:39 | Function returning_different_tuple_sizes | returning_different_tuple_sizes returns $@ and $@. | functions_test.py:308:16:308:18 | Tuple | tuple of size 2 | functions_test.py:310:16:310:20 | Tuple | tuple of size 3 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might like to put some comments in the tests explaining the expected results.
(in particular that this case no longer gives an alert)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I have done so in 6674288.
Adds a comment explaining why we no longer flag the indirect tuple example. Also adds a test case which _would_ be flagged if not for the type annotation.
Removes the dependence on points-to in favour of an approach based on (local) data-flow.
I first tried a version that used type tracking, as this more accurately mimics the behaviour of the old query. However, I soon discovered that there were many false positives in this setup. The main bad pattern I saw was a helper function somewhere deep inside the code that both receives and returns an argument that can be tuples with different sizes and origins. In this case, global flow produces something akin to a cartesian product of "n-tuples that flow into the function" and "m-tuples that flow into the function" where m < n.
To combat this, I decided to instead focus on only flow within a given function (and so local data-flow was sufficient).
Additionally, another class of false positives I saw was cases where the return type actually witnessed that the function in question could return tuples of varying sizes. In this case it seems reasonable to not flag these instances, since they are already (presumably) being checked by a type checker.
More generally, if you've annotated the return type of the function with anything (not just
Tuple[...]
), then there's probably little need to flag it.