Skip to content

Python: Add Pandas SQLi sinks #19594

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

sylwia-budzynska
Copy link
Contributor

@sylwia-budzynska sylwia-budzynska commented May 27, 2025

Pandas has a read_sql sink, which allows for executing raw SQL queries with untrusted input. read_sql is a convenience method, that dispatches the query either to read_sql_table or read_sql_query. Please note that read_sql_table is not vulnerable to SQLi, but read_sql_query is.

Note that instead of adding new tests, I edited the previous tests, because they contained the vulnerable read_sql and read_sql_query method, which made them show up as failed.

@Copilot Copilot AI review requested due to automatic review settings May 27, 2025 11:38
@sylwia-budzynska sylwia-budzynska requested a review from a team as a code owner May 27, 2025 11:38
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces explicit SQL injection modeling for pandas.read_sql and pandas.read_sql_query, updates existing tests to use an sqlite3 connection for sink detection, and adds a change note entry.

  • Updated pandas DataFrame test to connect via sqlite3 and mark SQL sinks with $getSql.
  • Added a ReadSQLCall QL class to flag raw SQL queries in Pandas.
  • Documented the new analysis in the change notes.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
python/ql/test/library-tests/frameworks/pandas/dataframe_query.py Replaced path-based calls with sqlite3.connect, updated sink markers
python/ql/src/change-notes/2025-05-26-pandas-sqli-sinks.md Added entry for Pandas SQLi sinks
python/ql/lib/semmle/python/frameworks/Pandas.qll Introduced ReadSQLCall to model read_sql and read_sql_query as sinks
Comments suppressed due to low confidence (2)

python/ql/test/library-tests/frameworks/pandas/dataframe_query.py:59

  • This test covers positional arguments but does not cover keyword argument usage (e.g., sql= and con=). Adding a test for pd.read_sql_query(sql="...", con=connection) would ensure named-parameter sinks are detected.
df = pd.read_sql_query("sql query", connection) # $getSql="sql query"

python/ql/lib/semmle/python/frameworks/Pandas.qll:161

  • Unconditionally treating read_sql as a sink may flag safe read_sql_table calls (which read_sql can dispatch to). Consider refining the model to only treat calls that route to read_sql_query or inspect argument patterns to avoid false positives.
ReadSQLCall() { this = API::moduleImport("pandas").getMember(["read_sql", "read_sql_query"]).getACall() }

Comment on lines 58 to 63
connection = sqlite3.connect("pets.db")
df = pd.read_sql_query("sql query", connection) # $getSql="sql query"
df.query("query") # $getCode="query"
df.eval("query") # $getCode="query"

df = pd.read_sql("filepath", 'postgres:///db_name')
connection = sqlite3.connect("pets.db")
Copy link
Preview

Copilot AI May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The test file creates the same connection twice. Consider moving connection = sqlite3.connect("pets.db") to a shared setup block or top-level fixture to avoid duplication.

Copilot uses AI. Check for mistakes.

tausbn
tausbn previously approved these changes May 27, 2025
Copy link
Contributor

@tausbn tausbn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short and sweet! Looks good to me. 👍

@sylwia-budzynska
Copy link
Contributor Author

Thanks @tausbn ! I fixed some nitpicks from automation :)

@sylwia-budzynska sylwia-budzynska requested a review from tausbn May 27, 2025 11:48
Copy link
Contributor

@tausbn tausbn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the additional fixes! 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants