-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug with tables aliases #19
Comments
Hi, I just write a table alias expander with import sqlglot
from sqlglot import exp
def expand_alias(expr: exp.Expression) -> exp.Expression:
for iue in expr.find_all(exp.Union): # match all INTERSECT, UNION, EXCEPT
assert isinstance(iue.left, exp.Expression) and isinstance(iue.right, exp.Expression)
iue.set("left", expand_alias(iue.left))
iue.set("right", expand_alias(iue.right))
alias_to_table: dict[str, str] = {}
for table in expr.find_all(exp.Table):
if not table.alias:
continue
alias = table.alias.lower()
tablename = table.name.lower()
alias_to_table[alias] = tablename
table.set("this", tablename)
table.set("alias", None)
for column in expr.find_all(exp.Column):
if column.table:
column_tablename = column.table.lower()
if column_tablename in alias_to_table:
actual_table = alias_to_table[column_tablename]
column.set("table", actual_table)
return expr
sql = """
select T1.aaa from table1 as T1 join table2 as T2 on T1.aaa = T2.aaa
intersect
select T1.aaa from table1 as T1 join table3 as T2 on T1.aaa = T2.aaa
"""
expr = sqlglot.parse_one(sql)
refined_sql = expand_alias(expr).sql()
print(refined_sql)
# OUTPUT:
# SELECT table1.aaa FROM table1 JOIN table2 ON table1.aaa = table2.aaa
# INTERSECT
# SELECT table1.aaa FROM table1 JOIN table3 ON table1.aaa = table3.aaa NOTES:
It also supports table aliases without Maybe someone should refactor this repo with sqlglot for better readability and extensibility... |
Thanks for the nice piece of code! It seems to be the easiest solution. |
Currently creating mapping between aliases and actual tables names is performed globally for whole SQL, what is not correct for more complicated once.
Lets consider following SQL taken directly from SPIDER (there are more queries like this):
First subquery maps
T2
toAsset_Parts
, but second subquery mapsT2
toFault_Log
. Current evaluation script is not able to handle this. It may allow false positives to pass and throw exceptions because script is searching for column in wrong table.Relevant code fragment:
test-suite-sql-eval/process_sql.py
Lines 150 to 156 in e97acc5
The text was updated successfully, but these errors were encountered: