Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

select clause after append clause #2680

Open
2 tasks done
YuxiLing opened this issue Jun 1, 2023 · 3 comments
Open
2 tasks done

select clause after append clause #2680

YuxiLing opened this issue Jun 1, 2023 · 3 comments
Labels
bug Invalid compiler output or panic compiler

Comments

@YuxiLing
Copy link

YuxiLing commented Jun 1, 2023

What happened?

Dear developer,

Hope this would be helpful.

prql version: 0.8.1

comments: The select clause affects the translation of append clause. It might be caused by internal optimization in prql compiler.

In the following example, we assume tb1 and tb2 have the same columns. We append tb2 to tb1, and select two new columns. However, inside generated sql query, it selects new columns first. Then the union operation fails.

error message from the database: "each UNION query must have the same number of columns"

PRQL input

from tb1
append tb2
select [new_col1=c1*2,new_col2=c2/c3]

SQL output

WITH table_1 AS (
  SELECT
    c2,
    c3,
    c1
  FROM
    tb1
  UNION
  ALL
  SELECT
    *
  FROM
    tb2
)
SELECT
  c1 * 2 AS new_col1,
  c2 / c3 AS new_col2
FROM
  table_1 AS table_0

Expected SQL output

WITH table_1 AS (
  SELECT
    *
  FROM
    tb1
  UNION
  ALL
  SELECT
    *
  FROM
    tb2
)
SELECT
  c1 * 2 AS new_col1,
  c2 / c3 AS new_col2
FROM
  table_1 AS table_0

MVCE confirmation

  • Minimal example
  • New issue

Anything else?

No response

@YuxiLing YuxiLing added the bug Invalid compiler output or panic label Jun 1, 2023
@aljazerzen
Copy link
Member

Yup, that's a bug.

When the compiler encounters c1, c2 and c3 it infers that they are from tb1, while is should find both tb1 and tb2.

@aljazerzen
Copy link
Member

aljazerzen commented Jun 1, 2023

I would expect SQL to be:

WITH table_1 AS (
  SELECT
    c1, c2, c3
  FROM
    tb1
  UNION
  ALL
  SELECT
    c1, c2, c3
  FROM
    tb2
)
SELECT
  c1 * 2 AS new_col1,
  c2 / c3 AS new_col2
FROM
  table_1 AS table_0

@kgutwin
Copy link
Collaborator

kgutwin commented Feb 27, 2025

Rewriting the original input to the new syntax:

from tb1
append tb2
select {new_col1=c1*2,new_col2=c2/c3}

Still produces the wrong output as of 0.13.3:

WITH table_0 AS (
  SELECT
    c2,
    c3,
    c1
  FROM
    tb1
  UNION
  ALL
  SELECT
    *
  FROM
    tb2
)
SELECT
  c1 * 2 AS new_col1,
  c2 / c3 AS new_col2
FROM
  table_0

-- Generated by PRQL compiler version:0.13.3-39-ge393ab4d (https://prql-lang.org)

Importantly, an attempted workaround to explicitly select columns from tb1 and tb2 causes another problem:

from tb1
select {c1, c2, c3}
append (from tb2 | select {c1, c2, c3})
select {new_col1=c1*2,new_col2=c2/c3}
WITH table_0 AS (
  SELECT
    c1,
    c2,
    c3
  FROM
    tb2
),
table_1 AS (
  SELECT
    c2,
    c3,
    c1
  FROM
    tb1
  UNION
  ALL
  SELECT
    *
  FROM
    table_0
)
SELECT
  c1 * 2 AS new_col1,
  c2 / c3 AS new_col2
FROM
  table_1

-- Generated by PRQL compiler version:0.13.3-39-ge393ab4d (https://prql-lang.org)

While the number of columns from tb1 and tb2 are now the same, the SELECT order between the two tables is misaligned, which would cause a surprising mismatch, see #4724 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Invalid compiler output or panic compiler
Projects
None yet
Development

No branches or pull requests

3 participants