Add clickbench SQL benchmark #22633

Open
Omega359 wants to merge 2 commits into apache:main from Omega359:sql-benchmarks/clickbench
Conversation

@Omega359
Contributor

Which issue does this PR close?

Part of #21706

Rationale for this change

Continues the work on the SQL benchmark migration.

What changes are included in this PR?

Adds the ClickBench SQL benchmark.

Are these changes tested?

Yes

BENCH_NAME=clickbench CLICKBENCH_TYPE=single cargo bench --bench sql
BENCH_NAME=clickbench CLICKBENCH_TYPE=partitioned cargo bench --bench sql

Are there any user-facing changes?

No

@Omega359 Omega359 marked this pull request as ready for review May 30, 2026 00:37
@adriangb adriangb requested a review from Copilot June 2, 2026 01:29
@@ -0,0 +1,3 @@
CREATE EXTERNAL TABLE hits_raw STORED AS PARQUET LOCATION 'data/hits.parquet';

CREATE VIEW hits AS SELECT * EXCEPT ("EventDate"), CAST(CAST("EventDate" AS INTEGER) AS DATE) AS "EventDate" FROM hits_raw
Contributor

I confirmed this matches the existing view 👍🏻
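
For readers unfamiliar with the double cast in the view, here is a minimal Python sketch of what it amounts to. It assumes (the thread does not say so) that the raw "EventDate" column holds an integer day count since 1970-01-01, which is how an integer is interpreted when cast to DATE. The function name and the sample value 15901 are illustrative only and are not taken from the dataset.

```python
from datetime import date, timedelta

# Assumption: integers cast to DATE count days from the Unix epoch.
EPOCH = date(1970, 1, 1)


def days_to_date(days: int) -> date:
    """Interpret an integer as a number of days since 1970-01-01."""
    return EPOCH + timedelta(days=days)


# 15901 is an illustrative value, not read from hits.parquet.
print(days_to_date(15901))  # 2013-07-15
```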

Contributor

Copilot AI left a comment

Pull request overview

Adds the ClickBench suite to the SQL-benchmark framework (part of the ongoing migration in #21706), enabling running the ClickBench queries via cargo bench --bench sql with either a single-file or partitioned dataset.

Changes:

  • Added ClickBench initialization SQL to configure parquet options and load either single-file or partitioned hits datasets.
  • Added ClickBench benchmark definitions (Q00–Q42) in .benchmark format, including a basic “data loaded” assertion and result output paths.

Reviewed changes

Copilot reviewed 46 out of 46 changed files in this pull request and generated 1 comment.

Summary per file

| File | Description |
| --- | --- |
| benchmarks/sql_benchmarks/clickbench/init/set_config.sql | Sets required DataFusion config for reading ClickBench parquet data and toggling filter pushdown behavior. |
| benchmarks/sql_benchmarks/clickbench/init/load-single.sql | Loads single-file ClickBench dataset and defines hits view. |
| benchmarks/sql_benchmarks/clickbench/init/load-partitioned.sql | Loads partitioned ClickBench dataset and defines hits view. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q00.benchmark | Adds ClickBench Q00 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q01.benchmark | Adds ClickBench Q01 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q02.benchmark | Adds ClickBench Q02 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q03.benchmark | Adds ClickBench Q03 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q04.benchmark | Adds ClickBench Q04 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q05.benchmark | Adds ClickBench Q05 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q06.benchmark | Adds ClickBench Q06 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q07.benchmark | Adds ClickBench Q07 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q08.benchmark | Adds ClickBench Q08 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q09.benchmark | Adds ClickBench Q09 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q10.benchmark | Adds ClickBench Q10 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q11.benchmark | Adds ClickBench Q11 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q12.benchmark | Adds ClickBench Q12 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q13.benchmark | Adds ClickBench Q13 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q14.benchmark | Adds ClickBench Q14 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q15.benchmark | Adds ClickBench Q15 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q16.benchmark | Adds ClickBench Q16 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q17.benchmark | Adds ClickBench Q17 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q18.benchmark | Adds ClickBench Q18 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q19.benchmark | Adds ClickBench Q19 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q20.benchmark | Adds ClickBench Q20 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q21.benchmark | Adds ClickBench Q21 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q22.benchmark | Adds ClickBench Q22 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q23.benchmark | Adds ClickBench Q23 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q24.benchmark | Adds ClickBench Q24 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q25.benchmark | Adds ClickBench Q25 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q26.benchmark | Adds ClickBench Q26 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q27.benchmark | Adds ClickBench Q27 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q28.benchmark | Adds ClickBench Q28 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q29.benchmark | Adds ClickBench Q29 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q30.benchmark | Adds ClickBench Q30 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q31.benchmark | Adds ClickBench Q31 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q32.benchmark | Adds ClickBench Q32 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q33.benchmark | Adds ClickBench Q33 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q34.benchmark | Adds ClickBench Q34 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q35.benchmark | Adds ClickBench Q35 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q36.benchmark | Adds ClickBench Q36 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q37.benchmark | Adds ClickBench Q37 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q38.benchmark | Adds ClickBench Q38 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q39.benchmark | Adds ClickBench Q39 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q40.benchmark | Adds ClickBench Q40 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q41.benchmark | Adds ClickBench Q41 benchmark definition. |
| benchmarks/sql_benchmarks/clickbench/benchmarks/q42.benchmark | Adds ClickBench Q42 benchmark definition. |
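
The file list above can be cross-checked against the "46 out of 46 changed files" figure with a short sketch. It only rebuilds the paths shown in the summary (three init scripts plus q00 through q42); the helper name `expected_files` is hypothetical and not part of the PR.

```python
BASE = "benchmarks/sql_benchmarks/clickbench"
INIT_FILES = ["set_config.sql", "load-single.sql", "load-partitioned.sql"]


def expected_files() -> list:
    """Rebuild the paths listed in the review summary."""
    init = [f"{BASE}/init/{name}" for name in INIT_FILES]
    # Query files are zero-padded: q00.benchmark through q42.benchmark.
    queries = [f"{BASE}/benchmarks/q{n:02d}.benchmark" for n in range(43)]
    return init + queries


files = expected_files()
print(len(files))  # 46
```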

Comment thread benchmarks/sql_benchmarks/clickbench/init/set_config.sql Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@adriangb adriangb enabled auto-merge June 2, 2026 01:35
@adriangb adriangb disabled auto-merge June 2, 2026 01:37
Contributor

@adriangb adriangb left a comment

@Omega359 upon further review I found a couple of minor points worth clarifying or addressing.

@@ -0,0 +1,3 @@
CREATE EXTERNAL TABLE hits_raw STORED AS PARQUET LOCATION 'data/hits_partitioned/';
Contributor

I think #22660 (comment) also applies here

Contributor

Suggested change
- CREATE EXTERNAL TABLE hits_raw STORED AS PARQUET LOCATION 'data/hits_partitioned/';
+ CREATE EXTERNAL TABLE hits_raw STORED AS PARQUET LOCATION '${DATA_DIR:-data}/hits_partitioned/';

@@ -0,0 +1,3 @@
CREATE EXTERNAL TABLE hits_raw STORED AS PARQUET LOCATION 'data/hits.parquet';
Contributor

Suggested change
- CREATE EXTERNAL TABLE hits_raw STORED AS PARQUET LOCATION 'data/hits.parquet';
+ CREATE EXTERNAL TABLE hits_raw STORED AS PARQUET LOCATION '${DATA_DIR:-data}/hits.parquet';
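
To make the `${DATA_DIR:-data}` placeholder in the suggestions concrete, here is a rough Python sketch of how such a substitution could behave. It assumes shell-style semantics (use the default when the variable is unset or empty); the thread does not show how the benchmark framework actually expands these placeholders, so the `expand` helper and the `/mnt/bench` path are hypothetical.

```python
import re

# Matches ${NAME:-default}; NAME follows the usual environment-variable rules.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):-([^}]*)\}")


def expand(sql: str, env: dict) -> str:
    """Replace ${NAME:-default} with env[NAME], or default if unset or empty."""
    def repl(match):
        # Assumption: an empty value falls back to the default, as in POSIX shells.
        return env.get(match.group(1)) or match.group(2)
    return _PLACEHOLDER.sub(repl, sql)


line = ("CREATE EXTERNAL TABLE hits_raw STORED AS PARQUET "
        "LOCATION '${DATA_DIR:-data}/hits.parquet';")
print(expand(line, {}))                          # falls back to data/
print(expand(line, {"DATA_DIR": "/mnt/bench"}))  # hypothetical override
```

Under the same assumption, the `${PUSHDOWN_FILTERS:-false}` and `${REORDER_FILTERS:-false}` lines in set_config.sql would resolve to `false` unless the corresponding variable is set.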

Comment on lines +6 to +7
SET datafusion.execution.parquet.pushdown_filters = ${PUSHDOWN_FILTERS:-false};
SET datafusion.execution.parquet.reorder_filters = ${REORDER_FILTERS:-false};
Contributor

Are these needed here? Is it just to match existing env vars supported by the binary? Because these can already be overridden via env vars.

Contributor Author

No, they're not really needed. They are left over from my initial work writing these tests, where I matched what clickbench.rs does (there, they are triggered by the sorted_by option).
