Ibis benchmarking: DuckDB, DataFusion, Polars – Ibis #10179

Answered by lostmygithubaccount
giscus[bot] bot asked this question in Q&A
hi @alberttwong, the results of the TPC-H queries are written out to Parquet files and then discarded (to ensure the results are materialized uniformly), but those files do not contain the runtimes

the runtimes are stored as JSON and compacted into Parquet files, then uploaded to a public GCS bucket so you can perform your own analysis. the results are a bit old at this point; I plan on improving the benchmarking (e.g. capturing memory usage) and rerunning on newer versions soon

it's also not necessarily straightforward to compare the systems, as sometimes a query fails on one but not on the others, and some systems do better at some scale factors than at others. in general, Polars is the best when data size is small relative to R…
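
one way to deal with the failed-query problem is to compare systems only on the queries that all of them completed at a given scale factor. the snippet below is a minimal sketch of that idea, not the actual benchmarking code: the record layout (`system`, `sf`, `query`, `seconds`, `ok`) and the sample values are made up for illustration and are not the real schema or real results.

```python
import json
from collections import defaultdict
from statistics import median

# Hypothetical runtime records, one JSON object per line.
# Field names and values are assumptions for illustration only.
RAW = """
{"system": "duckdb", "sf": 1, "query": 1, "seconds": 1.0, "ok": true}
{"system": "duckdb", "sf": 1, "query": 1, "seconds": 3.0, "ok": true}
{"system": "duckdb", "sf": 1, "query": 2, "seconds": 4.0, "ok": true}
{"system": "polars", "sf": 1, "query": 1, "seconds": 1.5, "ok": true}
{"system": "polars", "sf": 1, "query": 2, "seconds": null, "ok": false}
{"system": "datafusion", "sf": 1, "query": 1, "seconds": 2.5, "ok": true}
{"system": "datafusion", "sf": 1, "query": 2, "seconds": 5.0, "ok": true}
"""


def load(raw):
    """Parse newline-delimited JSON into a list of dicts."""
    return [json.loads(line) for line in raw.strip().splitlines()]


def common_queries(records, sf):
    """Return the queries that every system completed at this scale factor."""
    systems = {r["system"] for r in records if r["sf"] == sf}
    if not systems:
        return set()
    done = defaultdict(set)
    for r in records:
        if r["sf"] == sf and r["ok"]:
            done[r["system"]].add(r["query"])
    # A system with no successful runs contributes an empty set,
    # which correctly makes the intersection empty.
    return set.intersection(*(done[s] for s in systems))


def total_seconds(records, sf):
    """Sum per-query median runtimes, restricted to the shared queries."""
    shared = common_queries(records, sf)
    runs = defaultdict(lambda: defaultdict(list))
    for r in records:
        if r["sf"] == sf and r["ok"] and r["query"] in shared:
            runs[r["system"]][r["query"]].append(r["seconds"])
    return {
        system: sum(median(times) for times in per_query.values())
        for system, per_query in runs.items()
    }


records = load(RAW)
print(common_queries(records, sf=1))
print(total_seconds(records, sf=1))
```

restricting to the shared set keeps the totals comparable, at the cost of hiding which system failed which query, so it's worth reporting the failures separately.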

Answer selected by cpcloud