feat: vendor datafusion-cli helpers for parity #28

Merged: lidavidm merged 5 commits into main from cli on Jun 5, 2026

Conversation

lidavidm (Contributor) commented Jun 4, 2026

Vendor code from datafusion-cli that implements various conveniences, such as being able to select directly from a URI. Also do a bit of internal code cleanup to reduce duplication (I noticed the LLM kept getting confused because it only added things to one code path instead of all of them).

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
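
For readers unfamiliar with the "select directly from a URI" convenience, here is a rough conceptual sketch in Python. It is not the vendored Rust code, and every name in it (`resolve_table_reference`, `FORMATS`, `REMOTE_SCHEMES`) is hypothetical. The assumption is that a table name which is not already registered gets treated as a location, with the file format guessed from the extension.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension-to-format table, for illustration only.
FORMATS = {".parquet": "parquet", ".csv": "csv", ".json": "json"}

# Hypothetical set of URL schemes treated as remote object stores.
REMOTE_SCHEMES = {"http", "https", "s3", "gs"}


def resolve_table_reference(name, registered):
    """Decide how a table name in a FROM clause might be resolved.

    Returns a (kind, location, format) tuple, where kind is
    "registered", "remote", or "local".
    """
    # Tables that were registered explicitly always win.
    if name in registered:
        return ("registered", name, None)

    # Otherwise treat the name as a path or URL and guess the format
    # from the file extension.
    parsed = urlparse(name)
    suffix = PurePosixPath(parsed.path).suffix.lower()
    fmt = FORMATS.get(suffix)
    if fmt is None:
        raise LookupError(f"table {name!r} not found and not a recognized file")

    kind = "remote" if parsed.scheme in REMOTE_SCHEMES else "local"
    return (kind, name, fmt)
```

Under these assumptions, a quoted URL ending in `.parquet` resolves as a remote Parquet source without any prior `CREATE EXTERNAL TABLE`, while an unknown bare name still fails with a lookup error.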

@lidavidm lidavidm marked this pull request as ready for review June 4, 2026 04:26
@lidavidm lidavidm requested a review from amoeba June 4, 2026 04:27

ianmcook commented Jun 4, 2026

I downloaded the .dylib from the CI artifacts, installed it locally, and ran a variety of the queries from the DataFusion CLI user guide. Everything seemed to work great.


ianmcook commented Jun 4, 2026

In the docs, how about linking to the DataFusion CLI docs and using one of the examples there? For example:

SELECT COUNT(*) FROM 'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet';


ianmcook commented Jun 4, 2026

Or here's a fun publicly hosted one from https://github.com/hyparam/hyparquet:

SELECT `Breed Name`, `Lifespan`
  FROM 'https://hyperparam-public.s3.amazonaws.com/bunnies.parquet'
  ORDER BY `Lifespan` DESC
  LIMIT 5;

@lidavidm lidavidm merged commit c386d33 into main Jun 5, 2026
15 checks passed
@lidavidm lidavidm deleted the cli branch June 5, 2026 08:00