feat(RFC): Adds altair.datasets
#3631
base: main
- Allow quickly switching between version tags #3150 (comment)
To support [flights-200k.arrow](https://github.com/vega/vega-datasets/blob/f637f85f6a16f4b551b9e2eb669599cc21d77e69/data/flights-200k.arrow)
Not required for these requests, but may be helpful to avoid limits
As an example, for comparing against the most recent I've added the 5 most recent
- Basic mechanism for discovering new versions
- Tries to minimise the number and total size of requests
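A minimal sketch of how version discovery could keep requests down, assuming tags arrive newest-first in pages. The names `discover_new_tags` and `fetch_page` are hypothetical and not taken from this PR; the idea is only that paging can stop at the first tag that is already known.

```python
from typing import Callable, Iterable


def discover_new_tags(
    known: Iterable[str],
    fetch_page: Callable[[int], list[str]],
    max_pages: int = 10,
) -> list[str]:
    """Return tags that are not yet known, newest first.

    ``fetch_page(n)`` is a hypothetical callable returning page ``n`` of tag
    names ordered newest to oldest. Paging stops at the first known tag, so
    older pages are never requested.
    """
    seen = set(known)
    fresh: list[str] = []
    for page in range(1, max_pages + 1):
        tags = fetch_page(page)
        if not tags:  # ran out of pages
            break
        for tag in tags:
            if tag in seen:  # everything after this is already stored
                return fresh
            fresh.append(tag)
    return fresh


# Stand-in for a remote API: three pages of tags, newest first.
pages = {1: ["v2.11.0", "v2.10.0"], 2: ["v2.9.0", "v2.8.1"], 3: ["v2.8.0"]}
calls: list[int] = []


def fake_fetch(page: int) -> list[str]:
    calls.append(page)  # record which pages were requested
    return pages.get(page, [])


result = discover_new_tags({"v2.9.0", "v2.8.1", "v2.8.0"}, fake_fetch)
```

With the fake pages above, only the first two pages are requested and the third is skipped.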
Experimenting with querying the url cache w/ expressions
- `metadata_full.parquet` stores **all known** file metadata
- `GitHub.refresh()` to maintain integrity in a safe manner
- Roughly 3000 rows
- Single release: **9kb** vs 46 releases: **21kb**
- Still undecided exactly how this functionality should work
- Need to resolve `npm` tags != `gh` tags issue as well
- Will be even more useful after merging vega/vega-datasets#663
- Thinking this is a fair tradeoff vs inlining the descriptions into `altair`
- All the info is available and it is quicker than manually searching the headings in a browser
Contains a fix (narwhals-dev/narwhals#1934) for #3631 (comment)
Possible since narwhals-dev/narwhals#1930 @MarcoGorelli if you're interested what that PR did (besides fix warnings 😉)
Made possible via vega/vega-datasets#681
- Removes temp files
- Removes some outdated apis
- Removes test based on removed `"points"` dataset
I’ve added an item to the tracking list in OP. I raised this suggestion during the review as an important improvement to the user interface. I’d be grateful if it could be addressed before merging and not overlooked. Almost there!🙌
Thanks @mattijn I actually have that one bookmarked, rest assured I have not forgotten! This was what I was thinking of doing, while we wait on narwhals-dev/narwhals#1924 and vega/vega-datasets#654:
FYI, I have tried responding to (#3631 (comment)) twice but both times I ended up writing waaaaay too much. Note (to self) stashed an experiment w/ this under the name
Thanks! Appreciate you having this on your list. Great that we’re aligned.
No worries @mattijn, sorry for not communicating any of this sooner.
Will unblock (#3631 (comment))
Related
Tracking
- Waiting on the next `vega-datasets` release. Once there is a stable `datapackage.json` available, there is quite a lot of `tools/datasets` that can be simplified/removed.
  - 3.0.0 Release vega-datasets#654
- [email protected]
  - Discovered a bug that makes some handling of expressions a little less efficient.
  - `Iterator[IntoExpr]` narwhals-dev/narwhals#1897
  - Upstreaming some `nw.Schema` stuff to `narwhals`
  - `nw.(DType|Schema)` conversion API narwhals-dev/narwhals#1912
- `altair.datasets` #3631 (comment)
- Improve user-facing interface `altair.datasets` #3631 (comment)
Description
Providing a minimal, but up-to-date source for https://github.com/vega/vega-datasets.
This PR takes a different approach to that of https://github.com/altair-viz/vega_datasets, notably:
- `vega-datasets/datapackage.json`
- `pandas`
- `"polars"` backend, the slowest I've had on a cache-hit is 0.1s to load
Examples
These all come from the docstrings of:
- `Loader`
- `Loader.from_backend`
- `Loader.__call__`
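For readers who want a feel for the shape those three names suggest, here is a rough, stdlib-only sketch of a callable loader with an on-disk cache. It is not the implementation from this PR: `SketchLoader`, its `fetch` argument and the JSON cache layout are all hypothetical, and `backend` is only recorded here, whereas a real loader would presumably use it to pick a DataFrame library.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


class SketchLoader:
    """Hypothetical stand-in, not the class added in this PR."""

    def __init__(self, backend: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> None:
        self.backend = backend  # only stored in this sketch
        self.cache_dir = cache_dir
        self._fetch = fetch  # hypothetical download function

    @classmethod
    def from_backend(cls, backend: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> "SketchLoader":
        # Alternative constructor, mirroring the name mentioned above.
        return cls(backend, cache_dir, fetch)

    def __call__(self, name: str) -> list[dict]:
        # Cache file is keyed by a hash of the dataset name (an assumption).
        path = self.cache_dir / f"{hashlib.sha1(name.encode()).hexdigest()}.json"
        if not path.exists():  # cache miss: download once and store
            path.write_bytes(self._fetch(name))
        return json.loads(path.read_bytes())  # cache hit: local read only


downloads: list[str] = []


def fake_fetch(name: str) -> bytes:
    downloads.append(name)  # count how often a "download" happens
    return json.dumps([{"dataset": name, "row": 0}]).encode()


with tempfile.TemporaryDirectory() as tmp:
    load = SketchLoader.from_backend("polars", Path(tmp), fake_fetch)
    first = load("cars")   # miss, calls fake_fetch
    second = load("cars")  # hit, served from disk
```

The second call never touches `fake_fetch`, which is the kind of cache-hit path the 0.1s figure above refers to.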