Open
Labels
- Community Collaborations (collaborating with organizations/institutions)
- Curation for Harvard Collection
- FY26 Roadmap
- Size: 80 (a percentage of a sprint)
Description
Objective:
- Move the 300K datasets that the Harvard Library Innovation Lab (LIL) rescued from data.gov into a collection on Harvard Dataverse (HDV)
- Requires a Harvard faculty owner
Data details:
Our data is here, with a readme — let me know if there are parts of that we could expand:
https://source.coop/harvard-lil/gov-data
The metadata.csv.zip and metadata.jsonl.zip files have metadata about all of the datasets we collected.
We also now have a statically-hosted browser for the data, described here:
https://lil.law.harvard.edu/blog/2025/10/10/welcome-to-lil-s-data-gov-archive-search/
I’m adding our developer Chris Setzer as well — if there were other data formats that would be helpful, we’d be happy to consider.
I think we will have a great deal of CDC data, but you would have to analyze the metadata files or use the hosted browser to check what’s in there.
Thanks,
Jack
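To get a first estimate of how much CDC data is in the archive, the metadata files mentioned above could be scanned locally. The following is only a minimal sketch, assuming `metadata.jsonl.zip` has been downloaded and contains one JSON object per line. The record schema is not described in this issue, so the `organization` grouping key is a hypothetical field name, and the keyword match runs over the whole serialized record (keys included) rather than a specific field.

```python
import io
import json
import zipfile
from collections import Counter


def iter_records(zip_path):
    """Yield one dict per non-empty JSONL line from every file in the zip."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            with archive.open(name) as raw:
                for line in io.TextIOWrapper(raw, encoding="utf-8"):
                    line = line.strip()
                    if line:
                        yield json.loads(line)


def count_by_field(records, keyword, group_key="organization"):
    """Count records that mention `keyword`, grouped by `group_key`.

    `group_key` is a hypothetical field name; check the readme for the
    real schema. Records where that field is missing or is not a plain
    string are counted under "(unknown)".
    """
    needle = keyword.lower()
    counts = Counter()
    for record in records:
        # Schema-agnostic match: search the whole serialized record.
        haystack = json.dumps(record, ensure_ascii=False).lower()
        if needle in haystack:
            value = record.get(group_key)
            counts[value if isinstance(value, str) else "(unknown)"] += 1
    return counts
```

For example, `count_by_field(iter_records("metadata.jsonl.zip"), "Centers for Disease Control")` would return a `Counter` of matching records per group. A short keyword such as "cdc" also matches unrelated substrings, so results from this kind of scan should be treated as a rough upper bound.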
Contact:
- Cushman, Jack [email protected]
Metadata
Status
SPRINT- NEEDS SIZING