Determine which outputs need to be stored, where, and how #22

Closed
not-the-fish opened this issue Sep 28, 2017 · 2 comments
Comments

@not-the-fish
Contributor

pgdedupe's output is not documented, so we will need to document it first (dssg/pgdedupe#64). Only then can we determine which outputs are critical for reproducibility, for ingesting updated data (#14), and for model evaluation.

@not-the-fish
Contributor Author

See also dssg/pgdedupe#66 for the cleanup step in pgdedupe. Any tables we need to keep will have to be renamed with the model hash as a prefix or suffix before cleanup runs, so that they are cached.
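
A rough sketch of what that renaming step might look like, assuming the model hash is derived from a config dict and that the tables live in a single PostgreSQL schema. Everything named here is hypothetical (the `model_hash` and `rename_statements` helpers, the `dedupe` schema, the `entity_map` table name); none of it comes from pgdedupe itself. It only builds the SQL strings, so nothing touches a database.

```python
import hashlib
import json

# PostgreSQL silently truncates identifiers longer than 63 bytes.
PG_MAX_IDENTIFIER_LEN = 63


def model_hash(config, length=8):
    """Short, stable hash of a (hypothetical) model config dict."""
    # sort_keys makes the hash independent of key order.
    canonical = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:length]


def rename_statements(tables, config, schema="dedupe"):
    """Build ALTER TABLE statements that suffix each table with the model hash."""
    suffix = model_hash(config)
    statements = []
    for table in tables:
        new_name = f"{table}_{suffix}"
        if len(new_name.encode("utf-8")) > PG_MAX_IDENTIFIER_LEN:
            raise ValueError(f"renamed table is too long for PostgreSQL: {new_name}")
        statements.append(
            f'ALTER TABLE "{schema}"."{table}" RENAME TO "{new_name}";'
        )
    return statements


if __name__ == "__main__":
    # Table name and config are placeholders for illustration only.
    for stmt in rename_statements(["entity_map"], {"threshold": 0.5}):
        print(stmt)
```

The statements would need to run before the cleanup step, and the list of tables to keep is exactly what this issue still has to determine.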

@thcrock
Contributor

thcrock commented Apr 19, 2018

Not using pgdedupe

@thcrock thcrock closed this as completed Apr 19, 2018