-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pupa clean CLI command #344
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great, @antidipyramid! Excited to bring this over the line. Left a few things inline.
pupa/cli/commands/clean.py
Outdated
results = [] | ||
for model in models: | ||
# Jurisdictions are protected from deletion | ||
if "Jurisdiction" not in model.__name__: | ||
cutoff_date = datetime.now(tz=timezone.utc) - timedelta(days=window) | ||
results.append(model.objects.filter(last_seen__lte=cutoff_date)) | ||
|
||
return itertools.chain(*results) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about making this a generator (yielding from results) instead of building up a list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to go generator when I know I just need to iterate over the results and a list could grow to be quite large, hogging a lot of memory.
pupa/cli/commands/clean.py
Outdated
def remove_stale_objects(window): | ||
""" | ||
Remove all database objects that haven't seen been in {window} days. | ||
""" | ||
for obj in get_stale_objects(window): | ||
print(f"Deleting {obj}...") | ||
obj.delete() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about making this and reporting functionality helper methods on your command class, since they're directly related to command behavior?
pupa/cli/commands/clean.py
Outdated
) | ||
resp = input() | ||
if resp != "Y": | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like to use sys.exit() so it's explicit what's going on. You can pass a 0 exit code if you want a No to still be considered a successful run.
pupa/cli/commands/clean.py
Outdated
print( | ||
"This will permanently delete all objects from your database" | ||
f" that have not been scraped within the last {args.window}" | ||
" days. Are you sure? (Y/N)" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to report the specific objects (or a summary?) in this prompt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we already have a report option, would a count of objects to be deleted be enough?
pupa/cli/commands/clean.py
Outdated
return result | ||
|
||
|
||
def get_stale_objects(window): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding this to the command class, as well?
@@ -0,0 +1,51 @@ | |||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on adding an integration test that sets up the data and runs the pupa clean
command, testing for the desired outcomes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a good idea. Do you mean just calling Command.handle()
and testing the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd actually test that the command removes the data I expect and doesn't touch anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small thing inline, then this can come in. This is really cool!
Fix clean warning prompt Co-authored-by: hannah cushman garland <[email protected]>
Overview
This PR adds a new pupa CLI command:
pupa clean
:Testing Instructions
To test with a live database, you should use a local instance of opencivicdata/scrapers-us-municipal:
git
by addingsudo apt-get install git
to theDockerfile
cp -r /path/to/pupa .
and runningdocker-compose run --rm scrapers pip install -e pupa
docker-compose run --rm app pupa clean