FEAT: Nevada data scraper #108

Open, wants to merge 11 commits into base: master

Conversation

@tlyon3 (Collaborator) commented Aug 7, 2020

No description provided.

@tlyon3 changed the title from "WIP: Nevada data scraper" to "FEAT: Nevada data scraper" on Aug 13, 2020
@tlyon3 requested review from sglyon and cc7768 on August 13, 2020 14:59

variable_name="hospital_beds_in_use_covid_confirmed"
)

df = pd.concat([suspected, confirmed], sort=False, ignore_index=True)
Contributor

If we're collecting suspected and confirmed, we should also make sure that we collect the total.
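
One possible reading of this request, sketched under assumptions: if the source does not publish a total directly, it could be derived by summing suspected and confirmed per date. The variable name `hospital_beds_in_use_covid_total`, the `hospital_beds_in_use_covid_suspected` name, the helper `add_total`, and the sample values are hypothetical; the reviewer may instead mean scraping a total that the source already reports, which would be preferable where available.

```python
import pandas as pd

# Toy long-format frames standing in for the scraper's `suspected` and
# `confirmed` results. The values are made up for illustration only.
suspected = pd.DataFrame({
    "dt": pd.to_datetime(["2020-08-01", "2020-08-02"]),
    "variable_name": "hospital_beds_in_use_covid_suspected",
    "value": [10, 12],
})
confirmed = pd.DataFrame({
    "dt": pd.to_datetime(["2020-08-01", "2020-08-02"]),
    "variable_name": "hospital_beds_in_use_covid_confirmed",
    "value": [30, 33],
})


def add_total(suspected, confirmed,
              total_name="hospital_beds_in_use_covid_total"):
    """Hypothetical helper: append a per-date total to the two series."""
    both = pd.concat([suspected, confirmed], sort=False, ignore_index=True)
    # Sum suspected + confirmed for each date, then label the result.
    total = (
        both.groupby("dt", as_index=False)["value"].sum()
        .assign(variable_name=total_name)
    )
    return pd.concat([both, total], sort=False, ignore_index=True)


df = add_total(suspected, confirmed)
```

Deriving the total only makes sense if suspected and confirmed are disjoint counts in the source data; that would need to be checked against the dashboard's definitions.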

renamed = df.rename(columns={"Date": "dt", "Cases": "cases_total"})
renamed.dt = pd.to_datetime(renamed.dt)
return renamed.melt(id_vars=["dt"], var_name="variable_name").assign(
vintage=pd.Timestamp.utcnow(), fips=self.state_fips
Contributor

Please use the _retrieve_vintage method so that, if we change how we collect vintages, we only need to change one method rather than hunt down every place it is used.
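
A minimal sketch of what this suggestion might look like. The real `_retrieve_vintage` lives in the project's base class and is not shown in this thread, so the `ScraperBase` and `NevadaCases` classes and the method body below are assumptions. The stand-in uses `pd.Timestamp.now(tz="UTC")`, which also yields a timezone-aware UTC timestamp like the `pd.Timestamp.utcnow()` call in the diff.

```python
import pandas as pd


class ScraperBase:
    """Hypothetical stand-in for the project's scraper base class."""

    def _retrieve_vintage(self):
        # Assumed behavior: the vintage is the current UTC time.
        # The project's actual method may round or format it differently.
        return pd.Timestamp.now(tz="UTC")


class NevadaCases(ScraperBase):
    state_fips = 32  # Nevada's state FIPS code

    def normalize(self, df):
        renamed = df.rename(columns={"Date": "dt", "Cases": "cases_total"})
        renamed["dt"] = pd.to_datetime(renamed["dt"])
        # The vintage comes from the shared method, not an inline call.
        return renamed.melt(id_vars=["dt"], var_name="variable_name").assign(
            vintage=self._retrieve_vintage(), fips=self.state_fips
        )


# Made-up input rows, only to show the shape of the result.
raw = pd.DataFrame({"Date": ["2020-08-01", "2020-08-02"], "Cases": [100, 110]})
out = NevadaCases().normalize(raw)
```

With this shape, a change to how vintages are computed touches only `_retrieve_vintage`.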

renamed = df.rename(columns={"Date": "dt", "Cumulative": "tests_total"})
renamed.dt = pd.to_datetime(renamed.dt)
return renamed.melt(id_vars=["dt"], var_name="variable_name").assign(
vintage=pd.Timestamp.utcnow(), fips=self.state_fips
Contributor

Use _retrieve_vintage method. See below

.melt(id_vars=["county"], var_name="variable_name")
.assign(
vintage=pd.Timestamp.utcnow(),
dt=pd.Timestamp.utcnow()
Contributor

Please use the _retrieve_dt method rather than determining the time by hand.
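
A rough sketch of the requested change, with the caveat that the project's `_retrieve_dt` is not shown here. The signature, the `US/Pacific` default, and the choice to return a timezone-naive midnight timestamp are all assumptions, as are the `ScraperBase` and `NevadaCounties` class names and the sample numbers.

```python
import pandas as pd


class ScraperBase:
    """Hypothetical stand-in for the project's scraper base class."""

    def _retrieve_dt(self, tz="US/Pacific"):
        # Assumed behavior: today's date in the given timezone, at
        # midnight, with the timezone information dropped.
        return pd.Timestamp.now(tz=tz).normalize().tz_localize(None)

    def _retrieve_vintage(self):
        # Assumed behavior: current UTC time.
        return pd.Timestamp.now(tz="UTC")


class NevadaCounties(ScraperBase):
    def normalize(self, renamed):
        return (
            renamed[["county", "tests_total", "cases_total", "deaths_total"]]
            .melt(id_vars=["county"], var_name="variable_name")
            # Both timestamps come from shared methods.
            .assign(vintage=self._retrieve_vintage(), dt=self._retrieve_dt())
        )


# Made-up county rows, for illustration only.
renamed = pd.DataFrame({
    "county": ["Clark", "Washoe"],
    "tests_total": [1000, 400],
    "cases_total": [100, 40],
    "deaths_total": [5, 2],
})
out = NevadaCounties().normalize(renamed)
```

Separating `dt` (the date the observation refers to) from `vintage` (when it was collected) also avoids stamping both columns with the same instant, as the inline calls in the diff do.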

renamed[["county", "tests_total", "cases_total", "deaths_total"]]
.melt(id_vars=["county"], var_name="variable_name")
.assign(
vintage=pd.Timestamp.utcnow(),
Contributor

Use the _retrieve_vintage method here as well.

renamed = out.rename(
columns={
"County": "county",
"Tests": "tests_total",
Contributor

Can we collect the "People Tested" column rather than the "Tests" column? We prefer to report the number of people tested rather than the number of tests administered.
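
As an illustration of the swap, the rename mapping could point "People Tested" at `tests_total` and leave "Tests" unmapped. Only the "County", "Tests", and "People Tested" column names come from this thread; the "Cases" and "Deaths" source columns and all values below are assumptions made for the example.

```python
import pandas as pd

# Made-up rows shaped like the county table; values are illustrative.
out = pd.DataFrame({
    "County": ["Clark", "Washoe"],
    "Tests": [1000, 400],          # tests administered (no longer used)
    "People Tested": [800, 350],   # unique people tested (preferred)
    "Cases": [100, 40],
    "Deaths": [5, 2],
})

renamed = out.rename(
    columns={
        "County": "county",
        "People Tested": "tests_total",  # was "Tests": "tests_total"
        "Cases": "cases_total",
        "Deaths": "deaths_total",
    }
)

# Selecting columns explicitly drops the unmapped "Tests" column.
long = renamed[["county", "tests_total", "cases_total", "deaths_total"]].melt(
    id_vars=["county"], var_name="variable_name"
)
```

Whether `tests_total` remains the right variable name for a people-tested count is a question for the maintainers; a more specific name may be preferable.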

@cc7768 requested a review from sglyon on August 17, 2020 17:29
3 participants