3 changes: 2 additions & 1 deletion .gitignore
@@ -1,2 +1,3 @@
data/*.csv
__pycache__
__pycache__
experiments/.ipynb_checkpoints
12 changes: 6 additions & 6 deletions README.md
@@ -18,12 +18,12 @@ This repo contains the data analysis and subsequent reports done by the data sci


## Getting Up and Running With the Pipeline
- Sign up for a [Google Cloud account](https://cloud.google.com/gcp?utm_source=google&utm_medium=cpc&utm_campaign=na-US-all-en-dr-bkws-all-all-trial-p-dr-1605212&utm_content=text-ad-none-any-DEV_c-CRE_532287060476-ADGP_Desk+%7C+BKWS+-+PHR+%7C+Txt+~+Top-KWID_43700064911463909-kwd-6052401663&utm_term=KW_google+cloud-ST_google+cloud&gclid=CjwKCAiAioifBhAXEiwApzCztnlhgdVhJomjQJHXqxQRhF8QNKa6JsRQl6Rh3KrA5400sLaTGyZzjRoCaJgQAvD_BwE&gclsrc=aw.ds&hl=en)
- Ask Isaac to add you as an editor to the 350 Seattle project in Google Cloud.
- Install the [Pandas Big Query SDK](https://github.com/googleapis/python-bigquery-pandas), which allows you to access Big Query directly from Pandas. You will need to use Python3 and pip3 for this library. *Note*: this is *not* the same as the Big Query Python API.
- To authenticate, you have [two choices](https://googleapis.dev/python/pandas-gbq/latest/howto/authentication.html#id2):
1. Sign up for a [Google Cloud account](https://cloud.google.com/gcp?utm_source=google&utm_medium=cpc&utm_campaign=na-US-all-en-dr-bkws-all-all-trial-p-dr-1605212&utm_content=text-ad-none-any-DEV_c-CRE_532287060476-ADGP_Desk+%7C+BKWS+-+PHR+%7C+Txt+~+Top-KWID_43700064911463909-kwd-6052401663&utm_term=KW_google+cloud-ST_google+cloud&gclid=CjwKCAiAioifBhAXEiwApzCztnlhgdVhJomjQJHXqxQRhF8QNKa6JsRQl6Rh3KrA5400sLaTGyZzjRoCaJgQAvD_BwE&gclsrc=aw.ds&hl=en)
1. Ask Isaac to add you as an editor to the 350 Seattle project in Google Cloud.
1. Install the [Pandas Big Query SDK](https://github.com/googleapis/python-bigquery-pandas), which allows you to access Big Query data as a Pandas data frame. You will need to use Python3 and pip3 for this library. *Note*: this is *not* the same as the Big Query Python API.
1. To authenticate, you have [two choices](https://googleapis.dev/python/pandas-gbq/latest/howto/authentication.html#id2):
- Use Google Cloud authorization already cached on your machine
- The first time you run a query with the library, you'll be prompted to log in on a pop up window
- You should now be able to run the code in the [API example](experiments/big_query_api_example.ipynb) successfully.
1. You should now be able to run the code in the [API example](experiments/big_query_api_example.ipynb) successfully.
- If you see an error about the `tqdm` library, run `pip install tqdm` and restart your IPython kernel.
- Use the Pandas Big Query library to read and write data to the Source of Truth dataset in Big Query. Be sure to log all changes in our change log. Any changes you make will be visible in the Google Sheets display of the data.
1. Use the Pandas Big Query library to read and write data to the Source of Truth dataset in Big Query. Be sure to log all changes in our change log. Any changes you make will be visible in the Google Sheets display of the data.

This file was deleted.

2 changes: 1 addition & 1 deletion experiments/big_query_api_example.ipynb
@@ -32,7 +32,7 @@
" LIMIT 10;\n",
"\"\"\"\n",
"\n",
"tax_parcel_id_dataframe = pandas_gbq.read_gbq(query, project_id='seattle-377109') #, progress_bar_type=None"
"tax_parcel_id_dataframe = pandas_gbq.read_gbq(query, project_id='seattle-377109')"
]
},
{
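For reviewers who want to try the `read_gbq` call from the notebook hunk outside Jupyter, here is a minimal sketch. The helper names `build_preview_query` and `read_preview` and the table name in the usage note are hypothetical; only `pandas_gbq.read_gbq`, its `project_id` and `progress_bar_type` parameters, and the project ID `seattle-377109` come from the diff or the library. Passing `progress_bar_type=None` is one way to skip the progress bar, which may avoid the `tqdm` error mentioned in the README.

```python
def build_preview_query(table: str, limit: int = 10) -> str:
    """Build a small SELECT for previewing a table.

    `table` is a hypothetical `dataset.table` name; substitute a real one.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT * FROM `{table}` LIMIT {limit};"


def read_preview(table: str, project_id: str = "seattle-377109", limit: int = 10):
    """Run the preview query and return the result as a pandas DataFrame."""
    # Imported lazily so build_preview_query works without pandas-gbq installed.
    import pandas_gbq

    return pandas_gbq.read_gbq(
        build_preview_query(table, limit),
        project_id=project_id,
        # None disables the progress bar, so tqdm is not required.
        progress_bar_type=None,
    )
```

Calling `read_preview("some_dataset.some_table")` (placeholder table name) should prompt for Google login on first use, as described in the authentication step of the README.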