Solution for Assignment 1 of the Udacity Full Stack Nanodegree. The task is to create a reporting tool that answers three questions about
data in a database:
- What are the most popular three articles of all time?
- Who are the most popular article authors of all time?
- On which days did more than 1% of requests lead to errors?
Download the required database from here. To set it up, run:
psql -d news -f <news sql database file>
Python 3.6 - required to run the Python script. Documentation available here.
psycopg2 - required to connect to the database. Documentation available here. To install, run:
pip install psycopg2
The database has three tables:
- articles - has columns author, title, slug, lead, body, time and id. Note that author is an ID that matches the id column of the authors table.
- authors - has columns name, bio and id.
- log - has columns path, ip, method, status, time and id. Rows record traffic to all articles, including the HTTP status code sent to the user's browser.
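As a minimal sketch of how a script might read from these tables through psycopg2, the helper below opens a connection, runs one query and returns the rows. The function name `fetch_all`, the `connect` parameter (there only so the helper can be exercised without a live database) and the example query are assumptions made for illustration; they are not taken from Assignment.py.

```python
def fetch_all(query, dbname="news", connect=None):
    """Run one read-only query and return all rows as a list of tuples."""
    if connect is None:
        # Imported lazily so this module loads even where psycopg2 is absent.
        import psycopg2
        connect = psycopg2.connect
    conn = connect(dbname=dbname)
    try:
        cur = conn.cursor()
        cur.execute(query)
        return cur.fetchall()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()


# Hypothetical example query: number of log rows per HTTP status.
STATUS_COUNTS = "SELECT status, count(*) FROM log GROUP BY status;"

# Usage (needs a running PostgreSQL server with the news database loaded):
#   for status, hits in fetch_all(STATUS_COUNTS):
#       print(status, hits)
```

Closing the connection in a `finally` block keeps a failed query from leaving the connection open.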
Before running the Python script, create the views used in the code with the command below. This step is only required once.
psql -d news -f create_views.sql
View 1 joins the articles and authors tables.
View 2 counts the views of each article by matching the URL path in log against the article slug in articles.
View 3 groups the log rows by day and counts the total requests for each day.
View 4 groups the log rows by day and counts the error responses for each day.
View 5 computes the percentage of requests that resulted in errors for each day.
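The view definitions themselves live in create_views.sql, which is not reproduced here. As a rough illustration of what Views 2 to 5 compute, the Python sketch below does the same counting in memory on a few invented rows. The row values, the `/article/` path prefix, the status strings, the function names and the rule that a status code of 400 or above counts as an error are all assumptions made for this example, not details taken from the database or from create_views.sql.

```python
from collections import Counter

# Invented sample rows for illustration only; not taken from the news database.
ARTICLES = [("first-post", "First post"), ("second-post", "Second post")]  # (slug, title)
LOG = [  # (path, status, day)
    ("/article/first-post", "200 OK", "day-1"),
    ("/article/first-post", "200 OK", "day-1"),
    ("/article/second-post", "200 OK", "day-1"),
    ("/article/missing", "404 NOT FOUND", "day-1"),
    ("/article/first-post", "200 OK", "day-2"),
    ("/", "200 OK", "day-2"),
]


def article_views(articles, log, prefix="/article/"):
    """Count log rows per article title by matching path to prefix + slug."""
    titles_by_path = {prefix + slug: title for slug, title in articles}
    counts = Counter(
        titles_by_path[path] for path, _status, _day in log if path in titles_by_path
    )
    return counts.most_common()  # most viewed first


def is_error(status):
    """Assumed rule: the leading numeric code is 400 or higher."""
    return int(status.split()[0]) >= 400


def error_percent_by_day(log):
    """Return {day: percentage of that day's requests that were errors}."""
    totals = Counter(day for _path, _status, day in log)
    errors = Counter(day for _path, status, day in log if is_error(status))
    return {day: 100.0 * errors[day] / totals[day] for day in totals}


def days_over_threshold(log, threshold=1.0):
    """List the days whose error percentage exceeds the threshold."""
    return sorted(day for day, pct in error_percent_by_day(log).items() if pct > threshold)
```

With the invented rows above, `days_over_threshold(LOG)` returns only `day-1`, because one of its four requests is an error while `day-2` has none.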
To run the script, use:
python Assignment.py
Example output:
What are the most popular three articles of all time?
"338647" - Candidate is jerk, alleges rival views
"253801" - Bears love berries, alleges bear views
...