- Go to spotify on the browser, click on "account." Then, click on "account privacy," and scroll down to download your extended streaming history. It says it may take up to 30 days, but it only took 3 for me.
- Once your data is ready, spotify will email you a zip file containing multiple streaming history JSON files. To work with them, we want to convert them into a single CSV file. In order to do this, simply download the "spotifyjsontocsv.py" file, drag it into the folder with all of your streaming history files, and run the following command:
python3 spotifyjsontocsv.py . my_history.csv. The first file is the file that will do the converting, the "." selects the files from the current directory, and "my_history.csv" is the file in which the output will be. - Now that you have this CSV file, it is pretty simple to turn it into a database. First, make sure you are connected to your PostgreSQL server.
- Next, open a new query window. We are going to run a few table creation commands to clean up our data. The first command is:
CREATE TABLE staging_history ( ts TIMESTAMP WITH TIME ZONE, master_metadata_album_artist_name TEXT, master_metadata_track_name TEXT, master_metadata_album_album_name TEXT, ms_played INT, skipped BOOLEAN, audiobook_chapter_title TEXT, audiobook_chapter_uri TEXT, audiobook_title TEXT, audiobook_uri TEXT, conn_country TEXT, episode_name TEXT, episode_show_name TEXT, incognito_mode BOOLEAN, ip_addr TEXT, offline BOOLEAN, offline_timestamp BIGINT, platform TEXT, reason_end TEXT, reason_start TEXT, shuffle BOOLEAN, spotify_episode_uri TEXT, spotify_track_uri TEXT);
This temporary table contains ALL the fields we extracted from our listening data. In order to decrease our table size, we only select the fields we intend to use.
- The next command creates the table in which we want to store these intended fields.
CREATE TABLE streaming_history (
endTime TIMESTAMP WITH TIME ZONE,
artistName TEXT,
trackName TEXT,
albumName TEXT,
msPlayed INT,
skipped BOOLEAN);Feel free to add or take away fields as you see fit.
-
Then, we will import our data into our staging_history table. In order to do so, open Schemas on the left side bar, and from there open Tables. next, right click on staging_history, and select "Import/Export Data." Select import, your my_history.csv file, format: csv, encoding: utf8, toggle Header on, delimiter: ",", and set escape to " (the double quote.) By default, the escape character is the single quote, which will cause an error if your album titles contain apostrophes. Click "OK".
-
Next, we'll move the data from staging into our actual table. Run the following command:
INSERT INTO streaming_history (endTime, artistName, trackName, albumName, msPlayed, skipped)
SELECT
ts,
master_metadata_album_artist_name,
master_metadata_track_name,
master_metadata_album_album_name,
ms_played,
skipped
FROM staging_history;This moves the selected columns from staging_history into the specified columns of streaming_history.
- To clean up our database, we'll delete the extraneous staging_history table with the following command:
DROP TABLE staging_history;- Now your table is created, and you can run whatever SQL commands you see fit! A few fun ones are provided in the "sqlqueries.py" file! :)