Rolling Stone Album Rankings

This week we're looking at album rankings from Rolling Stone (h/t Data Is Plural). A visual essay from The Pudding looks at what makes an album the greatest of all time, and shares the data the authors put together for the essay.

A new visual essay from The Pudding compares Rolling Stone’s “500 Greatest Albums of All Time” lists from 2003, 2012, and 2020. A methodology note says the project began with a spreadsheet by Chris Eckert and eventually led the authors to develop a dataset of their own. Theirs lists every album in the rankings — its name, genre, release year, 2003/2012/2020 rank, the artist’s name, birth year, gender, and more — plus each year’s voters. [h/t Jason Kottke]

What are the characteristics of artists and genres popular at different times?

The Data

# Option 1: tidytuesdayR package 
## install.packages("tidytuesdayR")

tuesdata <- tidytuesdayR::tt_load('2024-05-07')
## OR
tuesdata <- tidytuesdayR::tt_load(2024, week = 19)

rolling_stone <- tuesdata$rolling_stone


# Option 2: Read directly from GitHub

rolling_stone <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-05-07/rolling_stone.csv')

How to Participate

  • Explore the data, watching out for interesting relationships. We would like to emphasize that you should not draw conclusions about causation in the data. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our suggestion is to use the data provided to practice your data tidying and plotting techniques, and to consider for yourself what nuances might underlie these relationships.
  • Create a visualization, a model, a shiny app, or some other piece of data-science-related output, using R or another programming language.
  • Share your output and the code used to generate it on social media with the #TidyTuesday hashtag.
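As a starting point for exploration, here is a minimal base R sketch that counts how many albums from each genre appear on each list. It uses a small made-up data frame (`toy`) rather than the real file; the genre labels and ranks are illustrative assumptions, and only the `rank_*` and `genre` column names come from the data dictionary.

```r
# Toy rows shaped like rolling_stone.csv; the values are illustrative, not real data.
toy <- data.frame(
  genre = c("Soul", "Soul", "Punk", "Punk", "Hip-Hop"),
  rank_2003 = c(5, NA, 120, 300, NA),
  rank_2012 = c(6, NA, 118, NA, NA),
  rank_2020 = c(1, 44, NA, 250, 17)
)

rank_cols <- c("rank_2003", "rank_2012", "rank_2020")

# An album is on a list when its rank for that year is not NA.
on_list <- !is.na(toy[rank_cols])

# Sum the TRUE values within each genre to get albums per genre per list.
genre_counts <- rowsum(on_list * 1, group = toy$genre)
print(genre_counts)
```

Swapping `toy` for the loaded `rolling_stone` data frame should give the same kind of table for the real lists. Counts like these describe the lists only and say nothing about why a genre rose or fell.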

Data Dictionary

rolling_stone.csv

|variable |class |description |
|:--|:--|:--|
|sort_name |character |Name used for sorting |
|clean_name |character |Clean name |
|album |character |Album name |
|rank_2003 |double |Rank in 2003. NA if album not released yet or not in top 500. |
|rank_2012 |double |Rank in 2012. NA if album not released yet or not in top 500. |
|rank_2020 |double |Rank in 2020. NA if not in top 500. |
|differential |double |Change in rank between the 2003 and 2020 lists. Negative if the album moved down the ranking, positive if it moved up. |
|release_year |double |Release year |
|genre |character |Album genre |
|type |character |Album type |
|weeks_on_billboard |double |Weeks on Billboard |
|peak_billboard_position |double |Peak Billboard position |
|spotify_popularity |double |Spotify popularity. NA if not on Spotify. |
|spotify_url |character |Spotify URL. NA if not on Spotify. |
|artist_member_count |double |Number of artists in the group |
|artist_gender |character |Gender of the artist(s). Male/Female if it's a mixed-gender group. |
|artist_birth_year_sum |double |Sum of the artists' birth years, e.g. for a 2-member group with one person born in 1945 and another in 1950, the value is 3895. |
|debut_album_release_year |double |Debut album release year |
|ave_age_at_top_500 |double |Average age of the artist(s) at the time of the top 500 album |
|years_between |double |Years between debut album and top 500 album |
|album_id |character |Album ID. Starts with NOS if the album is not on Spotify. |
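Two of the columns above are easier to read after a little arithmetic. The sketch below, in base R on made-up rows, divides `artist_birth_year_sum` by `artist_member_count` to get a mean birth year, and subtracts the 2020 rank from the 2003 rank to get a rank change with the same sign convention described for `differential`. The `toy` data frame and the new column names (`mean_birth_year`, `rank_change`) are hypothetical, and how the published `differential` column treats albums missing from one list is not stated here, so this sketch simply lets `NA` propagate.

```r
# Toy rows shaped like rolling_stone.csv; the values are illustrative, not real data.
toy <- data.frame(
  album = c("Album A", "Album B", "Album C"),
  rank_2003 = c(10, 200, NA),
  rank_2020 = c(25, 150, 40),
  artist_member_count = c(2, 1, 4),
  artist_birth_year_sum = c(3895, 1970, 7900)
)

# Mean birth year per act: the stored sum divided by the member count.
toy$mean_birth_year <- toy$artist_birth_year_sum / toy$artist_member_count

# Rank change from 2003 to 2020: positive when the album moved up the list,
# negative when it moved down. Assumption: this mirrors the sign convention
# described for `differential`; NA ranks give an NA change here.
toy$rank_change <- toy$rank_2003 - toy$rank_2020

print(toy[, c("album", "mean_birth_year", "rank_change")])
```

For the two-member example in the dictionary, 3895 / 2 gives a mean birth year of 1947.5.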

Cleaning Script

Downloaded from Rolling Stone 500 (public).

Changed column names, replacing white space with underscores, and making all letters lowercase.

Removed Chartmetric Link and Album ID Quoted columns.

Replaced "N/A", "Not on Spotify", and "-" values with empty cells.
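The steps above could be sketched in base R roughly as follows. This is not the original cleaning script: the `raw` data frame, its column names, and its values are hypothetical stand-ins for the downloaded spreadsheet, and only one of the removed columns is shown.

```r
# Hypothetical raw rows; column names and values are made up for illustration.
raw <- data.frame(
  "Clean Name" = c("Artist A", "Artist B"),
  "Spotify Popularity" = c("71", "Not on Spotify"),
  "Peak Billboard Position" = c("-", "3"),
  "Chartmetric Link" = c("N/A", "N/A"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Lowercase the column names and replace white space with underscores.
names(raw) <- tolower(gsub("\\s+", "_", names(raw)))

# Drop a column that is not kept in the final file.
raw$chartmetric_link <- NULL

# Turn the placeholder strings into missing values.
placeholders <- c("N/A", "Not on Spotify", "-")
raw[] <- lapply(raw, function(col) {
  col[col %in% placeholders] <- NA
  col
})

# Re-infer column types now that the placeholders are gone.
cleaned <- utils::type.convert(raw, as.is = TRUE)
str(cleaned)
```

Replacing the placeholders before converting types is what lets columns such as `spotify_popularity` end up numeric rather than character.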