Data Analysis of Movie Industry from 1980-2020

This project conducts exploratory data analysis and data visualization of films produced between 1980 and 2020. It aims to explore the relationships between various features of the films, such as genre, rating, budget, gross, director, writer, and runtime.

Exploratory data analysis lets us identify trends, patterns, and relationships within the dataset, which can then be visualized with bar charts, scatter plots, and histograms. This helps us better understand the characteristics of the films produced during this period and can inform predictions about future films.

The dataset is from Kaggle.

You may want to take a look at the report: Data Analysis Report (PDF).

Question: How were the films selected?

Data collector's answer:

First of all, the data was scraped automatically by a Python script that queries IMDb's advanced search tool. Here's an example query using just the year and type of content (feature film):

https://www.imdb.com/search/title?title_type=feature&release_date=1980-01-01,1980-12-31&count=100

This returns 100 films from the year 1980, ordered by popularity. The script simply selects all those films, one by one, from top to bottom.
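The per-year query described above can be sketched as follows. The collector's actual script is not shown here, so the function name `build_search_url` is hypothetical and the 1980-2020 range is an assumption taken from the project title; the sketch only builds the URLs and does not fetch or parse any pages.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.imdb.com/search/title"


def build_search_url(year: int, count: int = 100) -> str:
    """Build an advanced-search URL for feature films released in `year`.

    Hypothetical helper: it mirrors the example query above, not the
    collector's real script.
    """
    params = {
        "title_type": "feature",
        "release_date": f"{year}-01-01,{year}-12-31",
        "count": count,
    }
    # safe="," keeps the comma unescaped so the URL matches the example query.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


# One URL per year; the range is assumed from the project title.
urls = [build_search_url(year) for year in range(1980, 2021)]
print(urls[0])
```

Each URL would then be requested in turn and its result list read from top to bottom, which is where the popularity ordering comes from.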

Popularity is really the only criterion.

The analysis explores the following questions:

Which range of IMDb scores is most common among the movies?

Which genres have the highest average ratings?

How are movie releases distributed across years? Did specific genres increase or decrease during a particular period?

What is the relationship between budget and revenue? Do high-budget films generate higher revenues?

Is there a relationship between the duration of films and their ratings, budgets, and revenues?

What are the director-star pairs that have collaborated the most, have the highest average IMDb score, and have earned the most revenue together?

Which companies earned the most revenue?

How many films were released in each season? How did the revenues of the films vary by the season they were released in?

Which actors have appeared in the most films?
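As a rough illustration of how two of these questions (average rating per genre, and the budget-revenue relationship) might be approached, here is a minimal sketch in plain Python. The field names `genre`, `score`, `budget`, and `gross` and the sample rows are assumptions made up for illustration; they are not taken from the notebook, which may use different names and tooling.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical sample rows; field names and values are illustrative only.
films = [
    {"genre": "Drama", "score": 8.0, "budget": 10.0, "gross": 30.0},
    {"genre": "Drama", "score": 7.0, "budget": 20.0, "gross": 50.0},
    {"genre": "Action", "score": 6.0, "budget": 40.0, "gross": 90.0},
    {"genre": "Action", "score": 6.5, "budget": None, "gross": 120.0},
]


def mean_score_by_genre(rows):
    """Return the average score per genre, skipping rows with no score."""
    totals = defaultdict(lambda: [0.0, 0])  # genre -> [sum of scores, count]
    for row in rows:
        if row["score"] is None:
            continue
        totals[row["genre"]][0] += row["score"]
        totals[row["genre"]][1] += 1
    return {genre: total / n for genre, (total, n) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def budget_gross_correlation(rows):
    """Correlate budget with gross, using only rows where both are present."""
    pairs = [
        (row["budget"], row["gross"])
        for row in rows
        if row["budget"] is not None and row["gross"] is not None
    ]
    budgets, grosses = zip(*pairs)
    return pearson(budgets, grosses)


print(mean_score_by_genre(films))
print(budget_gross_correlation(films))
```

A correlation close to 1 would suggest that higher budgets go with higher revenues, though on real data it says nothing about causation, and rows with missing budgets are simply dropped here.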

Repository: gururaser/film-analysis-project
