Supermarket Sales Analysis

Grocery stores are a vital part of everyday life, providing us with the food and essentials we need. Many people use grocery delivery applications to order their products, making it easy to shop from home.

Each transaction made through these applications is recorded in detail, creating a valuable dataset. This project looks at data from these transactions to understand how well these stores are performing.

Dataset

The dataset, sourced from Kaggle, simulates grocery sales activity in the Indian state of Tamil Nadu.

The dataset includes various columns that provide detailed information about each transaction at the Supermarket.

Link to the Dataset : Supermarket Sales Dataset

Problem Statement

  • To gain insights into supermarket sales performance by understanding the patterns and trends in customer behavior, product categories and regional sales.

  • This Exploratory Data Analysis (EDA) aims to address the following key questions :

    • Customer Behavior Analysis : What are the purchasing patterns of customers based on different categories and sub-categories? How does customer spending vary across cities and states?

    • Sales Trends : Are there observable trends in sales over time? How do sales figures fluctuate across different months or seasons?

    • Discount Impact : What is the relationship between discounts and sales? How do discounts influence the profit margins across different categories and regions?

    • Profit Analysis : What are the profit margins associated with various product categories and sub-categories? How do these margins vary by city and state?

    • Regional Performance : How do sales and profit performance differ across different regions and states? Are there specific regions that contribute more significantly to overall sales and profits?

    • Category Insights : What are the most and least popular product categories and sub-categories? How does the popularity of these categories vary by location and over time?

  • This analysis will provide a deeper understanding of supermarket sales dynamics, revealing trends and patterns that can inform inventory management, promotional strategies and regional marketing efforts.

Setting up the Environment

This project requires Jupyter Notebook, which you can install and launch from the terminal.

  • Install the Notebook
pip install notebook
  • Run the Notebook
jupyter notebook

Libraries required for the Project

Pandas

  • Go to the terminal and run this code
pip install pandas
  • Go to Jupyter Notebook and run this code from a cell
!pip install pandas

Matplotlib

  • Go to the terminal and run this code
pip install matplotlib
  • Go to Jupyter Notebook and run this code from a cell
!pip install matplotlib

Seaborn

  • Go to the terminal and run this code
pip install seaborn
  • Go to Jupyter Notebook and run this code from a cell
!pip install seaborn

Getting Started

  • Clone this repository to your local machine by using the following command :
git clone https://github.com/TheMrityunjayPathak/Supermarket-Sales-Analysis.git

Steps involved in the Project

Importing Libraries

  • Importing pandas, matplotlib and seaborn libraries

Reading CSV File

  • Reading the CSV file with the pd.read_csv() function
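
A minimal sketch of this step is shown here. The column names and the two sample rows are assumptions made for illustration, and an in-memory string stands in for the real file so the snippet runs on its own. With the downloaded dataset you would pass the file path to pd.read_csv() instead.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the real CSV file.
# Column names are assumed; check them against the downloaded dataset.
sample_csv = io.StringIO(
    "Order ID,Customer Name,Category,Sub Category,City,"
    "Order Date,Region,Sales,Discount,Profit,State\n"
    "OD1,Customer A,Snacks,Cookies,Vellore,"
    "11-08-2017,North,1254,0.12,401.28,Tamil Nadu\n"
    "OD2,Customer B,Beverages,Health Drinks,Salem,"
    "06-12-2018,South,749,0.18,149.80,Tamil Nadu\n"
)

# With the real file, replace sample_csv with the path to the CSV.
df = pd.read_csv(sample_csv)
print(df.head())
```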

Overview of the Dataset

  • Information about shape and size of the dataset

  • Columns present in the dataset

  • Info about the dataset
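
The overview checks above might look roughly like this. The small DataFrame is a made-up stand-in for the real data, and its column names are assumptions.

```python
import pandas as pd

# Hypothetical sample; the real notebook would use the DataFrame read from the CSV.
df = pd.DataFrame(
    {
        "Category": ["Snacks", "Beverages", "Snacks"],
        "City": ["Vellore", "Salem", "Salem"],
        "Sales": [500, 400, 300],
        "Profit": [120.0, 80.0, 60.0],
    }
)

print(df.shape)             # (number of rows, number of columns)
print(df.size)              # total number of cells (rows * columns)
print(df.columns.tolist())  # column names
df.info()                   # dtypes and non-null counts per column
```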

Handling Null values in the Dataset

  • This dataset does not contain any null values
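
One way to confirm this is to count missing values per column. The sketch below uses a made-up sample with assumed column names, not the real dataset.

```python
import pandas as pd

# Hypothetical sample with no missing values.
df = pd.DataFrame(
    {
        "Category": ["Snacks", "Beverages", "Snacks"],
        "Sales": [500, 400, 300],
    }
)

# Number of null values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# True if any cell in the DataFrame is null.
has_nulls = bool(df.isnull().values.any())
print(has_nulls)
```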

Unique values in Each Categorical Column

  • Unique values in customer name column

  • Unique values in category column

  • Unique values in sub category column

  • Unique values in city column

  • Unique values in region column
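
The unique-value checks listed above can be sketched as a loop over the categorical columns. The column names and sample rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample; column names are assumed.
df = pd.DataFrame(
    {
        "Customer Name": ["Customer A", "Customer B", "Customer A"],
        "Category": ["Snacks", "Beverages", "Snacks"],
        "Sub Category": ["Cookies", "Health Drinks", "Chocolates"],
        "City": ["Vellore", "Salem", "Salem"],
        "Region": ["North", "South", "South"],
    }
)

categorical_cols = ["Customer Name", "Category", "Sub Category", "City", "Region"]

# Print the number of distinct values and the values themselves for each column.
for col in categorical_cols:
    print(col, df[col].nunique(), df[col].unique().tolist())
```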

Changing DataType of Columns

  • Modifying the datatype of order date column to pandas datetime format
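
A sketch of the conversion, assuming the column is named "Order Date" and holds month-day-year strings. The format string is a guess for this sample only; adjust it to match the real file.

```python
import pandas as pd

# Hypothetical sample with dates stored as text.
df = pd.DataFrame({"Order Date": ["11-08-2017", "06-12-2018"]})

# Convert the text column to pandas datetime.
# The format "%m-%d-%Y" is an assumption about how the dates are written.
df["Order Date"] = pd.to_datetime(df["Order Date"], format="%m-%d-%Y")

print(df.dtypes)
```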

Utilizing existing information to create new Columns

  • Extracting year, month and day from the order date column

  • Calculating the discount amount from the discount percentage
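
The derived columns could be built as follows. This is a sketch under two assumptions: the discount is stored as a fraction (0.25 means 25%), and the discount amount is that fraction multiplied by the sales value. The new column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample; "Order Date" is already in datetime format.
df = pd.DataFrame(
    {
        "Order Date": pd.to_datetime(["2017-11-08", "2018-06-12"]),
        "Sales": [1000, 500],
        "Discount": [0.25, 0.5],  # assumed to be a fraction of the sales value
    }
)

# Split the order date into separate year, month and day columns.
df["Year"] = df["Order Date"].dt.year
df["Month"] = df["Order Date"].dt.month
df["Day"] = df["Order Date"].dt.day

# Assumed formula: discount amount = sales * discount fraction.
df["Discount Amount"] = df["Sales"] * df["Discount"]

print(df)
```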

Statistical Analysis

  • No. of products sold in each category

  • No. of products sold in each sub category

  • No. of products sold in each city

  • No. of products sold in each region

  • No. of products sold each year, month and date etc.
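
Counts like these can be obtained with value_counts(), which counts the rows for each distinct value. The sketch below treats each row as one product sold, which is an assumption about how the dataset is organized; the sample data and column names are made up.

```python
import pandas as pd

# Hypothetical sample; each row is treated as one product sold.
df = pd.DataFrame(
    {
        "Category": ["Snacks", "Beverages", "Snacks", "Snacks"],
        "Region": ["North", "South", "South", "North"],
        "Year": [2017, 2017, 2018, 2018],
    }
)

# Number of rows per category, most frequent first.
category_counts = df["Category"].value_counts()
print(category_counts)

# The same pattern applies to any other column, such as region or year.
region_counts = df["Region"].value_counts()
year_counts = df["Year"].value_counts().sort_index()
print(region_counts)
print(year_counts)
```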

Data Visualization

  • No. of products sold in each category

  • No. of products sold in each sub category

  • No. of products sold in each city

  • No. of products sold in each region

  • No. of products sold each year

  • No. of products sold each month

  • No. of products sold each date

  • Total sales in each category

  • Total sales in each sub category

  • Total sales in each region

  • Total sales in each city

  • Total sales in each month

  • Total sales in each year

  • Total profit in each category

  • Total profit in each sub category

  • Total profit in each region

  • Total profit in each city

  • Total profit in each month

  • Total profit in each year

  • Customers with highest amount of total sales

  • Customers with highest profit on their purchase

  • Total discount availed by customers
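
As an illustration of one chart from this list, total sales in each category can be aggregated with groupby() and drawn as a bar chart. This sketch uses plain matplotlib rather than seaborn, along with made-up data and assumed column names, so it is not necessarily how the notebook produces its plots.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; column names are assumed.
df = pd.DataFrame(
    {
        "Category": ["Snacks", "Beverages", "Snacks"],
        "Sales": [500, 400, 300],
    }
)

# Sum the sales for each category, largest first.
sales_by_category = (
    df.groupby("Category")["Sales"].sum().sort_values(ascending=False)
)

# Draw one bar per category.
fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(sales_by_category.index.tolist(), sales_by_category.values)
ax.set_title("Total sales in each category")
ax.set_xlabel("Category")
ax.set_ylabel("Total sales")
fig.tight_layout()
```

The same groupby-then-plot pattern would apply to the profit and discount charts by swapping the grouped column and the summed column.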

Conclusion

  • The Exploratory Data Analysis (EDA) of the Supermarket Sales Dataset has provided a comprehensive understanding of the sales dynamics, customer behaviors and regional performance of the supermarket chain.

  • These findings can inform inventory management, promotional strategies and regional marketing efforts.