Mrityunjay Pathak TheMrityunjayPathak

About     Skills     Projects     Certificates     Blogs

About

Hi 👋, I'm Mrityunjay Pathak

I'm a Data Scientist with a knack for uncovering patterns and trends that drive smarter decisions.

🎯 Tools and Technologies

• Programming Language : I'm familiar with Python, a powerful language for data science and machine learning.

• Libraries : I'm also familiar with essential data science libraries like NumPy, Pandas, Matplotlib, Seaborn and Plotly.

• Machine Learning : I have experience with Scikit-learn, a machine learning library used widely across industries.

• Database : I can work with MySQL, a popular database management system to handle and retrieve data effectively.

• BI Tool : I'm familiar with Power BI to perform data analysis, create dynamic dashboards and extract meaningful insights.

• Web Framework : I have experience with FastAPI, a high-performance web framework for building APIs with Python.

• Containerization : I can work with Docker for packaging applications and their dependencies into containers.

• Version Control : I'm familiar with Git, which helps in keeping track of changes in code and collaborating effectively with a team.

📫 Connect with Me

Kaggle  |  LinkedIn  |  GitHub  |  Medium  |  Portfolio

Skills

Projects

AutoIQ : Car Price Prediction

➔ Problem

In the used car market, buyers and sellers often struggle to determine a fair price for their vehicle.

This project aims to provide accurate and transparent pricing for used cars by analyzing real-world data.

It helps both buyers and sellers make data-driven decisions and ensures fair transactions.

➔ Solution

To address this problem, I built and deployed a complete end-to-end machine learning pipeline :

Data Collection

Scraped a dataset of 2,800+ used cars from Cars24 using Selenium and BeautifulSoup.

Data Optimization

Optimized the memory consumption of the dataset by downcasting data types.

Stored the dataset in Parquet format, which compresses data without losing information.

It also provides much faster read/write speeds compared to CSV.
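As a rough illustration of the downcasting step, here is a minimal sketch assuming pandas and NumPy. The column names and values are hypothetical and are not taken from the actual Cars24 dataset; the helper `downcast_numeric` is also an illustrative name.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with numeric columns stored in the smallest dtype that fits."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


# Hypothetical sample rows, stored with pandas' default 64-bit dtypes.
cars = pd.DataFrame({
    "year": np.array([2015, 2018, 2020, 2012], dtype="int64"),
    "km_driven": np.array([42000, 18000, 76000, 31000], dtype="int64"),
    "engine_litres": np.array([1.2, 1.5, 1.0, 2.0], dtype="float64"),
})

small = downcast_numeric(cars)
before = cars.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(f"{before} bytes -> {after} bytes")

# Writing to Parquet needs an engine such as pyarrow installed, e.g.:
# small.to_parquet("cars.parquet")  # hypothetical file name
```

The saving on a real dataset depends on its value ranges and on how many text columns it has, so the figure printed here says nothing about the project's own numbers.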

Preprocessing & Modeling

Implemented Scikit-learn Pipelines & ColumnTransformer to prevent data leakage.
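The idea can be sketched as follows, assuming scikit-learn and pandas. The feature names, the toy values and the choice of `LinearRegression` are assumptions made for illustration; the project's actual features and model are not described here.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the scraped dataset.
cars = pd.DataFrame({
    "km_driven": [42000, 18000, 76000, 31000, 55000, 9000],
    "fuel_type": ["Petrol", "Diesel", "Petrol", "CNG", "Diesel", "Petrol"],
    "price": [4.1, 6.8, 2.9, 3.6, 5.2, 7.4],
})
X, y = cars[["km_driven", "fuel_type"]], cars["price"]
X_train, X_test = X.iloc[:4], X.iloc[4:]
y_train = y.iloc[:4]

# Each transformer is applied only to the columns listed for it.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["km_driven"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["fuel_type"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", LinearRegression()),
])

# fit() learns the scaling statistics and categories from the training rows only.
model.fit(X_train, y_train)
# predict() reuses those fitted transformers on unseen rows.
predictions = model.predict(X_test)
```

Because the scaler and encoder are fitted inside the pipeline, statistics from the test rows never reach the preprocessing step, which is the leakage this setup is meant to avoid.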

API Deployment

Deployed the machine learning model as an API using FastAPI, with :

/predict endpoint for real-time predictions.

/health endpoint for monitoring API status.

Input validation & rate limiting for reliability.

Frontend Integration

Designed an HTML/CSS/JS website to send API calls and display predictions in a user-friendly way.

Containerization

Created a multi-stage Dockerfile with .dockerignore for building an optimized and lightweight Docker image.

➔ Impact

Built and deployed a complete machine learning pipeline as a FastAPI application.

Reduced dataset memory usage by 90% through data type optimization and Parquet conversion.

Delivered 30% lower MAE and 12% higher R² score compared to the baseline model.

Improved model stability by 70%, ensuring more consistent and reliable predictions.

Pickify : Movie Recommender System

➔ Problem

With the rise of streaming services, viewers now have access to thousands of movies across platforms.

As a result, many viewers spend more time browsing than actually watching.

This can lead to frustration, lower satisfaction and less time spent on the platform, which can hurt both the user experience and business performance.

➔ Solution

A content-based movie recommender system built with clean and modular code with proper version control.

It analyzes the metadata of 5000+ movies to recommend the top 5 similar titles based on a user-selected movie.

The system uses CountVectorizer to turn movie metadata into word-count vectors and cosine similarity to find the closest matches.

The project focuses not only on functionality but also on building a clean and scalable solution.
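To show what word counting plus cosine similarity computes, here is a plain-Python sketch. It is not the project's code: it uses simple whitespace tokens instead of scikit-learn's CountVectorizer, and the movie titles and tags are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical tag strings standing in for the real movie metadata.
movies = {
    "Space Run": "space adventure alien pilot",
    "Star Drift": "space pilot alien war",
    "Love Notes": "romance music paris",
    "Deep Orbit": "space station adventure",
}


def bag_of_words(text: str) -> Counter:
    """Count how often each lowercase token appears."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(title: str, top_n: int = 5) -> list:
    """Return up to top_n other titles, most similar first."""
    target = bag_of_words(movies[title])
    scores = [
        (other, cosine(target, bag_of_words(tags)))
        for other, tags in movies.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:top_n]]


print(recommend("Space Run", top_n=2))
```

Titles that share more tags with the selected movie score closer to 1, and titles with no shared tags score 0.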

➔ Impact

If scaled and integrated with a streaming service, this system could :

Reduce the time users spend choosing what to watch.

Increase user engagement, watch time and customer satisfaction.

Help streaming platforms retain users by offering better personalized content.

Netflix Data Analysis

➔ Objective

To analyze Netflix content data, uncovering valuable insights into how the platform evolves over time.

➔ Some Key Findings

Cleaned and analyzed a dataset of 8000+ Netflix Movies and TV Shows.

More than 60% of content on Netflix is rated for mature audiences.

Suggests that Netflix targets adult viewers to boost engagement and retention.

More than 25% of Movies and TV Shows are released on the 1st day of the month.

Shows a consistent release schedule, likely to align with subscription cycles.

More than 40% of the content on Netflix is exclusive to the United States.

Shows a strong focus on the U.S. market and content availability by location.

More than 20% of the content on Netflix falls under the "Drama" genre.

Confirms that "Drama" is a key part of Netflix's content library.

More than 23% of the content on Netflix was released in 2019 alone.

Indicates a major content push that year, possibly tied to growth or user acquisition goals.

Supermarket Sales Analysis

➔ Objective

To analyze Supermarket Sales data, identifying key factors for improving profitability and operational efficiency.

➔ Some Key Findings

Analyzed the purchasing patterns of 9000+ supermarket customers.

More than 15% of the products sold were Snacks.

Shows that Snacks are a convenient choice and a big source of revenue.

More than 32% of the sales occurred in the West region.

Suggests that the West region performs strongly compared to the others.

Health drinks and Soft drinks are the most profitable categories in Beverages.

Shows that both types of drinks sell well.

November was the most profitable month, contributing about 15% of the total annual profit.

Makes it an ideal time for running promotions and special offers.

Certificates



Blogs
