Welcome to my Data Analysis Portfolio, where I showcase various real-world projects leveraging Python, Pandas, NumPy, Matplotlib, Seaborn, Plotly, and more. Each project involves data cleaning, exploratory data analysis (EDA), visualization, and where applicable, predictive insights and feature engineering.
This repository includes analyses across various domains such as health, e-commerce, tech, and transportation.
Each folder contains:
- `dataset/`: The raw and cleaned datasets
- `notebooks/`: Jupyter/Colab notebooks with full analysis
- `reports/`: Summary reports or presentation slides
- `README.md`: Project-specific documentation
- Objective: Analyze and visualize the spread of COVID-19 globally.
- Features:
- Time-series trend analysis by country.
- Daily new cases, recoveries, and deaths.
- Choropleth maps using Plotly for global impact.
- Forecasting trends with moving averages.
- Tech Used: Pandas, Matplotlib, Plotly, GeoPandas
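The daily-new-cases and moving-average features listed above can be sketched roughly as follows. This is a minimal illustration on made-up numbers, not the notebook's actual code; the column names `date`, `country`, and `cumulative_cases`, the helper name, and the window size are all assumptions.

```python
import pandas as pd


def add_daily_and_rolling(df: pd.DataFrame, window: int = 7) -> pd.DataFrame:
    """Derive daily new cases and a rolling mean per country from cumulative counts."""
    out = df.sort_values(["country", "date"]).copy()
    # Daily new cases are the day-over-day difference of the cumulative total.
    # The first day of each country has no previous value, so it is set to 0 here.
    out["new_cases"] = out.groupby("country")["cumulative_cases"].diff().fillna(0)
    # A rolling mean smooths reporting noise; min_periods=1 avoids leading NaNs.
    out["new_cases_ma"] = out.groupby("country")["new_cases"].transform(
        lambda s: s.rolling(window, min_periods=1).mean()
    )
    return out


# Hypothetical sample data for a single country.
sample = pd.DataFrame(
    {
        "date": pd.date_range("2020-03-01", periods=5),
        "country": ["A"] * 5,
        "cumulative_cases": [10, 15, 25, 40, 60],
    }
)
result = add_daily_and_rolling(sample, window=3)
```

A moving average only smooths past values; extending it forward as a forecast assumes the recent trend continues unchanged.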
- Objective: Use Google Trends data to analyze public interest over time.
- Features:
- Extract keyword trends using the `pytrends` API.
- Compare multiple terms (e.g., COVID vs. Vaccine).
- Heatmaps for region-wise interest.
- Weekly vs Monthly interest variation.
- Tech Used: PyTrends, Pandas, Seaborn, Plotly
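As a rough sketch of the weekly vs. monthly comparison, the snippet below resamples a daily interest table with pandas. It assumes a frame with a `DatetimeIndex` and one numeric column per search term, which is the general shape of what `pytrends` returns for interest over time; the data here is synthetic so the example runs without a network call, and the function name is hypothetical.

```python
import numpy as np
import pandas as pd


def weekly_vs_monthly(interest: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Average a daily interest table into weekly and monthly buckets."""
    weekly = interest.resample("W").mean()    # weeks ending on Sunday
    monthly = interest.resample("MS").mean()  # buckets labeled by month start
    return weekly, monthly


# Synthetic stand-in for real Google Trends data (values are made up).
days = pd.date_range("2021-01-01", "2021-02-28", freq="D")
interest = pd.DataFrame(
    {"COVID": 50.0, "Vaccine": np.arange(len(days), dtype=float)},
    index=days,
)
weekly, monthly = weekly_vs_monthly(interest)
```

Comparing the two tables side by side shows how much short-term variation the monthly view hides.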
- Objective: Study the sales patterns and trends of iPhone models.
- Features:
- Quarterly and yearly revenue breakdown.
- Correlation of marketing spend vs. sales.
- Prediction of future sales using Linear Regression.
- Profit margin and market share visualization.
- Tech Used: Pandas, Seaborn, Scikit-Learn, Matplotlib
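The correlation and Linear Regression features might look something like the sketch below. The figures are invented and perfectly linear on purpose, so the fit is easy to verify; real sales data would be noisier, and the variable names are assumptions rather than the notebook's own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical quarterly figures: marketing spend and units sold.
spend = np.array([[1.0], [2.0], [3.0], [4.0]])  # shape (n_samples, 1)
units = np.array([12.0, 14.0, 16.0, 18.0])      # exactly 10 + 2 * spend

# Pearson correlation between spend and sales.
correlation = np.corrcoef(spend.ravel(), units)[0, 1]

# Fit a one-feature linear model and predict sales for an unseen spend level.
model = LinearRegression().fit(spend, units)
forecast = model.predict(np.array([[5.0]]))
```

A single-feature linear fit is only a baseline; seasonality and product launches would likely need extra features.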
- Objective: Analyze ride data to derive business insights for Uber.
- Features:
- Trip frequency analysis per day/time.
- Idle time and rush hour heatmaps.
- Outlier detection in trip durations.
- Driver utilization rates.
- Tech Used: Pandas, Matplotlib, Plotly, NumPy
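One common way to detect outliers in trip durations is the interquartile-range (IQR) rule, sketched below. The source does not say which method the notebook uses, so treat this as an assumed stand-in; the function name and sample durations are hypothetical.

```python
import pandas as pd


def flag_duration_outliers(durations: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = durations.quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (durations < lower) | (durations > upper)


# Hypothetical trip durations in minutes; the last one is suspiciously long.
trip_minutes = pd.Series([12, 15, 14, 13, 16, 15, 240])
is_outlier = flag_duration_outliers(trip_minutes)
typical_trips = trip_minutes[~is_outlier]
```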
- Objective: Collect and analyze data from live websites.
- Features:
- Web scraping using `BeautifulSoup` and `requests`.
- Clean and normalize scraped data.
- Store in CSV or SQLite database.
- Visualize top trends or products from the website.
- Tech Used: BeautifulSoup, Pandas, Matplotlib
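The scrape, clean, and store steps above can be sketched as follows. To keep the example self-contained, it parses an inline HTML string instead of fetching a live page with `requests`; the markup, CSS classes, and table schema are all hypothetical.

```python
import sqlite3

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the text of a fetched page.
HTML = """
<ul>
  <li class="product"><span class="name"> Widget </span><span class="price">$9.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">$19.50</span></li>
</ul>
"""


def parse_products(html: str) -> list[tuple[str, float]]:
    """Extract (name, price) pairs, trimming whitespace and the currency sign."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("li.product"):
        name = item.select_one(".name").get_text(strip=True)
        price_text = item.select_one(".price").get_text(strip=True)
        rows.append((name, float(price_text.lstrip("$"))))
    return rows


products = parse_products(HTML)

# Store the cleaned rows in an in-memory SQLite database and query the priciest item.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
conn.executemany("INSERT INTO products VALUES (?, ?)", products)
top_product = conn.execute(
    "SELECT name FROM products ORDER BY price DESC LIMIT 1"
).fetchone()[0]
```

Before scraping a real site, it is worth checking its terms of service and `robots.txt`.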
- Objective: Analyze Zomato’s restaurant data to uncover user trends.
- Features:
- Cuisine-based rating and price analysis.
- Location-wise restaurant density and performance.
- Sentiment analysis of reviews (optional advanced feature).
- Recommend top restaurants by region.
- Tech Used: Pandas, Seaborn, Matplotlib, NLP (optional)
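A minimal sketch of the cuisine-based summary and the per-region recommendation, using an invented table. The column names (`region`, `cuisine`, `rating`, `cost_for_two`) and the helper name are assumptions and may not match the actual dataset.

```python
import pandas as pd

# Hypothetical restaurant records.
restaurants = pd.DataFrame(
    {
        "name": ["A", "B", "C", "D"],
        "region": ["North", "North", "South", "South"],
        "cuisine": ["Indian", "Chinese", "Indian", "Italian"],
        "rating": [4.5, 3.9, 4.1, 4.7],
        "cost_for_two": [600, 400, 500, 1200],
    }
)

# Average rating and price per cuisine.
cuisine_summary = restaurants.groupby("cuisine")[["rating", "cost_for_two"]].mean()


def top_by_region(df: pd.DataFrame, n: int = 1) -> pd.DataFrame:
    """Return the n highest-rated restaurants in each region."""
    return (
        df.sort_values("rating", ascending=False)
        .groupby("region")
        .head(n)
        .sort_values(["region", "rating"], ascending=[True, False])
        .reset_index(drop=True)
    )


recommended = top_by_region(restaurants)
```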
- 📈 Interactive Dashboards using `Plotly` and `Streamlit` (optional extensions).
- 🧼 Data Cleaning Pipelines using functions/classes.
- 🧠 Basic Machine Learning integration (Regression/Clustering).
- 📊 Custom Visualizations with annotations and interactivity.
- 📝 Automated Report Generation with `nbconvert` or `Jinja2`.
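As one possible shape for a function-based cleaning pipeline, the sketch below chains small steps with `DataFrame.pipe`. The step names, the median-fill strategy, and the sample columns are illustrative assumptions, not the projects' actual pipeline.

```python
import pandas as pd


def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Trim, lowercase, and snake_case the column names."""
    return df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))


def drop_duplicate_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Remove exact duplicate rows and rebuild the index."""
    return df.drop_duplicates().reset_index(drop=True)


def fill_missing_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Fill gaps in numeric columns with each column's median."""
    numeric_cols = df.select_dtypes("number").columns
    return df.fillna({col: df[col].median() for col in numeric_cols})


# Hypothetical raw data with messy headers, a duplicate row, and a missing value.
raw = pd.DataFrame(
    {" Trip ID": [1, 1, 2, 3], "Fare Amount": [10.0, 10.0, None, 30.0]}
)
clean = (
    raw.pipe(normalize_columns)
    .pipe(drop_duplicate_rows)
    .pipe(fill_missing_numeric)
)
```

Keeping each step as a plain function makes it easy to reorder, test, or reuse steps across projects.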
| Category | Tools/Libraries |
|---|---|
| Data Handling | pandas, numpy |
| Visualization | matplotlib, seaborn, plotly, geopandas |
| ML (where used) | scikit-learn, statsmodels |
| Web Scraping | beautifulsoup4, requests, pytrends |
| IDE/Environment | VS Code, Jupyter Notebook, Google Colab |
```bash
# Clone the repository
git clone https://github.com/yourusername/data-analysis-projects.git
cd data-analysis-projects

# Open Jupyter or VS Code
jupyter notebook
# or
code .
```