This repository contains data analytics work, including exploratory data analysis, statistical modeling, and machine learning. It includes end-to-end projects as well as lab exercises that demonstrate data preprocessing, visualization, predictive modeling, and model evaluation.
End-to-end data analytics projects, each including its dataset, code, and final report:
- Environmental Indicator Modeling and Regional Classification
- Flight Delay Prediction
- NYC Real Estate Analysis and Neighborhood Classification
- Obesity Classification and Weight Prediction
Hands-on exercises, each focusing on a specific analytics concept or technique:
- lab1/ – Exploratory data analysis (EDA) on the epi dataset with summary statistics and visualization
- lab2/ – Housing price prediction using linear regression
- lab3/ – Abalone age classification with k-NN and clustering
- lab4/ – Wine classification using PCA and k-NN
- lab5/ – Classification (SVM, Naive Bayes) and regression on wine and housing datasets
- lab6/ – Regression model comparison with cross-validation and error metrics
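As a rough illustration of the kind of workflow these labs cover, the sketch below chains PCA and k-NN in the spirit of lab4. It is not the code from the repository: the built-in `iris` data stands in for the wine dataset, and the 70/30 split, the two retained components, and `k = 5` are all arbitrary choices made for the example.

```r
# Minimal sketch: PCA followed by k-NN classification.
# Assumption: built-in iris data is used in place of the lab's wine dataset.
library(class)  # provides knn()

set.seed(42)
data(iris)

# Hold out 30% of the rows for testing (illustrative split).
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))
train_x <- iris[train_idx, 1:4]
test_x  <- iris[-train_idx, 1:4]
train_y <- iris$Species[train_idx]
test_y  <- iris$Species[-train_idx]

# Fit PCA on the training features only, then project both sets
# onto the first two principal components.
pca <- prcomp(train_x, center = TRUE, scale. = TRUE)
train_pc <- predict(pca, train_x)[, 1:2]
test_pc  <- predict(pca, test_x)[, 1:2]

# Classify the held-out rows with k-NN (k = 5 is an arbitrary choice).
pred <- knn(train = train_pc, test = test_pc, cl = train_y, k = 5)

# Confusion matrix and accuracy on the held-out rows.
conf_mat <- table(predicted = pred, actual = test_y)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

Fitting `prcomp` on the training rows alone and then projecting the test rows with `predict` keeps information from the held-out data out of the transformation.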
- Languages: R
- Libraries: dplyr, tidyr, ggplot2, caret, randomForest, class, e1071, DescTools, stats
- Other Tools: Git, RStudio
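One possible way to check that the listed libraries are available before running the code is sketched below. The helper name `missing_packages` is hypothetical and not part of the repository; `stats` is left out of the list because it ships with base R.

```r
# Minimal sketch: report which of the listed packages are not yet installed.
# `missing_packages` is a hypothetical helper written for this example.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) requireNamespace(p, quietly = TRUE),
    logical(1),
    USE.NAMES = FALSE
  )
  pkgs[!installed]
}

# Packages named in this README (stats is part of base R).
pkgs <- c("dplyr", "tidyr", "ggplot2", "caret",
          "randomForest", "class", "e1071", "DescTools")

to_install <- missing_packages(pkgs)
print(to_install)

# Uncomment to install anything that is missing:
# if (length(to_install) > 0) install.packages(to_install)
```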
Inah Lee