This is a prediction application designed to find an individual's ancestral origins based on genetic data.
The dataset used is obtained from Kaggle.
Technology used : Data Analysis and Machine Learning
Exploratory Data Analysis : Involves cleaning the data and preprocessing features into a suitable format for model training and testing.
Data Visualisation : Done with matplotlib and seaborn.
Predictive Data Analysis : Building machine learning models using various classification algorithms and choosing the one with the best prediction score.
The LogisticRegression model gives a score of 1.0 on the given dataset and is therefore used for predictions.
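
The model-selection step described above might look roughly like the sketch below. It is an illustration only, not the project's actual code: the data is synthetic (generated with `make_classification` as a stand-in for the Kaggle dataset), the three candidate classifiers are assumed choices, and the "score" is assumed to be scikit-learn's `score` method, which reports mean accuracy on the held-out split for classifiers.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data standing in for the preprocessed genetic features
# (X) and ancestry labels (y). The real project uses a Kaggle dataset.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, n_classes=3, random_state=0
)

# Hold out a stratified test split so every class appears in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumption: an illustrative set of candidate classifiers.
candidates = {
    "LogisticRegression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNeighborsClassifier": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
}

# Fit each candidate and record its mean accuracy on the test split.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

# Keep the candidate with the highest test accuracy.
best_name = max(scores, key=scores.get)
print(scores)
print("Best model:", best_name)
```

Scaling is wrapped in a pipeline with the distance- and gradient-based models so that the scaler is fitted on the training split only.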
The model is integrated into a web UI using Flask.
Predictions are made on the web UI by collecting genetic information from the user.
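
As a rough idea of how a trained model can be served from Flask, here is a minimal sketch. Everything project-specific in it is an assumption: the `/predict` route, the JSON payload with a `features` list, the `N_FEATURES` constant, and the toy model trained on synthetic data are all hypothetical, and the real application may collect input through an HTML form instead.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

N_FEATURES = 5  # hypothetical number of genetic features the model expects

# Assumption: a toy model trained on synthetic data; the real app would load
# the LogisticRegression model trained on the Kaggle dataset.
X, y = make_classification(
    n_samples=200, n_features=N_FEATURES, n_informative=3, n_redundant=0,
    random_state=0,
)
model = LogisticRegression(max_iter=1000).fit(X, y)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])  # hypothetical route name
def predict():
    # Read the JSON body; fall back to an empty dict if it is missing or invalid.
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")

    # Reject input that is not a list of the expected length.
    if not isinstance(features, list) or len(features) != N_FEATURES:
        return jsonify(error=f"expected a list of {N_FEATURES} numbers"), 400

    # Reject values that cannot be converted to numbers.
    try:
        row = [float(v) for v in features]
    except (TypeError, ValueError):
        return jsonify(error="features must be numeric"), 400

    # Predict the class label for the single submitted row.
    label = model.predict([row])[0]
    return jsonify(prediction=int(label))
```

With the development server running, posting a body such as `{"features": [0.1, 0.2, 0.3, 0.4, 0.5]}` to `/predict` returns a JSON object with a `prediction` field; malformed input gets a 400 response with an `error` message.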