- Clone the repository.
- In a terminal, change into the project directory: `cd capstone-project`
- Create the conda environment from the YAML file: `conda env create -f environment.yml`
- Activate the environment: `conda activate final-project`
- Preprocess the available data and create the derived datasets: `python createDataset.py`
- Create the label distribution plots: `python visuals.py`
- Run the logistic regression experiments: `python logisticRegression.py`
- Run the XGBoost experiments: `python xgb.py`
- Run the neural network experiments: `python nn.py`
- Run the experiments for all models with leave-one-out cross-validation (LOOCV): `python leaveOneOut.py`. This takes a while, because LOOCV fits each model once per sample.
- Run the model explainer to see how much each feature matters: `python explainer.py`
- Run inference to get the top n most similar samples for each new sample: `python inference.py`. Remove the `break` in the loop if you want results for all new samples.
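
The leave-one-out step above is slow because every sample takes one turn as the test set, so a dataset with n samples needs n separate model fits. The sketch below illustrates that cost with a toy majority-class "model". It is not the code in `leaveOneOut.py`; the function names and the labels are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: NOT the implementation in leaveOneOut.py.
# All names and data below are hypothetical.
from typing import Iterator, List, Sequence, Tuple


def leave_one_out_splits(n_samples: int) -> Iterator[Tuple[List[int], int]]:
    """Yield (train_indices, test_index) pairs, one pair per sample."""
    for test_idx in range(n_samples):
        train_idx = [i for i in range(n_samples) if i != test_idx]
        yield train_idx, test_idx


def fit_majority_class(labels: Sequence[int]) -> int:
    """Toy stand-in for model training: return the most common label."""
    labels = list(labels)
    return max(set(labels), key=labels.count)


def loocv_accuracy(labels: Sequence[int]) -> Tuple[float, int]:
    """Return (accuracy, number_of_fits) for the toy model under LOOCV."""
    correct = 0
    n_fits = 0
    for train_idx, test_idx in leave_one_out_splits(len(labels)):
        # One full "training run" per held-out sample.
        prediction = fit_majority_class([labels[i] for i in train_idx])
        n_fits += 1
        correct += int(prediction == labels[test_idx])
    return correct / len(labels), n_fits


labels = [0, 0, 0, 0, 1, 1]  # hypothetical labels, not project data
accuracy, n_fits = loocv_accuracy(labels)
print(f"fits: {n_fits}, accuracy: {accuracy:.2f}")
```

With a real model in place of `fit_majority_class`, the number of fits stays equal to the number of samples, which is why the runtime grows quickly on larger datasets.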
forked from hongcui/capstone-project-bulut
Identify whether two sample records are based on the same material sample.
biosemantics/capstone-project-bulut