31 commits
ff10178
commit 1
rakeshreddy06 Oct 10, 2024
2a5b258
Update README.md
rakeshreddy06 Oct 10, 2024
1e5ac73
Update README.md
rakeshreddy06 Oct 10, 2024
a043a7b
Update README.md
rakeshreddy06 Oct 10, 2024
70a2126
Update README.md
rakeshreddy06 Oct 10, 2024
6020b09
Update README.md
rakeshreddy06 Oct 10, 2024
c0248df
Update README.md
rakeshreddy06 Oct 10, 2024
46c8a26
Update README.md
rakeshreddy06 Oct 10, 2024
6ab5664
next commit
rakeshreddy06 Oct 10, 2024
fc6585c
removed elasticnet
rakeshreddy06 Oct 10, 2024
13fe887
tree
rakeshreddy06 Oct 10, 2024
6bdb751
reverse
rakeshreddy06 Oct 10, 2024
86e8d6e
Update README.md
rakeshreddy06 Oct 10, 2024
72f9eea
Update README.md
rakeshreddy06 Oct 10, 2024
0630446
Update README.md
rakeshreddy06 Oct 10, 2024
7f29156
Update README.md
rakeshreddy06 Oct 10, 2024
898ce9d
add: pytest testing file added
nishant-k02 Oct 10, 2024
e62d613
add: README Updated with names
nishant-k02 Oct 10, 2024
ce3dbfe
Update README.md
rakeshreddy06 Oct 10, 2024
6e68e9c
Update README.md
rakeshreddy06 Oct 10, 2024
48c7167
Update README.md
rakeshreddy06 Oct 10, 2024
006a480
Update README.md
rakeshreddy06 Oct 10, 2024
e7cf6a0
Update README.md
rakeshreddy06 Oct 10, 2024
6bdb5c4
Update README.md
rakeshreddy06 Oct 10, 2024
7a313cf
Update README.md
rakeshreddy06 Oct 11, 2024
82dcefa
added a note to the read me
rakeshreddy06 Oct 11, 2024
3ba9621
making improvement
rakeshreddy06 Oct 14, 2024
2e347d4
Update finalLDA.py
rakeshreddy06 Oct 14, 2024
819833a
added generated_data to test on more data
rakeshreddy06 Oct 14, 2024
deae62a
Update README.md
rakeshreddy06 Oct 14, 2024
41f9506
Update README.md
rakeshreddy06 Oct 15, 2024
308 changes: 308 additions & 0 deletions Final_LDA.ipynb

Large diffs are not rendered by default.

100 changes: 94 additions & 6 deletions README.md
@@ -1,8 +1,96 @@
# Project 1
# README

**TEAM:**
- Rakesh Reddy - A20525389
- Geeta Hade - A20580824
- Nishant Khandhar - A20581012
- Amogh Vastrad - A20588808



## Overview


This project implements a Linear Discriminant Analysis (LDA) model for dimensionality reduction and classification. The model projects high-dimensional data into a lower-dimensional space, choosing directions that maximize the separation between classes relative to the spread within each class. It is particularly useful for visualizing data with a reduced number of features.
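The projection described above can be sketched with NumPy as follows. This is a minimal illustration of the standard scatter-matrix formulation of LDA, not the code in `Final_LDA.ipynb`; the function name `lda_fit_transform` and its signature are assumptions made for this example.

```python
import numpy as np

def lda_fit_transform(X, y, n_components):
    """Project X onto its top n_components discriminant directions (hypothetical helper)."""
    classes = np.unique(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    S_w = np.zeros((n_features, n_features))  # within-class scatter
    S_b = np.zeros((n_features, n_features))  # between-class scatter
    for c in classes:
        X_c = X[y == c]
        mean_c = X_c.mean(axis=0)
        centered = X_c - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += X_c.shape[0] * (diff @ diff.T)

    # Solve S_w^{-1} S_b w = lambda w; pinv tolerates a near-singular S_w.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]  # largest eigenvalue first
    W = eigvecs[:, order[:n_components]].real
    return X @ W, W
```

Calling `lda_fit_transform(X, y, 2)` returns the projected data together with the projection matrix `W`, so the same `W` can be reused to project unseen samples.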



### When to Use the Model

- **Classification Tasks**: Suitable for assigning samples to one of several known classes.
- **Multivariate Normal Distribution Assumption**: Most effective when the data in each class is approximately multivariate normal and the classes share a similar covariance structure.
- **Dimensionality Reduction**: Reduces data to fewer dimensions while preserving class separability.


## Running the Program

### Step 1: Generate the Data

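To generate synthetic data, use the following command. It is written for the Windows `py` launcher; on other platforms, call the script with your Python interpreter (for example `python testDataGenerator.py`) and the same arguments: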
```bash
py .\testDataGenerator.py -N <Sample size> -f <features> -c <classes> -seed 10000 -output_file generated_data.csv
```



Example:
```bash
py .\testDataGenerator.py -N 150 -f 8 -c 5 -seed 10000 -output_file generated_data.csv
```


### Step 2: Run the Program

The final LDA model and the steps to execute it are in the notebook `Final_LDA.ipynb`, which is our final code and the file used to test the program. Open the notebook in Jupyter (or another notebook environment) and run its cells in order.




## Model Testing

### Step 1: Training on Available Dataset (Iris)

- The model was first built and checked on the Iris dataset, a standard classification benchmark in which flowers are assigned to one of three iris species.

### Step 2: Generated Data

- The model was tested using synthetic data generated from `testDataGenerator.py`, allowing for testing under various conditions with variable classes, samples, and features.
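For illustration, synthetic class-labelled data of this kind could be produced as in the sketch below. This is not the contents of `testDataGenerator.py`; the function name `make_gaussian_classes`, the `spread` parameter, and the choice of one Gaussian cluster per class are all assumptions made for this example.

```python
import numpy as np

def make_gaussian_classes(n_samples, n_features, n_classes, seed=10000, spread=5.0):
    """Draw samples from one unit-variance Gaussian cluster per class (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # One random centre per class; `spread` controls how far apart centres tend to be.
    centres = rng.normal(0.0, spread, size=(n_classes, n_features))
    # Assign each sample a class label, then add noise around that class centre.
    y = rng.integers(0, n_classes, size=n_samples)
    X = centres[y] + rng.normal(0.0, 1.0, size=(n_samples, n_features))
    return X, y
```

Fixing the seed makes repeated calls return identical data, which mirrors the role of the `-seed` argument in the command shown earlier.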

### Step 3: Manual Train-Test Split

- The data is split manually into training and test sets so that performance is measured on samples the model has not seen during training, which helps detect overfitting.
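A manual split can be sketched as a shuffled index partition. The helper below is a hypothetical illustration, not the notebook's code; its name and its parameters (`test_size` as a fraction, `random_state` as a seed) are assumptions chosen to match the parameter names listed in this README.

```python
import numpy as np

def train_test_split_manual(X, y, test_size=0.2, random_state=None):
    """Shuffle sample indices and hold out a fraction for testing (hypothetical helper)."""
    rng = np.random.default_rng(random_state)
    indices = rng.permutation(len(X))
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]
```

Because the same shuffled indices are applied to `X` and `y`, every sample keeps its label, and no sample appears in both sets.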

### Step 4: Accuracy Calculation

- The model’s predicted classes are compared against the true classes to calculate accuracy.
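Accuracy here means the fraction of samples whose predicted class equals the true class. A minimal sketch, with the hypothetical function name `accuracy_score_manual`:

```python
import numpy as np

def accuracy_score_manual(y_true, y_pred):
    """Return the fraction of predictions that match the true labels (hypothetical helper)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean(y_true == y_pred))
```

For example, `accuracy_score_manual([0, 1, 2, 2], [0, 1, 1, 2])` gives 0.75, since three of the four predictions are correct.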

## User-Exposed Parameters

Users can adjust the following parameters to tune model performance:

- **nComponents**: Number of discriminant components to keep, which sets the dimensionality of the projected data. LDA yields at most `min(features, classes - 1)` useful components.
- **Regularization**: Adds a regularization term to the within-class scatter matrix, which stabilizes its inversion and reduces overfitting on high-dimensional data.
- **Solver**: Selects the method used to compute the eigenvalues and eigenvectors.
- **Test Size**: Determines the proportion of the data held out for testing.
- **Random State**: Seed that makes train-test splits and generated data reproducible.
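Two of these parameters can be illustrated with a short sketch. It assumes that regularization is applied as a ridge term on the diagonal of the within-class scatter matrix, which is one common approach and may differ from what the notebook does; both function names are hypothetical.

```python
import numpy as np

def max_components(n_features, n_classes):
    """Upper bound on the number of useful LDA components (hypothetical helper)."""
    return min(n_features, n_classes - 1)

def regularize_scatter(S_w, regularization=1e-3):
    """Add a ridge term to the diagonal of S_w so it can be inverted (assumed approach)."""
    return S_w + regularization * np.eye(S_w.shape[0])
```

With the example data from Step 1 (8 features, 5 classes), `max_components(8, 5)` returns 4, so asking for more than four components would not add discriminative information.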

## Known Limitations

The implementation may encounter difficulties with:

1. **Highly Imbalanced Data**: Performance can degrade when class sizes differ substantially.
2. **High Dimensionality with Few Samples**: When the number of features exceeds the number of samples, the within-class scatter matrix becomes singular and results can be unstable. Reducing dimensionality with PCA before applying LDA is one possible workaround.
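The PCA workaround mentioned in the second item could look like the sketch below, which reduces the feature count before LDA is applied. It is an illustration under assumptions rather than part of the current implementation; `pca_reduce` is a hypothetical name.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered data onto its top principal directions (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components].T
    return X_centered @ components, components
```

A common guideline is to keep at most `n_samples - n_classes` principal components, since the within-class scatter matrix cannot have a higher rank than that.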

Put your README here. Answer the following questions.

* What does the model you have implemented do and when should it be used?
* How did you test your model to determine if it is working reasonably correctly?
* What parameters have you exposed to users of your implementation in order to tune performance? (Also perhaps provide some basic usage examples.)
* Are there specific inputs that your implementation has trouble with? Given more time, could you work around these or is it fundamental?
Empty file removed elasticnet/__init__.py
Empty file.
17 changes: 0 additions & 17 deletions elasticnet/models/ElasticNet.py

This file was deleted.

Empty file removed elasticnet/models/__init__.py
Empty file.
51 changes: 0 additions & 51 deletions elasticnet/tests/small_test.csv

This file was deleted.

19 changes: 0 additions & 19 deletions elasticnet/tests/test_ElasticNetModel.py

This file was deleted.

40 changes: 0 additions & 40 deletions generate_regression_data.py

This file was deleted.
