148 changes: 143 additions & 5 deletions README.md
@@ -1,8 +1,146 @@
# Project 1

# Group Members
* Sharan Rama Prakash Shenoy - A20560684
* Adarsh Chidirala - A20561069
* Venkata Naga Lakshmi Sai Snigdha Sri Jata - A20560684

---
## Usage Instructions

### Installation

To get started with this project, you first need **Python 3.x**. Then follow these installation steps:

#### 1. Clone the Repository to your local machine:

```bash
git clone https://github.com/adarsh-chidirala/Project1.git
```
#### 2. Install the required libraries from `requirements.txt` using pip:

```bash
pip install -r requirements.txt
```
#### 3. Run the Test Script

```bash
# for Windows
py -m elasticnet.tests.test_ElasticNetModel

# or for macOS
pytest -s elasticnet/tests/test_ElasticNetModel.py
```
This will run the test cases, print out the evaluation metrics, and generate the plots.

### Introduction
This project implements a form of linear regression with ElasticNet regularization. The model combines two regularization techniques, Lasso and Ridge regression, represented by the L1 and L2 penalties respectively.
- **L1 Regularization**: adds a penalty equal to the sum of the absolute values of the model coefficients. This supports feature selection, since it drives the coefficients of insignificant features to zero and retains only the most significant ones.
- **L2 Regularization**: adds a penalty equal to the sum of the squared values of the model coefficients. This shrinks the coefficients, helping to prevent overfitting, particularly in situations where the features are strongly correlated.
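
Putting the two together, the objective minimized by this implementation (it matches `loss_calculation` in `elasticnet/models/ElasticNet.py`, with λ = `regularization_strength` and ρ = `l1_ratio`) is:

```math
\mathcal{L}(\beta, b) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - x_i^\top \beta - b\right)^2 + \lambda\left(\rho\,\lVert\beta\rVert_1 + (1-\rho)\,\lVert\beta\rVert_2^2\right)
```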

### Usage of Elastic Net Regression

- **Initialization**: Start by creating an instance of `ElasticNetModel`, setting the parameters that control the level of regularization. This is important for managing model complexity and ensuring good performance.

- **Training**: Call the `fit` method with the training portion of the dataset, which consists of the input features and the output variable. This step learns and fine-tunes the coefficients by optimizing the ElasticNet loss function.

- **Prediction**: Once the model is trained, use the `predict` method on the returned results object to obtain predictions on the test datasets. This lets the model leverage the relationships it has learned to make forecasts for data it has not encountered before. A minimal usage sketch follows below.
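
The class and method names in this sketch come from `elasticnet/models/ElasticNet.py`; the toy data and hyperparameter values are made up for illustration:

```python
import numpy as np
from elasticnet.models.ElasticNet import ElasticNetModel

# Made-up toy data: 100 samples, 2 features, linear target plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + 12.0 + rng.normal(scale=0.5, size=100)

# Initialize with explicit hyperparameters (values are illustrative)
model = ElasticNetModel(regularization_strength=0.1, l1_ratio=0.5,
                        max_iterations=1000, learning_rate=0.01)

# fit() returns an ElasticNetModelResults object
results = model.fit(X, y)

predictions = results.predict(X)        # predictions for a feature matrix
print("R^2:", results.r_squared(X, y))  # goodness of fit
results.display_output_summary()        # intercept, coefficients, final loss
```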


## 1. What does the model you have implemented do and when should it be used?

### ElasticNet Model Overview

The **ElasticNet model** enhances linear regression by incorporating both L1 (Lasso) and L2 (Ridge) regularization penalties. It is particularly useful when we want to **balance bias and variance** or when we are **handling high-dimensional data**.

The model we implemented combines L1 and L2 to give a better solution. It also gives us more control: we can change the hyperparameter values to arrive at the best-fit solution.


## 2. How did you test your model to determine if it is working reasonably correctly?
### Model Testing Process

The strengths of the model have been demonstrated through several test cases designed to ensure it behaves reasonably under different conditions:

1. **Standard dataset test**: We ran the model on a small test CSV file (`small_test.csv`) to check for reasonable predictions; comparing actual and predicted values showed a good correlation. We also ran it on a larger test CSV file (`data_long.csv`) to check accuracy by documenting the R² values.

2. **Highly correlated features test**: We tested the performance with highly correlated input features to see if ElasticNet could address multicollinearity effectively.

3. **Alpha and L1 ratio variation**: Tried different combinations of `regularization_strength` and `l1_ratio` to understand their influence on the model's behavior.

4. **Grid parameter options**: We provided two grid-parameter sets that the user can switch between by uncommenting the desired one. The small grid computes in about 5 to 6 seconds; the large grid searches more combinations for greater accuracy but takes around 5 minutes on 3,000 rows of data (20% of the total).

Each test calculates **Mean Squared Error (MSE)**, **Mean Absolute Error (MAE)**, and **R-squared (R2)**. Additionally, **scatter** and **residual plots** are created to visualize the model's performance.
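
For reference, these metrics can be computed with plain NumPy; the following is only a sketch with made-up placeholder arrays, not the exact test code:

```python
import numpy as np

y_test = np.array([12.30, 8.32, 16.04, 15.31])       # placeholder actual values
predictions = np.array([12.27, 8.88, 15.62, 14.99])  # placeholder predicted values

mse = np.mean((y_test - predictions) ** 2)        # Mean Squared Error
mae = np.mean(np.abs(y_test - predictions))       # Mean Absolute Error
ss_res = np.sum((y_test - predictions) ** 2)      # residual sum of squares
ss_tot = np.sum((y_test - np.mean(y_test)) ** 2)  # total sum of squares
r2 = 1 - ss_res / ss_tot                          # R-squared
print(f"MSE={mse:.4f}  MAE={mae:.4f}  R2={r2:.4f}")
```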


## 3. What parameters have you exposed to users of your implementation in order to tune performance? (Also perhaps provide some basic usage examples.)
### Tuning ElasticNet Model Parameters

The ElasticNet model exposes the following parameters for tuning performance:

- **regularization_strength**: Controls the degree of regularization applied. Higher values increase the penalty on model coefficients to reduce overfitting.
- **l1_ratio**: Determines the mix between L1 (Lasso) and L2 (Ridge) regularization. A value of 0 corresponds to pure Ridge, 1 corresponds to pure Lasso, and values between 0 and 1 blend both methods.
- **max_iterations**: Sets the maximum number of iterations for the optimization algorithm.
- **tolerance**: Defines the threshold for convergence; the algorithm stops when changes in the coefficients are smaller than this value.
- **learning_rate**: Controls the step size during optimization, affecting the speed and stability of convergence.

### Additional Code Explanation
- These parameters can be adjusted by users to match their datasets and improve model performance.
- We split the data into two parts: 80% for training and 20% for testing.
- The results of each specific test run are also documented separately in a file called `results.txt`.
- The plot images are stored in the directory for reference and comparison.
- We included a function called `ml_grid_search` so that the hyperparameters can be changed according to user requirements and the best-fit model can be chosen based on the winning hyperparameter combination (see the sketch after the usage example below).

### Basic Usage Example

```python
# Large grid parameter variations (slower, more combinations) -- uncomment to use
# regularization_strength_values = [0.01, 0.1, 0.5, 1.0, 5.0]
# l1_ratio_values = [0.1, 0.2, 0.5, 0.7, 0.9]
# learning_rate_values = [0.001, 0.01, 0.1]
# max_iterations_values = [500, 1000, 2000]

# Small grid parameter variations (fast, ~5-6 s) -- active by default
regularization_strength_values = [0.1, 0.5]
l1_ratio_values = [0.1, 0.5]
learning_rate_values = [0.01]
max_iterations_values = [1000]
```
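
The repository's `ml_grid_search` consumes these lists. Since its exact signature isn't shown above, the following is only a hedged sketch of the kind of loop such a search performs, using `itertools.product` together with the model's public API; it assumes the parameter lists above are in scope:

```python
from itertools import product

from elasticnet.models.ElasticNet import ElasticNetModel

def grid_search_sketch(X_train, y_train, X_test, y_test):
    # Try every hyperparameter combination and keep the best test-set R^2
    best_r2, best_params = -float("inf"), None
    for reg, l1, lr, iters in product(regularization_strength_values,
                                      l1_ratio_values,
                                      learning_rate_values,
                                      max_iterations_values):
        model = ElasticNetModel(regularization_strength=reg, l1_ratio=l1,
                                max_iterations=iters, learning_rate=lr)
        results = model.fit(X_train, y_train)
        r2 = results.r_squared(X_test, y_test)
        if r2 > best_r2:
            best_r2 = r2
            best_params = {"regularization_strength": reg, "l1_ratio": l1,
                           "learning_rate": lr, "max_iterations": iters}
    return best_r2, best_params
```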

## 4. Are there specific inputs that your implementation has trouble with? Given more time, could you work around these or is it fundamental?

### Specific Inputs:

- **Data sets with varying layout**: We had a fundamental issue with the specific orientation and arrangement of the data, which caused errors at runtime.

- **Hyperparameters**: We faced issues in both directions: using too few tuning parameters created too little variation for the model to explore, while using more parameters made the search slow to run. For example, 3,000 rows of data, training the model for 1,000 iterations over 200+ hyperparameter combinations, took a long time to compute.

### Workarounds:

- **Data sets with varying layout**: We preprocessed the data: we first studied the data that was going to be used, then used a direct OS path and specified exactly how the file should be read and interpreted by the model, as sketched after this list.

- **Choice of Hyperparameters**: Given more time, we added more hyperparameter choices and made them user-controlled. We also ensured that all choices were considered and that the best fit for the model was displayed. Incorporating features such as polynomial feature generation and plots helped us analyse the respective outputs.
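
As a hedged illustration of that preprocessing step (the file name, column layout, and split logic here are assumptions for the sketch, not the exact test code):

```python
import os

import pandas as pd

# Explicit OS path so the file is found regardless of the working directory
data_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "small_test.csv")

# Be explicit about how the file is read and interpreted
df = pd.read_csv(data_path)
X = df.iloc[:, :-1].to_numpy(dtype=float)  # every column except the last as features (assumed layout)
y = df.iloc[:, -1].to_numpy(dtype=float)   # last column as the target (assumed layout)

# 80/20 train/test split, matching the split described above
n_train = int(0.8 * len(X))
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]
```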

### Code Visualization:
- The following screenshots display the results of each test case implemented in this project:

### 1. Small_test.csv:
- Tests the model on a small dataset, and verifies if the predictions are reasonable.
- i. Training Loss:
![Small Test Image](small_test1.jpeg)
- ii. Predicted vs Actual:
![Small Test Image](small_test2.jpeg)
- iii. Residual plots:
![Small Test Image](small_test3.jpeg)
- iv. Final Results:
![Small Test Image](small_test_output.jpeg)

### 2. data_long.csv:
- Tests the model on a large dataset, and verifies if the predictions are reasonable.
- i. Training Loss:
![Long Data Test Image](large_data1.jpeg)
- ii. Predicted vs Actual:
![Long Data Image](large_data2.jpeg)
- iii. Residual plots:
![Long Data Image](large_data3.jpeg)
- iv. Final Results:
![Long Data Image](large_data_output.jpeg)
Binary file added elasticnet/README.pdf
Binary file not shown.
145 changes: 136 additions & 9 deletions elasticnet/models/ElasticNet.py
@@ -1,17 +1,144 @@
import numpy as np


class ElasticNetModel:
    def __init__(self, regularization_strength, l1_ratio, max_iterations, tolerance=1e-6, learning_rate=0.01):
        # Set up the ElasticNet regression model.
        #
        # Parameters used in the model are:
        #   regularization_strength: regularization strength (lambda, also called alpha)
        #   l1_ratio: balance between the L1 and L2 penalties; ranges from 0 (pure Ridge) to 1 (pure Lasso)
        #   max_iterations: maximum number of iterations allowed for the gradient descent process
        #   tolerance: threshold criterion that determines when to exit the process
        #   learning_rate: step size for updating coefficients during gradient descent
        self.reg_strength = regularization_strength
        self.l1_ratio = l1_ratio
        self.max_iterations = max_iterations
        self.tolerance = tolerance
        self.learning_rate = learning_rate

    def loss_calculation(self, X, y, coefficients, intercept):
        """Compute the ElasticNet loss, i.e., the sum of the MSE (mean squared error)
        and the L1 (Lasso) and L2 (Ridge) penalties."""
        predictions = X.dot(coefficients) + intercept
        squared_error_loss = np.mean((y - predictions) ** 2)
        l1_regularization = self.l1_ratio * np.sum(np.abs(coefficients))
        l2_regularization = (1 - self.l1_ratio) * np.sum(coefficients ** 2)
        return squared_error_loss + self.reg_strength * (l1_regularization + l2_regularization)

    def fit(self, X, y):
        # Train the model on the data by applying gradient descent.
        #
        # Parameters used in this method are:
        #   X: feature matrix (n_samples, n_features)
        #   y: target vector (n_samples,)
        n_samples, n_features = X.shape

        # Normalize the features
        feature_mean = np.mean(X, axis=0)
        feature_std = np.std(X, axis=0)
        X = (X - feature_mean) / feature_std

        # Initialize coefficients and intercept
        coefficients = np.zeros(n_features)
        intercept = 0
        loss_history = []

        for iteration in range(self.max_iterations):
            predictions = X.dot(coefficients) + intercept
            residuals = predictions - y

            # Compute gradient for intercept
            intercept_gradient = np.sum(residuals) / n_samples
            intercept -= self.learning_rate * intercept_gradient

            # Compute gradient for coefficients (ElasticNet penalty)
            coef_gradient = X.T.dot(residuals) / n_samples + \
                self.reg_strength * (self.l1_ratio * np.sign(coefficients) +
                                     (1 - self.l1_ratio) * 2 * coefficients)

            # Update coefficients
            coefficients -= self.learning_rate * coef_gradient

            # Record the loss
            loss = self.loss_calculation(X, y, coefficients, intercept)
            loss_history.append(loss)

            # Stopping condition (based on gradient tolerance)
            if np.linalg.norm(coef_gradient) < self.tolerance:
                break

        # Return the fitted model and results encapsulated in ElasticNetModelResults
        return ElasticNetModelResults(coefficients, intercept, feature_mean, feature_std, loss_history)


class ElasticNetModelResults:
    def __init__(self, coefficients, intercept, feature_mean, feature_std, loss_history):
        # Wraps the outcomes of the ElasticNet model following the fitting process.
        #
        # Parameters used in the method are:
        #   coefficients: coefficients obtained from fitting the model
        #   intercept: intercept value determined during model fitting
        #   feature_mean: average value of the features (used for normalization)
        #   feature_std: standard deviation of the features (used for normalization)
        #   loss_history: record of the loss values tracked throughout the training process
        self.coefficients = coefficients
        self.intercept = intercept
        self.feature_mean = feature_mean
        self.feature_std = feature_std
        self.loss_history = loss_history

    def predict(self, X):
        # Generate predicted target values for the provided input features.
        #
        # Parameters used in the method are:
        #   X: feature matrix for which predictions will be generated
        # Returns:
        #   predictions: the predicted target values

        # Normalize the input data with the same scaling applied in fit
        X = (X - self.feature_mean) / self.feature_std
        return X.dot(self.coefficients) + self.intercept

    def r_squared(self, X, y_true):
        # Compute the R-squared value for the model using the provided data.
        #
        # Parameters used in the method are:
        #   X: feature matrix
        #   y_true: actual target values
        # Returns:
        #   r2: the calculated R-squared statistic

        # Predict the values
        predictions = self.predict(X)

        # Total sum of squares (variance of the data)
        ss_total = np.sum((y_true - np.mean(y_true)) ** 2)

        # Residual sum of squares (variance of the errors)
        ss_residual = np.sum((y_true - predictions) ** 2)

        # Compute R²
        r2 = 1 - (ss_residual / ss_total)
        return r2

    def display_output_summary(self):
        # Print a summary of the fitted model, including coefficients and intercept.
        print("Model Summary:")
        print(f"Intercept: {self.intercept}")
        print(f"Coefficients: {self.coefficients}")
        print(f"Number of iterations: {len(self.loss_history)}")
        print(f"Final loss: {self.loss_history[-1]}" if self.loss_history else "No loss recorded.")
18 changes: 18 additions & 0 deletions elasticnet/results.txt
@@ -0,0 +1,18 @@
Best R² Score: 0.9887113568818682
Best Hyperparameters: {'regularization_strength': 0.1, 'l1_ratio': 0.5, 'learning_rate': 0.01, 'max_iterations': 1000}
Total number of predicted values: 3000
Predicted values: [12.26781717 8.87515958 15.61695438 ... 14.98777201 10.7556034
5.019707 ]
Actual values: [12.30484596 8.32094416 16.03569987 ... 15.30970911 10.5781752
4.03673413]
Differences: [0.03702879 0.55421542 0.41874549 ... 0.32193711 0.17742819 0.98297287]
Model Summary:
Intercept: 12.00849875191871
Coefficients: [3.86544261 1.26678141]
Number of iterations: 1000
Final loss: 1.3163719320861063
Final R² Score: 0.9887113568818682
Residuals: [-0.03702879 0.55421542 -0.41874549 ... -0.32193711 0.17742819
0.98297287]
Mean Residual: -0.005037729545604678
Standard Deviation of Residuals: 0.48516459122309363
1 change: 1 addition & 0 deletions elasticnet/tests/__init__.py
@@ -0,0 +1 @@
