This Jupyter Notebook demonstrates how to build a linear regression model for predicting stock prices using historical stock data. The code uses various Python libraries for data preprocessing, model training, and evaluation.
TensorFlow is among the imported libraries, but the linear regression model itself is built with scikit-learn's LinearRegression class. The dataset used in this example is a file of historical stock data from the National Stock Exchange (NSE).
The following steps are covered in the notebook:
- Importing Libraries: TensorFlow, NumPy, Pandas, yfinance, Matplotlib, and scikit-learn are imported for data processing and model creation.
- Data Loading: The historical stock data file is loaded into a Pandas DataFrame.
- Data Preprocessing: The dataset is cleaned, and missing values are handled. Outliers are detected and handled using the Interquartile Range (IQR) method. Numerical features are scaled and normalized.
- Feature Selection: The relevant features for the model are selected, and the target variable is defined.
- Data Splitting: The dataset is split into training and testing sets.
- Polynomial Transformation: Polynomial features are generated to allow for nonlinear relationships in the model.
- Model Training: A linear regression model is created using scikit-learn's LinearRegression class.
- Model Evaluation: The model's performance is evaluated using metrics such as Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R2).
- Visualization: A scatter plot with a regression line is generated to visualize the actual vs. predicted values.
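The preprocessing, splitting, polynomial transformation, training, and evaluation steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the synthetic DataFrame, the column names (`Open`, `High`, `Low`, `Volume`, `Close`), the `clip_iqr` helper, the polynomial degree, and the choice to clip outliers rather than drop them are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for the stock data; column names are assumptions.
rng = np.random.default_rng(0)
n = 300
open_ = 100 + rng.normal(0, 1, n).cumsum()
df = pd.DataFrame({
    "Open": open_,
    "High": open_ + rng.uniform(0.5, 2.0, n),
    "Low": open_ - rng.uniform(0.5, 2.0, n),
    "Volume": rng.integers(1_000, 5_000, n).astype(float),
})
df["Close"] = df["Open"] + rng.normal(0, 0.5, n)
df.loc[5, "Volume"] = 1e7    # inject one obvious outlier
df.loc[10, "High"] = np.nan  # inject one missing value

# Missing values: forward-fill, then drop anything still missing.
df = df.ffill().dropna()


def clip_iqr(col: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1, q3 = col.quantile(0.25), col.quantile(0.75)
    iqr = q3 - q1
    return col.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Feature selection and target definition (assumed choice of columns).
features = ["Open", "High", "Low", "Volume"]
df[features] = df[features].apply(clip_iqr)
X, y = df[features], df["Close"]

# Keep chronological order so the test set lies after the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Scale, expand to degree-2 polynomial features, then fit ordinary least squares.
model = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE={mse:.4f}  RMSE={rmse:.4f}  MAE={mae:.4f}  R2={r2:.4f}")
```

Wrapping the scaler and the polynomial expansion in one pipeline means both are fitted on the training split only, which avoids leaking test-set statistics into the model. The notebook may order these steps differently.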
Before running the code, ensure you have the following dependencies installed:
- TensorFlow
- NumPy
- Pandas
- yfinance
- Matplotlib
- scikit-learn
You can create a virtual environment and install the required packages using the provided requirements.txt file. Use the following command to install the dependencies:
pip install -r requirements.txt
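After installing, a quick check such as the one below can confirm that the listed packages are importable. It is only a sketch: the `REQUIRED` list is assembled here from the dependency list above, using each package's import name (scikit-learn is imported as `sklearn`), and is not part of the notebook.

```python
from importlib.util import find_spec

# Import names for the dependencies listed above (assumed, not from the notebook).
REQUIRED = ["tensorflow", "numpy", "pandas", "yfinance", "matplotlib", "sklearn"]

# find_spec returns None when a top-level package cannot be located.
missing = [name for name in REQUIRED if find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```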
- Ensure that the required dependencies are installed as mentioned above.
- Copy and paste the code from this notebook into your local Jupyter Notebook or any other Python development environment.
- Run each cell step by step to execute the code sequentially.
- Observe the model evaluation metrics and the scatter plot visualizing the actual vs. predicted values.
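For orientation, an actual vs. predicted scatter plot with a fitted line might be produced along these lines. The arrays `y_true` and `y_hat` are hypothetical placeholders for the test targets and the model's predictions, and the axis labels are assumptions; the notebook's own plotting code may differ.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical values standing in for the test targets and predictions.
rng = np.random.default_rng(1)
y_true = np.linspace(100, 120, 50)
y_hat = y_true + rng.normal(0, 1.0, y_true.size)

fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(y_true, y_hat, alpha=0.6, label="Predictions")

# Least-squares line through the (actual, predicted) points.
slope, intercept = np.polyfit(y_true, y_hat, deg=1)
xs = np.array([y_true.min(), y_true.max()])
ax.plot(xs, slope * xs + intercept, color="red", label="Fitted line")

ax.set_xlabel("Actual price")
ax.set_ylabel("Predicted price")
ax.legend()
fig.tight_layout()
```

A fitted line with slope close to 1 and intercept close to 0 suggests the predictions track the actual values without systematic bias.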
The original file for this project can be found on Google Colab. You can access it using the following link:
This file was automatically generated by Colaboratory.
Please note that this model is still in its testing phase.
Feel free to modify the code, experiment with different features, or apply different machine learning models to improve the stock price prediction performance.
Happy coding! 🚀