💹 Financial NLP Sentiment Analysis

A Comparative Study of Classical ML, Deep Learning & Transformer Models

📌 Project Overview

This project implements and compares five NLP models (Naive Bayes, Logistic Regression, linear SVM, LSTM, and FinBERT) for 3-class financial sentiment classification (Positive / Neutral / Negative) on the Financial PhraseBank dataset. FinBERT scores highest, at 79.4% accuracy and 79.0% weighted F1. The project also includes an MLOps pipeline covering model serving, containerisation, CI/CD, and cloud deployment.

Model	Accuracy	Weighted F1
Naive Bayes	68.9%	62.9%
Logistic Regression	69.3%	69.9%
SVM (Linear)	69.5%	70.4%
LSTM	55.4%	55.4%
FinBERT ✅	79.4%	79.0%
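
For readers comparing the two columns, the following is a minimal pure-Python sketch of how accuracy and weighted F1 are conventionally defined. It is not the project's evaluation code; the label strings and the toy `y_true` / `y_pred` lists are made up for illustration. It shows one way the two metrics can diverge: a classifier that over-predicts the majority class keeps a reasonable accuracy while its weighted F1 drops.

```python
from collections import Counter

# Assumed label names; the project's actual label encoding may differ.
LABELS = ("negative", "neutral", "positive")


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred, labels=LABELS):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 is 0 when the class is never hit
        total += f1 * support[label]
    return total / len(y_true)


# Toy data (not from the project): 6 neutral, 3 positive, 1 negative sentence.
y_true = ["neutral"] * 6 + ["positive"] * 3 + ["negative"] * 1
# The classifier labels almost everything "neutral".
y_pred = ["neutral"] * 6 + ["neutral", "neutral", "positive"] + ["neutral"]

print(f"accuracy    = {accuracy(y_true, y_pred):.2f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.2f}")
```

On this toy data the accuracy is 0.70 and the weighted F1 is 0.63, because the minority classes contribute low per-class F1 scores.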

📁 Project Structure

Financial-NLP-Analysis/
│
├── 📓 notebooks/                      # Colab notebooks (pipeline stages)
│   ├── 01_data_exploration.ipynb
│   ├── 02_text_preprocessing_pipeline.ipynb
│   ├── 03_feature_engineering_and_split.ipynb
│   ├── 04_classical_ml_models.ipynb
│   ├── 05_deep_learning_models.ipynb
│   └── 06_finbert_sentiment_model.ipynb
│
├── 📊 data/
│   ├── raw/                           # Original datasets (from Kaggle/HuggingFace)
│   │   ├── financial_phrasebank.csv
│   │   └── Financial_Sentiment_Categorized.csv
│   └── processed/                     # Cleaned & preprocessed data
│       └── clean_phrasebank.csv
│
├── 🤖 models/                         # Saved model artefacts
│   ├── classical/                     # Sklearn .pkl model files
│   │   ├── naive_bayes_model.pkl
│   │   ├── logistic_regression_model.pkl
│   │   ├── svm_model.pkl
│   │   └── tfidf_vectorizer.pkl
│   ├── lstm/                          # Keras .h5 / SavedModel
│   │   └── lstm_sentiment_model.h5
│   └── finbert/                       # HuggingFace fine-tuned FinBERT
│       ├── config.json
│       ├── model.safetensors
│       ├── tokenizer_config.json
│       └── vocab.txt
│
├── 📈 results/
│   ├── metrics/                       # CSV files with model scores
│   │   ├── classical_model_results.csv
│   │   ├── lstm_results.csv
│   │   └── finbert_results.csv
│   ├── plots/                         # All visualisation PNGs
│   │   ├── sentiment_distribution.png
│   │   ├── sentence_length_distribution.png
│   │   ├── model_comparison.png
│   │   ├── lstm_training_curve.png
│   │   └── confusion_matrix_*.png
│   ├── predictions/                   # Model prediction CSVs
│   │   └── lstm_predictions.csv
│   └── TF-IDF_vectors/                # Serialised feature matrices
│       ├── X_train_tfidf.pkl
│       ├── X_test_tfidf.pkl
│       ├── y_train.pkl
│       └── y_test.pkl
│
├── 🚀 mlops/                          # MLOps deployment artefacts
│   ├── api/                           # FastAPI model serving
│   │   ├── main.py
│   │   └── predict.py
│   ├── docker/                        # Containerisation
│   │   ├── Dockerfile
│   │   └── docker-compose.yml
│   └── monitoring/                    # Model monitoring
│       └── monitor.py
│
├── 📄 reports/                        # Final deliverables
│   ├── nlp_case_study_report.docx
│   └── ieee_paper/
│       ├── main.tex
│       └── references.bib
│
├── .github/
│   └── workflows/
│       └── ci.yml                     # GitHub Actions CI/CD
│
├── requirements.txt
├── .gitignore
└── README.md

🚀 Quick Start

1. Clone the Repository

git clone https://github.com/YOUR_USERNAME/Financial-NLP-Analysis.git
cd Financial-NLP-Analysis

2. Install Dependencies

pip install -r requirements.txt

3. Run Notebooks in Order

Open in Google Colab or Jupyter and run sequentially:

01 → 02 → 03 → 04 → 05 → 06

4. Serve the Model (FastAPI)

cd mlops/api
uvicorn main:app --reload
# API available at http://localhost:8000
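
Once the server is running, a client call might look like the sketch below. The `/predict` route and the `{"text": ...}` payload shape are assumptions made for illustration, not confirmed by this README; check `main.py`, or the interactive docs that FastAPI serves at http://localhost:8000/docs by default, for the real routes and schema.

```python
import json
import urllib.request


def build_predict_request(text, base_url="http://localhost:8000"):
    """Build a JSON POST request for a hypothetical /predict endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")  # assumed request schema
    return urllib.request.Request(
        url=f"{base_url}/predict",  # assumed route name
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, base_url="http://localhost:8000", timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server from step 4 running:
#   predict("Quarterly revenue rose compared with the previous year.")
```

Only the standard library is used here, so the client needs nothing beyond a Python install.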

5. Run via Docker

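# Run from the repository root; the compose file lives in mlops/docker/
docker-compose -f mlops/docker/docker-compose.yml up --build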

📦 Datasets

Dataset	Source	Size	Use
Financial PhraseBank	Kaggle	5,842 sentences	Training
Financial Sentiment Categorized	Kaggle	1,169 sentences	Testing

🧪 Model Artefacts

All trained models are saved in the models/ directory:

Classical models (.pkl) — serialised with joblib
LSTM (.h5) — saved with model.save()
FinBERT — saved with HuggingFace trainer.save_model()
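
The artefacts listed above could be loaded back roughly as follows. This is a sketch under assumptions: the file paths come from the Project Structure tree, the helper names (`missing_artefacts`, `load_classical`, `load_lstm`, `load_finbert`) are hypothetical, and the heavy libraries are imported lazily so the file check runs even when they are not installed.

```python
from pathlib import Path

# Relative paths taken from the Project Structure tree.
EXPECTED_ARTEFACTS = (
    "classical/naive_bayes_model.pkl",
    "classical/logistic_regression_model.pkl",
    "classical/svm_model.pkl",
    "classical/tfidf_vectorizer.pkl",
    "lstm/lstm_sentiment_model.h5",
    "finbert/config.json",
    "finbert/model.safetensors",
    "finbert/tokenizer_config.json",
    "finbert/vocab.txt",
)


def missing_artefacts(models_dir="models"):
    """Return the expected artefact paths that are not present on disk."""
    root = Path(models_dir)
    return [rel for rel in EXPECTED_ARTEFACTS if not (root / rel).is_file()]


def load_classical(model_file="svm_model.pkl", models_dir="models"):
    """Load the TF-IDF vectorizer and one scikit-learn classifier with joblib."""
    import joblib  # lazy import

    root = Path(models_dir) / "classical"
    return joblib.load(root / "tfidf_vectorizer.pkl"), joblib.load(root / model_file)


def load_lstm(models_dir="models"):
    """Load the Keras LSTM saved with model.save()."""
    from tensorflow.keras.models import load_model  # lazy import

    return load_model(str(Path(models_dir) / "lstm" / "lstm_sentiment_model.h5"))


def load_finbert(models_dir="models"):
    """Load the fine-tuned FinBERT tokenizer and model from the local directory."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    path = str(Path(models_dir) / "finbert")
    return (
        AutoTokenizer.from_pretrained(path),
        AutoModelForSequenceClassification.from_pretrained(path),
    )
```

Calling `missing_artefacts()` first gives a quick check that the notebooks have been run before any model is loaded. The LSTM also needs the same text-to-sequence preprocessing used in training, which is not among the listed artefacts.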

🏗️ MLOps Pipeline

The MLOps pipeline runs through six stages, with monitoring feeding back into data ingestion:

Data Ingestion → Preprocessing → Training → Evaluation → Serving → Monitoring
      ↑                                                                   |
      └──────────────────── Feedback Loop ───────────────────────────────┘

Stage	Tool
Experiment Tracking	MLflow
Model Serving	FastAPI + Uvicorn
Containerisation	Docker + Docker Compose
CI/CD	GitHub Actions
Cloud Deployment	Google Cloud Run / HuggingFace Spaces
Monitoring	Custom drift detection
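
The table lists the monitoring stage only as "custom drift detection". As an illustration of what such a check could look like, and not a description of what `monitor.py` actually does, the sketch below compares the distribution of predicted labels in a recent window against a reference window using total variation distance. The function names, label strings, and the 0.15 threshold are all assumptions.

```python
from collections import Counter

# Assumed label names; the project's actual label encoding may differ.
LABELS = ("negative", "neutral", "positive")


def label_distribution(predictions, labels=LABELS):
    """Share of each label among a non-empty list of predictions."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    counts = Counter(predictions)
    return {label: counts[label] / len(predictions) for label in labels}


def total_variation_distance(p, q):
    """Half the L1 distance between two distributions over the same labels."""
    return 0.5 * sum(abs(p[label] - q[label]) for label in p)


def drift_detected(reference_preds, recent_preds, threshold=0.15):
    """Flag drift when the predicted-label mix shifts by more than `threshold`."""
    reference = label_distribution(reference_preds)
    recent = label_distribution(recent_preds)
    return total_variation_distance(reference, recent) > threshold


# Toy windows (made up): the recent window contains far more negative predictions.
reference = ["neutral"] * 60 + ["positive"] * 28 + ["negative"] * 12
recent = ["neutral"] * 40 + ["positive"] * 25 + ["negative"] * 35

print(drift_detected(reference, recent))
print(drift_detected(reference, reference))
```

A shift in predicted labels does not prove the model has degraded, so a flag like this would normally trigger a manual review or a re-evaluation on freshly labelled data, which is the feedback loop shown in the diagram.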

📝 Reports & Paper

📄 IEEE Paper (LaTeX): reports/ieee_paper/main.tex
📋 Case Study Report (.docx): reports/nlp_case_study_report.docx

👤 Author

Kupakwashe T. Mapuranga
Department of Computer Science & AI
📧 kupakwashemapuranga@gmail.com

📜 License

This project is licensed under the MIT License — see LICENSE for details.
