This email spam detection system uses Natural Language Processing and machine learning in Python to classify emails. Given a dataset containing many emails, it extracts the important words from each one and then uses a Naive Bayes classifier to detect whether an email is spam or not.
The following packages and modules are required for the project :
- Python 3.x
- Pandas
- Numpy
- Scikit-learn
- NLTK
Install all required packages :
pip install -r requirements.txt
The dataset contains about 5728 records, each of which is a sample email, and a target column "spam" which indicates whether the email is spam or not.
In this part we walk through the project code, which is divided into sections as follows:
- Section 1 | The Data :
In this section we perform some operations on the dataset before training the model on it, such as :
- Data Loading : Load the dataset
- Data Visualization : Visualize dataset features
- Data Cleaning : Remove stopwords and duplicate values
- Data Splitting : Split the dataset into training and testing sets
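The data steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the column name "text", the tiny in-memory sample, the `remove_stopwords` helper and the 80/20 split are all assumptions made for the example. It also uses scikit-learn's built-in English stop word list in place of NLTK, so that nothing has to be downloaded.

```python
import pandas as pd
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS
from sklearn.model_selection import train_test_split

# Data Loading: a tiny in-memory stand-in for the real dataset.
# In the project this would be something like pd.read_csv(<path to the CSV file>).
# The column name "text" is an assumption; "spam" is the target column.
df = pd.DataFrame({
    "text": [
        "Subject: win a free prize now",
        "Subject: meeting agenda for monday",
        "Subject: win a free prize now",  # exact duplicate of the first row
        "Subject: cheap loans available today",
        "Subject: lunch with the team tomorrow",
        "Subject: please review the attached report",
    ],
    "spam": [1, 0, 1, 1, 0, 0],
})

# Data Visualization (minimal): look at the class balance of the target column.
print(df["spam"].value_counts())

# Data Cleaning: drop duplicate rows, then remove stopwords from each email.
df = df.drop_duplicates().reset_index(drop=True)

def remove_stopwords(text):
    """Lowercase the text and drop common English stop words (hypothetical helper)."""
    words = text.lower().split()
    return " ".join(w for w in words if w not in ENGLISH_STOP_WORDS)

df["clean_text"] = df["text"].apply(remove_stopwords)

# Data Splitting: hold out 20% of the rows for testing (the ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    df["clean_text"], df["spam"], test_size=0.2, random_state=42
)
print(len(X_train), "training rows,", len(X_test), "testing rows")
```

Fixing `random_state` keeps the split reproducible between runs, which makes later evaluation results easier to compare.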
- Section 2 | The Model :
With the dataset ready for training, we create a Naive Bayes classifier using scikit-learn and then fit it to the data. Finally, we evaluate the model using accuracy, a classification report and a confusion matrix.
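A minimal sketch of this training and evaluation step could look like the code below. It is an illustration under assumptions, not the notebook's actual code: `CountVectorizer` for turning text into word counts and `MultinomialNB` as the Naive Bayes variant are choices made for the example, and the handful of sample emails is made up.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.naive_bayes import MultinomialNB

# Made-up sample emails standing in for the real training and testing sets.
train_texts = [
    "win free prize now",
    "cheap loans available today",
    "claim your free money",
    "meeting agenda monday",
    "lunch team tomorrow",
    "please review attached report",
]
train_labels = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = not spam
test_texts = ["free prize money", "team meeting report"]
test_labels = [1, 0]

# Convert each email into a vector of word counts.
# The vocabulary is learned from the training set only.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Create the Naive Bayes classifier and fit it to the training data.
model = MultinomialNB()
model.fit(X_train, train_labels)

# Evaluate the model on the held-out emails.
predictions = model.predict(X_test)
accuracy = accuracy_score(test_labels, predictions)
report = classification_report(test_labels, predictions, zero_division=0)
matrix = confusion_matrix(test_labels, predictions, labels=[0, 1])

print("Accuracy:", accuracy)
print(report)
print(matrix)
```

In the confusion matrix, rows are the true classes and columns are the predicted classes, so the off-diagonal cells count legitimate emails flagged as spam and spam emails that were missed.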
- Clone the repo
git clone https://github.com/omaarelsherif/Email-Spam-Detection-Using-Machine-Learning.git
- Open 'main.ipynb' in Google Colab or VS Code and enjoy
These links may help you better understand the project idea and the techniques used :
- Spam detection in machine learning : https://bit.ly/3nwiKtA
- Naive-bayes algorithm : https://bit.ly/3zc9SLH
- Model evaluation : https://bit.ly/3B12VOO
- E-mail : omaarelsherif@gmail.com
- LinkedIn : https://www.linkedin.com/in/omaarelsherif/
- Facebook : https://www.facebook.com/omaarelshereif
