monish4030/phishing-detection-tool

🛡️ Phishing Detection Tool

A comprehensive, CLI-based URL phishing analyzer built for cybersecurity education.

Made by Monish Paramasivam


📌 Overview

The Phishing Detection Tool is a Python command-line application that analyzes URLs for phishing indicators using several detection techniques: URL structure analysis, SSL validation, WHOIS domain intelligence, blacklist matching, and a machine learning classifier.

⚠️ Disclaimer: This tool is built strictly for educational and ethical use only. It is a cybersecurity portfolio project intended to demonstrate phishing detection concepts. Do NOT use it for any unauthorized or illegal activities.


🚀 Features

Core Detection

| Feature | Description |
| --- | --- |
| 🔍 URL Structure Analysis | Detects typosquatting, homograph attacks, keyword stuffing, suspicious TLDs, IP addresses, @-symbol abuse, and more |
| 🔒 SSL / HTTPS Validation | Checks for HTTPS, validates the certificate, and detects self-signed or expired certificates |
| 🌐 WHOIS Domain Intelligence | Retrieves domain age, registrar, and registrant info; newly registered domains raise red flags |
| 📋 Blacklist / Whitelist | Custom lists for known phishing domains and trusted sites, persisted as JSON |
| 🤖 ML Classifier | Random Forest model trained on 15 URL features to predict phishing probability |
| 📊 Risk Scoring | Aggregated 0–100 risk score with Low / Medium / High classification |
| 💾 Report Saving | Saves full analysis reports to timestamped .txt files |
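
To illustrate one of the URL structure checks in the table above, here is a minimal sketch of look-alike brand detection. The brand list, the character substitution map, and the function name `impersonated_brand` are assumptions made for this example; they are not taken from core/analyzer.py.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical brand list: brand keyword -> official domain.
BRANDS = {
    "paypal": "paypal.com",
    "microsoft": "microsoft.com",
    "google": "google.com",
}

# Hypothetical map of digits often used in place of letters.
LOOKALIKES = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})


def impersonated_brand(url: str) -> Optional[str]:
    """Return the brand a hostname appears to imitate, or None."""
    host = (urlparse(url).hostname or "").lower()
    normalized = host.translate(LOOKALIKES)
    for brand, official in BRANDS.items():
        # The official domain and its subdomains are not impersonation.
        is_official = host == official or host.endswith("." + official)
        if brand in normalized and not is_official:
            return brand
    return None


print(impersonated_brand("http://paypa1-secure-login.com/verify"))
print(impersonated_brand("https://microsoft.com.login.evil.ru"))
print(impersonated_brand("https://google.com"))
```

A check like this catches both digit substitution (paypa1) and a brand name placed in a subdomain of an unrelated registered domain. It will also produce false positives for legitimate domains that merely contain a brand keyword, so it is best treated as one signal among several.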

Advanced Features

  • Batch URL analysis: analyze multiple URLs in one session
  • Probability bar: visual indicator of phishing likelihood
  • Feature breakdown: see which ML features contributed most
  • Color-coded terminal UI: intuitive, severity-highlighted output
  • Persistent custom lists: blacklist/whitelist saved between sessions
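
As a rough sketch of how a custom list could be persisted as JSON between sessions, consider the following. The `DomainList` class and `normalize_domain` helper are hypothetical names used for illustration; the actual logic in core/blacklist.py may differ.

```python
import json
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def normalize_domain(url_or_domain: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.'."""
    value = url_or_domain.strip().lower()
    if "://" not in value:
        value = "//" + value  # lets urlparse treat a bare domain as a host
    host = urlparse(value).hostname or ""
    return host[4:] if host.startswith("www.") else host


class DomainList:
    """A set of domains stored as a JSON array on disk (illustrative only)."""

    def __init__(self, path):
        self.path = Path(path)
        self.domains = set()
        if self.path.exists():
            self.domains = set(json.loads(self.path.read_text(encoding="utf-8")))

    def add(self, url_or_domain: str) -> None:
        self.domains.add(normalize_domain(url_or_domain))
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(sorted(self.domains), indent=2), encoding="utf-8")

    def __contains__(self, url_or_domain: str) -> bool:
        return normalize_domain(url_or_domain) in self.domains


# Demonstration in a temporary directory: add an entry, then reload from disk.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data" / "blacklist.json"
    DomainList(path).add("http://www.Evil-Example.test/login")
    reloaded = DomainList(path)
    found = "evil-example.test" in reloaded
    print(found)
```

Normalizing before both storing and looking up means that a full URL and a bare domain match the same entry.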

📁 Project Structure

phishing-detection-tool/
│
├── main.py                    # CLI entry point, menu system
│
├── core/                      # Core analysis modules
│   ├── __init__.py
│   ├── analyzer.py            # URL structure analysis (12+ checks)
│   ├── ssl_checker.py         # SSL/HTTPS certificate validation
│   ├── whois_lookup.py        # WHOIS / domain age lookup
│   └── blacklist.py           # Blacklist & whitelist management
│
├── ml/                        # Machine learning module
│   ├── __init__.py
│   ├── classifier.py          # Random Forest phishing classifier
│   ├── phishing_model.pkl     # Saved trained model (auto-generated)
│   └── scaler.pkl             # Feature scaler (auto-generated)
│
├── utils/                     # Utility modules
│   ├── __init__.py
│   ├── display.py             # Colored terminal report renderer
│   └── report.py              # Report file generator
│
├── data/                      # Persistent data
│   ├── blacklist.json         # Custom blacklist (auto-created)
│   └── whitelist.json         # Custom whitelist (auto-created)
│
├── tests/                     # Unit tests
│   ├── __init__.py
│   └── test_analyzer.py       # Test suite (pytest-compatible)
│
├── reports/                   # Saved analysis reports (auto-created)
│
├── sample_urls.txt            # Test URLs for demonstration
├── requirements.txt           # Python dependencies
└── README.md                  # This file

⚙️ Installation

Prerequisites

  • Python 3.8 or higher
  • pip (Python package manager)

Steps

# 1. Clone the repository
git clone https://github.com/monish4030/phishing-detection-tool.git
cd phishing-detection-tool

# 2. (Recommended) Create a virtual environment
python -m venv venv
source venv/bin/activate        # Linux/macOS
venv\Scripts\activate           # Windows

# 3. Install dependencies
pip install -r requirements.txt

# 4. Run the tool
python main.py

🖥️ Usage

Starting the Tool

python main.py

Main Menu Options

┌─────────────────────────────────────────┐
│           MAIN MENU                     │
├─────────────────────────────────────────┤
│  [1] Analyze a Single URL               │
│  [2] Analyze Multiple URLs (Batch)      │
│  [3] Manage Blacklist / Whitelist       │
│  [4] View Sample Test URLs              │
│  [5] About This Tool                    │
│  [0] Exit                               │
└─────────────────────────────────────────┘

Single URL Analysis Example

[?] Enter URL to analyze: http://paypa1-secure-login.com/verify

[1/5] Analyzing URL structure...
[2/5] Checking SSL/HTTPS...
[3/5] Fetching domain information...
[4/5] Checking blacklist/whitelist...
[5/5] Running ML classifier...

═══════════════════════════════════════════════════════════════════
  PHISHING ANALYSIS REPORT
═══════════════════════════════════════════════════════════════════

[TARGET URL]
  http://paypa1-secure-login.com/verify

████████████████████████████████████████████████████████████████
█                                                              █
█    🚨  RISK LEVEL: HIGH  |  SCORE: 87/100                   █
█                                                              █
████████████████████████████████████████████████████████████████

[SCORE BREAKDOWN]
  • URL structure issues: +35 pts
  • No HTTPS: +15 pts
  • ML classifier risk: +22 pts
  • Domain is only 12 days old: +10 pts
  ...
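
The domain-age line in the sample report suggests a check along the following lines. This is a minimal sketch only: the 30-day threshold and the function names are assumptions, and the creation date would in practice come from a WHOIS lookup rather than being hard-coded.

```python
from datetime import datetime, timezone
from typing import Optional

NEW_DOMAIN_DAYS = 30  # hypothetical threshold, not taken from the tool


def domain_age_days(created: datetime, now: Optional[datetime] = None) -> int:
    """Return the whole number of days since the domain was registered."""
    now = now or datetime.now(timezone.utc)
    if created.tzinfo is None:
        # Assume naive WHOIS timestamps are in UTC.
        created = created.replace(tzinfo=timezone.utc)
    return (now - created).days


def is_newly_registered(created: datetime, now: Optional[datetime] = None) -> bool:
    return domain_age_days(created, now) < NEW_DOMAIN_DAYS


created = datetime(2024, 1, 1)
checked = datetime(2024, 1, 13, tzinfo=timezone.utc)
print(domain_age_days(created, checked), is_newly_registered(created, checked))
```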

Run Unit Tests

# Using pytest
pytest tests/ -v

# Or directly
python tests/test_analyzer.py

🔬 How It Works

Detection Pipeline

URL Input
    │
    ├──▶ [1] URL Analyzer      - 12+ structural checks
    │         • Length, IP, subdomains, keywords, homographs,
    │           special chars, TLD, brand impersonation...
    │
    ├──▶ [2] SSL Checker       - HTTPS & certificate validation
    │         • Protocol check, cert validity, expiry, issuer,
    │           self-signed detection...
    │
    ├──▶ [3] WHOIS Lookup      - Domain intelligence
    │         • Registration date, domain age, registrar,
    │           privacy protection...
    │
    ├──▶ [4] Blacklist Check   - List-based matching
    │         • Custom blacklist, custom whitelist,
    │           domain normalization...
    │
    ├──▶ [5] ML Classifier     - Random Forest prediction
    │         • 15 engineered features, probability score,
    │           confidence level...
    │
    └──▶ Final Aggregation → Risk Score (0-100) → LOW / MEDIUM / HIGH
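
For the SSL Checker stage, the expiry and self-signed checks might look roughly like the sketch below. It works on a dict shaped like the return value of `ssl.SSLSocket.getpeercert()` and uses the standard-library `ssl.cert_time_to_seconds`. The function name `inspect_cert` and the sample certificate are made up for illustration, and fetching the certificate over the network is left out.

```python
import ssl
import time
from typing import Optional


def inspect_cert(cert: dict, now: Optional[float] = None) -> dict:
    """Summarize a certificate dict shaped like getpeercert() output."""
    now = time.time() if now is None else now
    expires_at = ssl.cert_time_to_seconds(cert["notAfter"])
    return {
        "expired": expires_at < now,
        "days_left": int((expires_at - now) // 86400),
        # Identical subject and issuer is a common sign of a self-signed cert.
        "self_signed": cert.get("subject") == cert.get("issuer"),
    }


# Made-up certificate data for demonstration.
sample = {
    "subject": ((("commonName", "example.test"),),),
    "issuer": ((("commonName", "example.test"),),),
    "notAfter": "Jan  1 00:00:00 2030 GMT",
}
fixed_now = ssl.cert_time_to_seconds("Dec 22 00:00:00 2029 GMT")
report = inspect_cert(sample, now=fixed_now)
print(report)
```

Passing `now` explicitly keeps the function deterministic, which makes it straightforward to cover in tests/ without a network connection.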

Risk Score Calculation

| Score Range | Risk Level | Action |
| --- | --- | --- |
| 0 – 30 | ✅ LOW | Generally safe |
| 31 – 65 | ⚠️ MEDIUM | Exercise caution |
| 66 – 100 | 🚨 HIGH | Likely phishing |
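
The thresholds above can be expressed as a small function. The sketch below also sums per-check contributions and clamps the total to the 0-100 range. The clamping and both function names are assumptions for illustration; the tool's actual weighting is not documented here.

```python
from typing import Dict


def aggregate_score(contributions: Dict[str, int]) -> int:
    """Sum per-check points and clamp the total to the 0-100 range."""
    return max(0, min(100, sum(contributions.values())))


def risk_level(score: int) -> str:
    """Map a 0-100 score to the risk levels in the table above."""
    if score <= 30:
        return "LOW"
    if score <= 65:
        return "MEDIUM"
    return "HIGH"


# The four contributions listed in the sample report (further items are elided there).
listed = {
    "URL structure issues": 35,
    "No HTTPS": 15,
    "ML classifier risk": 22,
    "Domain is only 12 days old": 10,
}
score = aggregate_score(listed)
print(score, risk_level(score))
```

The four listed items sum to 82, which already falls in the HIGH band. The sample report shows 87 because it includes further contributions that are not listed.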

ML Features Used

The classifier uses 15 engineered features:

  1. URL total length
  2. IP address in URL
  3. Subdomain depth count
  4. Suspicious keyword count
  5. HTTPS presence
  6. @ symbol presence
  7. Hyphen count in domain
  8. Dot count in domain
  9. URL shortener presence
  10. Suspicious TLD usage
  11. Digit count in domain
  12. URL path depth
  13. Brand name spoofing indicator
  14. Special character count
  15. Domain Shannon entropy
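
To make a few of these features concrete, the following sketch computes a subset of them, including the Shannon entropy of the hostname. The feature names and the `extract_features` function are illustrative assumptions and may not match ml/classifier.py.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def extract_features(url: str) -> dict:
    """Compute a subset of the 15 features listed above (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "has_ip": int(bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))),
        "uses_https": int(parsed.scheme == "https"),
        "has_at_symbol": int("@" in url),
        "hyphen_count": host.count("-"),
        "dot_count": host.count("."),
        "digit_count": sum(ch.isdigit() for ch in host),
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
        "domain_entropy": shannon_entropy(host),
    }


features = extract_features("http://paypa1-secure-login.com/verify")
print(features)
```

Random or machine-generated hostnames tend to have higher entropy than dictionary-like ones, which is presumably why entropy is included among the features.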

📦 Dependencies

| Package | Purpose |
| --- | --- |
| requests | HTTP connection for SSL checks |
| python-whois | WHOIS domain data retrieval |
| scikit-learn | Random Forest ML classifier |
| numpy | Numerical feature processing |
| joblib | Model serialization |
| colorama | Cross-platform colored CLI output |
| pytest | Unit test framework |

🧪 Sample Test URLs

| URL | Expected Result |
| --- | --- |
| https://google.com | ✅ LOW risk |
| https://github.com | ✅ LOW risk |
| http://paypa1-secure-login.com | 🚨 HIGH risk |
| http://192.168.1.1/bank/login | 🚨 HIGH risk |
| https://microsoft.com.login.evil.ru | 🚨 HIGH risk |
| http://bit.ly/free-offer | ⚠️ MEDIUM risk |

See sample_urls.txt for the full test list.


🔐 Ethical Use Statement

This tool was developed for:

  • Learning phishing detection techniques
  • Cybersecurity portfolio demonstration
  • Understanding URL analysis and threat intelligence
  • Educational research into social engineering defenses

This tool must NOT be used to:

  • Target real individuals or organizations
  • Conduct unauthorized security testing
  • Facilitate any form of cybercrime

🀝 Contributing

Contributions are welcome for educational improvements:

  1. Fork the repo
  2. Create a feature branch (git checkout -b feature/add-virustotal-api)
  3. Commit changes (git commit -m 'Add VirusTotal API integration')
  4. Push to branch (git push origin feature/add-virustotal-api)
  5. Open a Pull Request

Ideas for Extension

  • VirusTotal API integration
  • Google Safe Browsing API check
  • Web scraping for page content analysis
  • Browser extension version
  • REST API wrapper (Flask/FastAPI)
  • GUI interface (Tkinter or PyQt)
  • Train on larger datasets (PhishTank, UCI)

📄 License

License


👨‍💻 Author

Monish Paramasivam

  • Cybersecurity Enthusiast & Python Developer
  • Portfolio Project: Phishing Detection Tool v1.0

"Security is a process, not a product." - Bruce Schneier

⭐ Star this repo if it helped your learning journey!
