An automated dynamic analysis approach to extracting exfiltration URLs from phishing sites and documents.
Analyze a URL:
$ curl "http://localhost:1234/submit" -X POST -d "url=https://pub-20cffe933ea147e7911147f1c88f341b.r2.dev/index.html"
{
"exfiltration": [
{
"body": "",
"credential_types": [
"username"
],
"method": "GET",
"url": "https://dashboard.example.com/web/site/go-back?usr=USERNAME&token=9704A-4FC48-AE885-98DCB-DCDF5-7F3FD-EF-16-81851-875"
}
],
"html": "...",
"solver_html": "..."
}
Alternatively, you can analyze an HTML file:
$ curl "http://localhost:1234/submit" -X POST -F "[email protected]"
- selenium_phishing_detector and their paper A New Heuristic Based Phishing Detection Approach Utilizing Selenium Web-driver for inspiration
- FlareSolverr for getting past hCaptcha
- selenium-wire for intercepting and manipulating requests
- mitmproxy which selenium-wire is built on top of
- undetected_chromedriver to avoid any chromedriver-specific anti-bot protection