SpiderLock

SpiderLock is a Python web crawler designed for cybersecurity enthusiasts, pentesters, and web analysts. It supports both breadth-first (BFS) and depth-first (DFS) crawling strategies and can visualize website structures as a 2D graph, showing pages and their connections.


Features

  • BFS & DFS crawling – Choose the strategy based on your analysis needs (see the sketch after this list).
  • Robots.txt aware – Automatically respects crawling rules.
  • Sitemap generation – Builds a clear map of website pages and their links.
  • Export to JSON – Save crawl graph and results in JSON format.
  • External links handling – Show and analyze external links.
  • SEO audit – Optional SEO analysis of pages.
  • Customizable depth & delay – Control crawl depth and request delay.
  • Modular & Pythonic – Easy to extend or integrate into pentesting workflows.
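
To illustrate how the BFS/DFS strategies, robots.txt handling, and the configurable depth and delay fit together, here is a minimal sketch of a crawl loop in the same spirit. This is not SpiderLock's actual implementation: the function shape and the use of requests and BeautifulSoup are assumptions for illustration.

from collections import deque
import time
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

import requests
from bs4 import BeautifulSoup

def crawl(start_url, strategy="bfs", max_depth=2, delay=0.5):
    # Respect robots.txt: fetch and parse the site's rules once up front
    robots = RobotFileParser(urljoin(start_url, "/robots.txt"))
    robots.read()

    frontier = deque([(start_url, 0)])
    visited, graph = set(), {}
    while frontier:
        # BFS pops the oldest entry (queue); DFS pops the newest (stack)
        url, depth = frontier.popleft() if strategy == "bfs" else frontier.pop()
        if url in visited or depth > max_depth or not robots.can_fetch("*", url):
            continue
        visited.add(url)
        time.sleep(delay)  # configurable request delay
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        links = {urljoin(url, a["href"]) for a in soup.find_all("a", href=True)}
        graph[url] = sorted(links)
        frontier.extend((link, depth + 1) for link in links)
    return graph

The only difference between the two strategies is which end of the frontier is popped: FIFO order gives BFS, LIFO order gives DFS.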

Installation

  1. Clone the repository:
git clone https://github.com/sherlock2215/SpiderLock.git
cd SpiderLock
  2. Install dependencies in a virtual environment:
python -m venv venv
source venv/bin/activate   # Linux/macOS
venv\Scripts\activate      # Windows
pip install -r requirements.txt

Usage

Run the crawler from the command line after activating the virtual environment:

python crawler.py -w https://example.com [options]

Available Options


Flag              Description
-b, --bfs         Use BFS crawling strategy
-d, --dfs         Use DFS crawling strategy
-w, --web URL     Required. Starting webpage URL
-s, --summary     Show crawl summary
--seo             Run SEO audit
-e, --ext         Show external links
-t, --top N       Show top N pages by links (default 10)
-j, --json FILE   Save crawl graph to JSON file
-de, --depth N    Set max crawl depth (default 2)
-q, --quick       Quick crawl (shallow depth, e.g., depth 1)

Example Commands

  • Basic BFS crawl with summary:
python crawler.py -w https://example.com -b -s
  • DFS crawl, export to JSON, and show external links:
python crawler.py -w https://example.com -d -j output.json -e
  • Quick crawl with top 5 pages:
python crawler.py -w https://example.com -b -q -t 5
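
The -j/--json export can then be post-processed with ordinary tooling. The exact schema is not documented here, so the snippet below assumes the file maps each page URL to a list of linked URLs; adjust it to the real structure as needed.

import json

with open("output.json") as f:
    graph = json.load(f)  # assumed shape: {page_url: [linked_url, ...]}

# Rank pages by outbound link count, similar to the -t/--top flag
for url, links in sorted(graph.items(), key=lambda kv: len(kv[1]), reverse=True)[:10]:
    print(f"{len(links):4d}  {url}")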

Example Output

Starting crawl on https://example.com
Strategy: BFS | Max Depth: 1

Visited Pages:
[0] https://example.com
[1] https://www.iana.org/domains/example

Crawl complete. Total pages visited: 2

The 2D graph visualization shows pages as nodes and links as edges.
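
As a rough illustration of that node-and-edge view, the sketch below renders a crawl graph with networkx and matplotlib. SpiderLock's built-in renderer may work differently; this also reuses the assumed adjacency-dict JSON shape from above.

import json

import networkx as nx
import matplotlib.pyplot as plt

with open("output.json") as f:
    graph = json.load(f)  # assumed shape: {page_url: [linked_url, ...]}

# Pages become nodes, links become directed edges
G = nx.DiGraph((src, dst) for src, links in graph.items() for dst in links)
pos = nx.spring_layout(G)  # force-directed 2D layout
nx.draw(G, pos, node_size=50, arrowsize=8, with_labels=False)
plt.show()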


Future Features

  • Multi-threaded crawling for speed (a possible approach is sketched below)
  • Browser simulation (JavaScript support)
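
One way the multi-threaded crawling could be layered onto the BFS strategy is to fetch each depth level concurrently with a thread pool. This is only a sketch of a possible design, not the planned implementation.

from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

def fetch(url):
    return url, requests.get(url, timeout=10).text

def crawl_level(urls, workers=8):
    # Fetch one whole depth level concurrently; results arrive as they complete
    pages = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fetch, u) for u in urls]
        for future in as_completed(futures):
            url, html = future.result()
            pages[url] = html
    return pages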

Contributing

Contributions are welcome! Please open an issue or pull request for bug fixes, features, or improvements.
