Substack Downloader

A tool to archive the Substack newsletters you are currently subscribed to, so you can keep a permanent offline copy of the content you have paid for.

Important

This is NOT a piracy tool.

It can only download content you already have access to.
It does not bypass paywalls for newsletters you are not subscribed to.
Its primary use case is archiving your library before you unsubscribe.

Privacy & Security

100% Local: Your cookies, session data, and downloaded articles are stored only on your computer. Nothing is sent to any third-party server; the tool communicates only with Substack.
Safe: Your credentials are used strictly to authenticate with Substack for downloading your own content.

Features

Personal Archive: Download all posts from a newsletter to your local machine.
Paid Content Support: Authenticates using your existing subscription to archive subscriber-only posts.
Custom Domain Support: Includes a login helper to bypass bot protection on custom domains (e.g., lennysnewsletter.com).
Offline Assets: Downloads images locally so you can view posts without an internet connection.
Markdown Support: Converts posts to Markdown (.md) with local image links, perfect for Obsidian or Notion.
Podcast Skipping: Option to skip podcast/audio episodes (--skip-podcasts).
HTML Export: Saves clean, readable HTML files.
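
As an illustration of the "local image links" idea, here is a minimal sketch of rewriting remote image URLs in Markdown so they point at an assets/ folder. The function name `localize_image_links` and the naming scheme (last path segment of the URL) are assumptions for this example, not necessarily what scraper.py does.

```python
import posixpath
import re
from urllib.parse import unquote, urlparse

# Matches Markdown images such as ![alt](https://host/path/pic.jpg)
IMAGE_LINK = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def localize_image_links(markdown: str, assets_dir: str = "assets"):
    """Point remote image links at local files (hypothetical helper).

    Returns the rewritten Markdown and a {remote_url: local_path} mapping
    that a downloader could use to fetch each image.
    """
    downloads = {}

    def replace(match):
        alt, url = match.group(1), match.group(2)
        if url not in downloads:
            # Assumed naming scheme: the last segment of the decoded URL path.
            name = posixpath.basename(unquote(urlparse(url).path)) or "image"
            downloads[url] = f"{assets_dir}/{name}"
        return f"![{alt}]({downloads[url]})"

    return IMAGE_LINK.sub(replace, markdown), downloads
```

Two different URLs that end in the same file name would map to the same local path in this sketch; a real implementation would need to disambiguate them.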

Installation

Clone the repository:

```
git clone https://github.com/yourusername/substack-scraper.git
cd substack-scraper
```

Install dependencies:
```
pip install -r requirements.txt
```
Install the Playwright browser (required for the login helper):
```
playwright install chromium
```

Authentication

Substack applies "bot protection" on some domains, which can block automated logins. This tool provides a Login Helper (login.py) that opens a real browser window for you to sign in, then saves the session for the scraper.

Method A: Standard Substacks (e.g., `name.substack.com`)

For most newsletters, you only need to log in once.

Run in your terminal:
```
python login.py
```
A Chromium browser window will open. Log in to substack.com.
Go back to the terminal and press Enter to save your session.
This creates substack_session.json, which works for all standard Substack newsletters.
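
For readers curious how a saved session could be reused, the sketch below reads cookies from the session file. It assumes the file is a Playwright storage-state JSON (a top-level "cookies" list whose entries have "name", "value", and "domain" fields); that format, and the helper name `load_cookies`, are assumptions rather than confirmed details of this tool.

```python
import json
from pathlib import Path


def load_cookies(session_file: str, host: str) -> dict:
    """Return {name: value} for cookies in session_file that apply to host.

    Hypothetical helper; assumes a Playwright storage-state JSON layout.
    """
    state = json.loads(Path(session_file).read_text(encoding="utf-8"))
    cookies = {}
    for cookie in state.get("cookies", []):
        # Cookie domains may carry a leading dot, e.g. ".substack.com".
        domain = cookie["domain"].lstrip(".")
        if host == domain or host.endswith("." + domain):
            cookies[cookie["name"]] = cookie["value"]
    return cookies
```

The resulting dictionary could then be passed to an HTTP client when requesting subscriber-only posts.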

Method B: Custom Domains (e.g., `robkhenderson.com`, `lennysnewsletter.com`)

Newsletters with their own domains are isolated "islands" and require their own login.

Run the helper with the URL (all on one line):

```
python login.py https://www.lennysnewsletter.com
```

A Chromium browser window will open. Log in to that specific site.
Go back to the terminal and press Enter to save your session.
This saves a domain-specific session (e.g., substack_session_www.lennysnewsletter.com.json) which the scraper will automatically detect and use.
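
The file naming described in Methods A and B can be sketched as a small lookup. The function name `session_file_for` is hypothetical, and the real detection logic in scraper.py may differ (for example, it might fall back to the shared session when no domain-specific file exists).

```python
from urllib.parse import urlparse


def session_file_for(url: str) -> str:
    """Pick the session file name for a newsletter URL (hypothetical helper)."""
    host = (urlparse(url).hostname or "").lower()
    # Standard Substacks share one session (Method A).
    if host == "substack.com" or host.endswith(".substack.com"):
        return "substack_session.json"
    # Custom domains get their own file (Method B).
    return f"substack_session_{host}.json"
```

Under these assumptions, `session_file_for("https://www.lennysnewsletter.com")` yields the domain-specific file name shown above, while any `*.substack.com` URL maps to substack_session.json.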

Usage

Basic Scrape (HTML + Markdown + Images):

```
python scraper.py --url https://read.substack.com
```

Markdown Only (Best for Obsidian):

```
python scraper.py --url https://read.substack.com --md-only
```

Skip Podcasts:

```
python scraper.py --url https://newsletter.pragmaticengineer.com --skip-podcasts
```

Limit Number of Posts:

```
# Download only the 5 most recent posts
python scraper.py --url https://www.robkhenderson.com --limit 5
```
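
To show how the flags above fit together, here is a rough argparse sketch. Only the flag names (`--url`, `--md-only`, `--skip-podcasts`, `--limit`) come from this README; the helper names, the `"type": "podcast"` field, and the choice to filter podcasts before applying the limit are assumptions made for illustration.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Archive a Substack newsletter")
    parser.add_argument("--url", required=True, help="Newsletter base URL")
    parser.add_argument("--md-only", action="store_true", help="Write Markdown only")
    parser.add_argument("--skip-podcasts", action="store_true",
                        help="Skip podcast/audio episodes")
    parser.add_argument("--limit", type=int, default=None,
                        help="Download only the N most recent posts")
    return parser


def select_posts(posts: List[dict], skip_podcasts: bool,
                 limit: Optional[int]) -> List[dict]:
    """Apply --skip-podcasts, then --limit, to posts ordered newest first."""
    if skip_podcasts:
        # Assumed post shape: a "type" field that marks podcast episodes.
        posts = [p for p in posts if p.get("type") != "podcast"]
    return posts if limit is None else posts[:limit]
```

With this ordering, `--limit 5 --skip-podcasts` would return five non-podcast posts when enough exist; applying the limit first would instead return at most five posts minus any podcasts among them.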

Output

Downloaded posts are saved in the archive/ directory, organized by domain:

```
archive/
├── read.substack.com/
│   ├── assets/
│   │   ├── image1.jpg
│   │   └── ...
│   ├── 2023-10-01_some-post-title.md
│   └── 2023-10-01_some-post-title.html
└── ...
```
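
The layout above can be expressed as a small path builder. This is a sketch only: `output_paths` is a hypothetical name, and the date and slug are assumed to come from each post's metadata.

```python
from datetime import date
from pathlib import Path
from urllib.parse import urlparse


def output_paths(base_url: str, published: date, slug: str,
                 root: str = "archive") -> dict:
    """Build the per-post output locations under archive/<domain>/."""
    folder = Path(root) / (urlparse(base_url).hostname or "unknown")
    stem = f"{published.isoformat()}_{slug}"  # e.g. 2023-10-01_some-post-title
    return {
        "markdown": folder / f"{stem}.md",
        "html": folder / f"{stem}.html",
        "assets": folder / "assets",
    }
```

Prefixing the file name with the ISO date keeps posts sorted chronologically in a file browser.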

Disclaimer

This tool is for personal archiving purposes only. Please respect the copyright of the authors and do not redistribute paid content.
