Feishu Reader: Feishu Document Extraction Tool

Extract Feishu (Lark) cloud documents to high-quality Markdown, preserving tables, colors, strikethrough, code blocks, images, and other formatting.

Output is optimized for AI consumption: tables are emitted as text rather than as screenshots.

How It Works

Feishu Reader uses the Chrome DevTools Protocol (CDP) to read Feishu's in-page data model, window.PageMain.blockManager.rootBlockModel, and extracts document content directly from that block tree. No API keys or app credentials are needed.
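
As a rough illustration of the CDP access described above, the sketch below lists Chrome's debuggable tabs, picks a Feishu page, and sends a Runtime.evaluate request over the tab's WebSocket. It is not the project's actual code: the helper names, the "feishu.cn" URL filter, and the probe expression are assumptions made for the example, and the probe only checks that the model exists without assuming anything about its fields.

```python
import json
import urllib.request

CDP_HTTP = "http://127.0.0.1:9222"  # debug port mentioned in the Notes section

# Assumed probe: asks only whether the model object exists in the page.
PROBE = "typeof window.PageMain?.blockManager?.rootBlockModel"


def pick_page_target(targets, url_part="feishu.cn"):
    """Return the WebSocket URL of the first page tab whose URL contains url_part."""
    for target in targets:
        if target.get("type") == "page" and url_part in target.get("url", ""):
            return target.get("webSocketDebuggerUrl")
    return None


def build_evaluate_message(msg_id, expression):
    """Serialize a CDP Runtime.evaluate request as JSON text."""
    return json.dumps({
        "id": msg_id,
        "method": "Runtime.evaluate",
        "params": {"expression": expression, "returnByValue": True},
    })


def parse_evaluate_result(raw):
    """Extract the returned value from a Runtime.evaluate response."""
    reply = json.loads(raw)
    if "error" in reply:
        raise RuntimeError(reply["error"].get("message", "CDP error"))
    result = reply["result"]
    if "exceptionDetails" in result:
        raise RuntimeError("the page threw while evaluating the expression")
    return result["result"].get("value")


def probe_block_model():
    """Connect to a running Chrome and report the type of the block model."""
    import websocket  # websocket-client, the project's only pip dependency

    with urllib.request.urlopen(CDP_HTTP + "/json") as resp:
        targets = json.load(resp)
    ws_url = pick_page_target(targets)
    if ws_url is None:
        raise RuntimeError("no Feishu tab found")
    # Recent Chrome versions may reject connections that carry an Origin header.
    ws = websocket.create_connection(ws_url, suppress_origin=True)
    try:
        ws.send(build_evaluate_message(1, PROBE))
        while True:
            raw = ws.recv()
            if json.loads(raw).get("id") == 1:  # skip unrelated CDP events
                return parse_evaluate_result(raw)
    finally:
        ws.close()
```

With Chrome running in debug mode and a Feishu document open, probe_block_model() should return "object" when the data model is present and "undefined" otherwise.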

Supported content:

Text, headings (1-9), dividers, quotes, callouts
Ordered/unordered lists (auto-numbered), todo items, nested lists
Native tables + Sheet spreadsheet embeds (with cell styles)
Code blocks (with language), inline code, math formulas
Bold, italic, strikethrough, font color, background color, links
Image download, multi-column layout, iframe, mermaid diagrams
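
To make the block-tree-to-Markdown idea concrete, here is a minimal sketch that renders a simplified tree covering a few of the block types listed above. The dictionary schema ("type", "text", "level", "done", "lang", "children") is hypothetical and much simpler than Feishu's real block model. Clamping headings to six levels is also an assumption, made because Markdown defines only six heading levels while Feishu offers nine.

```python
FENCE = "`" * 3   # Markdown code fence
INDENT = "    "   # four spaces nest correctly under both "- " and "1. " items


def render(block, depth=0, index=1):
    """Render one block and its children to a list of Markdown lines."""
    kind = block["type"]
    text = block.get("text", "")
    pad = INDENT * depth
    if kind == "heading":
        lines = ["#" * min(block.get("level", 1), 6) + " " + text]
    elif kind == "bullet":
        lines = [f"{pad}- {text}"]
    elif kind == "ordered":
        lines = [f"{pad}{index}. {text}"]
    elif kind == "todo":
        mark = "x" if block.get("done") else " "
        lines = [f"{pad}- [{mark}] {text}"]
    elif kind == "code":
        lines = [FENCE + block.get("lang", ""), text, FENCE]
    elif kind == "divider":
        lines = ["---"]
    else:  # plain text and any unrecognized block type
        lines = [pad + text]
    lines.extend(render_children(block.get("children", []), depth + 1))
    return lines


def render_children(children, depth=0):
    """Render sibling blocks, numbering consecutive ordered items from 1."""
    lines = []
    counter = 0
    for child in children:
        # The counter restarts whenever a run of ordered items is interrupted.
        counter = counter + 1 if child["type"] == "ordered" else 0
        lines.extend(render(child, depth, counter))
    return lines


def to_markdown(blocks):
    """Join the rendered lines of a top-level block list."""
    return "\n".join(render_children(blocks))
```

Numbering is derived from sibling position rather than stored text, which is one simple way to get the auto-numbered ordered lists mentioned in the list above.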

Quick Start

1. Setup

# macOS / Linux
git clone https://github.com/hutiefang76/feishu-reader.git ~/feishu-reader
cd ~/feishu-reader && bash setup.sh

# CDN fallback if GitHub is unreachable (e.g. from mainland China)
curl -fSL http://dl.hutiefang.com/feishu-reader-latest.tar.gz | tar xz
cd feishu-reader && bash setup.sh

# Windows
setup.bat

The setup script detects and installs, in order: Python 3.8+, a virtual environment, websocket-client, and Chrome. It prompts before installing anything and falls back to China mirrors (Tsinghua, Aliyun, then the CDN) when the primary sources are unreachable.

2. Login

.venv/bin/python3 extract_feishu.py login

Scan the QR code or enter your credentials in the browser. The session is saved automatically to a local cache.

3. Extract

# Single document
.venv/bin/python3 feishu_skill.py extract "https://xxx.feishu.cn/docx/xxx"

# Batch extract
.venv/bin/python3 feishu_skill.py batch "url1" "url2" "url3"

# Specify the output path
.venv/bin/python3 feishu_skill.py extract "https://xxx.feishu.cn/docx/xxx" -o my_doc.md

Output is saved to the output/ directory by default.

4. Browse

.venv/bin/python3 feishu_skill.py list               # List extracted documents
.venv/bin/python3 feishu_skill.py search "keyword"   # Search document content
.venv/bin/python3 feishu_skill.py read "file.md"     # Read a document
.venv/bin/python3 feishu_skill.py status             # Check the environment
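
The same subcommands can be driven from a script. The wrapper below is a hedged sketch, not part of the project: build_command and run_skill are hypothetical helpers, the clone location ~/feishu-reader is taken from the Setup step, and the JSON decoding relies on this README's statement that all commands return JSON. The shape of that JSON is not assumed.

```python
import json
import subprocess
from pathlib import Path

REPO = Path.home() / "feishu-reader"         # clone location used in Setup
PYTHON = REPO / ".venv" / "bin" / "python3"  # virtualenv path on macOS / Linux


def build_command(subcommand, *args, output=None):
    """Build the argument list for one feishu_skill.py invocation."""
    cmd = [str(PYTHON), "feishu_skill.py", subcommand, *args]
    if output is not None:
        cmd += ["-o", output]  # -o is shown in this README for extract only
    return cmd


def run_skill(subcommand, *args, output=None):
    """Run a subcommand from the repo directory and decode its JSON output."""
    proc = subprocess.run(
        build_command(subcommand, *args, output=output),
        cwd=REPO,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(proc.stdout)
```

For example, run_skill("extract", "https://xxx.feishu.cn/docx/xxx", output="my_doc.md") mirrors the third Extract command. Because the arguments are passed as a list, no shell is involved and the URL needs no quoting.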

AI Integration

Claude Code

# 1. Clone and set up
git clone https://github.com/hutiefang76/feishu-reader.git ~/feishu-reader
cd ~/feishu-reader && bash setup.sh

# 2. Install the skill globally
mkdir -p ~/.claude/skills/feishu-extract
cp ~/feishu-reader/.claude/skills/feishu-extract/SKILL.md ~/.claude/skills/feishu-extract/

After installation, the skill is available in any project directory. Ask Claude to extract a Feishu document, or run /feishu-extract <url>.

Kiro

The Kiro skill is defined at .kiro/skills/feishu-extract.md. Type #feishu-extract in Kiro chat to use it.

Cursor / Windsurf / Other AI IDEs

Add the following to your AI IDE's knowledge base or rules file (e.g. CLAUDE.md, .cursorrules):

## Feishu Document Extraction

Commands (run from feishu-reader directory):
- Status:  .venv/bin/python3 feishu_skill.py status
- Extract: .venv/bin/python3 feishu_skill.py extract "<feishu_url>"
- Batch:   .venv/bin/python3 feishu_skill.py batch "<url1>" "<url2>"
- List:    .venv/bin/python3 feishu_skill.py list
- Search:  .venv/bin/python3 feishu_skill.py search "<keyword>"
- Read:    .venv/bin/python3 feishu_skill.py read "<file_path>"

All commands return JSON. They require a running Chrome instance and an active Feishu login.

MCP Server (optional)

.venv/bin/python3 feishu_skill.py mcp

HTTP API (optional)

.venv/bin/python3 feishu_skill.py serve --port 8900

File Structure

feishu_skill.py       - Skill layer: CLI + MCP Server + HTTP API
feishu_cdp.py         - CDP core: PageMain block tree → Markdown
feishu_common.py      - Shared: CDP communication, Chrome, Cookie/Session
extract_feishu.py     - Main entry script
setup.sh / setup.bat  - Environment setup (auto-installs Python/Chrome/deps)
requirements.txt      - Python dependency (websocket-client only)
output/               - Extracted documents
.claude/skills/       - Claude Code Skill definition
.kiro/skills/         - Kiro AI Skill definition

Requirements

Python 3.8+ (auto-installed by setup)
Google Chrome (auto-installed by setup)
macOS / Windows / Linux
Only pip dependency: websocket-client

Notes

Chrome runs in CDP debug mode (port 9222); setup.sh configures this automatically.
First use requires a Feishu login; the session is cached at ~/.cache/feishu-reader/cookies.json.
URLs must be quoted in zsh to prevent glob expansion.
pip install falls back automatically to China mirrors (Tsinghua, Aliyun).
setup.sh download chain: Google/PyPI → China mirrors → CDN dl.hutiefang.com.
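
The quoting note matters because Feishu share links often carry a query string, and zsh treats an unquoted ? as a glob character. The sketch below shows one way to build a shell-safe command line with Python's standard shlex.quote. The shell_line helper and the ?from=wiki query string are made up for illustration.

```python
import shlex


def shell_line(url, output=None):
    """Return an extract command line with every argument quoted as needed."""
    parts = [".venv/bin/python3", "feishu_skill.py", "extract", url]
    if output:
        parts += ["-o", output]
    # shlex.quote leaves safe arguments untouched and single-quotes the rest.
    return " ".join(shlex.quote(part) for part in parts)


print(shell_line("https://xxx.feishu.cn/docx/xxx?from=wiki"))
# .venv/bin/python3 feishu_skill.py extract 'https://xxx.feishu.cn/docx/xxx?from=wiki'
```

Single quotes stop zsh from interpreting ?, *, # and & inside the URL, so the command receives the link unchanged.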

CDN Fallback

Every downloaded dependency has a CDN fallback at dl.hutiefang.com (hosted on Qiniu Cloud):

| Resource | Primary | Fallback |
| --- | --- | --- |
| Source code | GitHub | `http://dl.hutiefang.com/feishu-reader-latest.tar.gz` |
| Chrome (Linux) | dl.google.com | CDN → `apt install chromium-browser` |
| websocket-client | PyPI → Tsinghua → Aliyun | `http://dl.hutiefang.com/websocket_client-1.9.0-py3-none-any.whl` |
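
The fallback order in the websocket-client row can be sketched as a simple try-in-order loop. This is an illustration of the idea rather than what setup.sh actually runs: the function names are hypothetical, and the mirror index URLs are the publicly documented Tsinghua and Aliyun PyPI endpoints, assumed here because the README does not list them.

```python
import subprocess
import sys

# Index order from the table: PyPI first, then the Tsinghua and Aliyun mirrors.
PIP_INDEXES = [
    "https://pypi.org/simple",
    "https://pypi.tuna.tsinghua.edu.cn/simple",  # assumed mirror URL
    "https://mirrors.aliyun.com/pypi/simple/",   # assumed mirror URL
]
# Last resort: the wheel hosted on the project's CDN (URL from the table).
CDN_WHEEL = "http://dl.hutiefang.com/websocket_client-1.9.0-py3-none-any.whl"


def candidate_commands(package="websocket-client"):
    """Yield pip install commands in fallback order."""
    pip = [sys.executable, "-m", "pip", "install"]
    for index in PIP_INDEXES:
        yield pip + ["--index-url", index, package]
    yield pip + [CDN_WHEEL]  # pip can install directly from a wheel URL


def install_with_fallback(run=subprocess.call):
    """Try each command until one exits with status 0; return it, or None."""
    for cmd in candidate_commands():
        if run(cmd) == 0:
            return cmd
    return None
```

Passing the runner as a parameter keeps the loop testable without touching the network; calling install_with_fallback() with no arguments runs the real pip commands.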

CDN maintenance:

# Update the tarball after a release
git archive --format=tar.gz --prefix=feishu-reader/ -o /tmp/feishu-reader-latest.tar.gz HEAD
qshell fput feishu-reader feishu-reader-latest.tar.gz /tmp/feishu-reader-latest.tar.gz --overwrite

License

MIT
