Extract Feishu (Lark) cloud documents to high-quality Markdown, preserving tables, colors, strikethrough, code blocks, images, and other formatting.
Output is optimized for AI consumption: tables are emitted as text rather than screenshots.
Uses the Chrome DevTools Protocol (CDP) to access Feishu's internal page data model, window.PageMain.blockManager.rootBlockModel, and extracts document content directly from the block tree. No API keys or app credentials are needed.
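As a rough illustration of the CDP approach, the sketch below lists Chrome's debuggable tabs through the HTTP endpoint on the debug port and sends a `Runtime.evaluate` request over the tab's WebSocket. It assumes Chrome is already running with remote debugging enabled and that websocket-client is installed. The helper names (`build_evaluate_message`, `find_feishu_tab`, `page_has_block_tree`) are made up for this example and are not the project's own functions; the probe only checks that the block tree exists rather than reading it.

```python
import json
import urllib.request

# Expression that only checks whether the block tree is present on the page.
PROBE = (
    "Boolean(window.PageMain && window.PageMain.blockManager"
    " && window.PageMain.blockManager.rootBlockModel)"
)


def build_evaluate_message(expression, msg_id=1):
    """Serialize a CDP Runtime.evaluate request (hypothetical helper)."""
    return json.dumps({
        "id": msg_id,
        "method": "Runtime.evaluate",
        "params": {"expression": expression, "returnByValue": True},
    })


def find_feishu_tab(targets):
    """Pick the first page target whose URL looks like a Feishu document."""
    for target in targets:
        if target.get("type") == "page" and "feishu.cn" in target.get("url", ""):
            return target
    return None


def page_has_block_tree(port=9222):
    """Ask the first Feishu tab whether rootBlockModel exists."""
    import websocket  # provided by the websocket-client package

    # Chrome lists its debuggable targets as JSON on the debug port.
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/json") as resp:
        tab = find_feishu_tab(json.load(resp))
    if tab is None:
        return False
    # suppress_origin avoids Origin-header rejections in recent Chrome builds.
    ws = websocket.create_connection(
        tab["webSocketDebuggerUrl"], suppress_origin=True
    )
    try:
        ws.send(build_evaluate_message(PROBE, msg_id=1))
        # Skip unrelated messages until the reply to request 1 arrives.
        while True:
            reply = json.loads(ws.recv())
            if reply.get("id") == 1:
                value = reply.get("result", {}).get("result", {}).get("value")
                return bool(value)
    finally:
        ws.close()
```

Calling `page_has_block_tree()` with a Feishu document open in the debug-mode Chrome should report whether the page exposes the data model; the real extractor presumably walks the tree from there.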
Supported content:
- Text, headings (1-9), dividers, quotes, callouts
- Ordered/unordered lists (auto-numbered), todo items, nested lists
- Native tables + Sheet spreadsheet embeds (with cell styles)
- Code blocks (with language), inline code, math formulas
- Bold, italic, strikethrough, font color, background color, links
- Image download, multi-column layout, iframe, mermaid diagrams
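To give a feel for how a block tree can map onto Markdown, here is a minimal recursive renderer. The block schema (`type`, `text`, `level`, `done`, `language`, `children`) is invented for this sketch and does not reflect Feishu's real block model or the project's converter. Clamping headings deeper than level 6 is likewise only one possible choice.

```python
FENCE = "`" * 3  # Markdown code fence, built here to keep the example readable


def render_block(block, depth=0):
    """Render one hypothetical block and its children to Markdown lines."""
    kind = block.get("type")
    text = block.get("text", "")
    indent = "  " * depth
    if kind == "heading":
        # Markdown has six heading levels, so deeper levels are clamped.
        level = min(block.get("level", 1), 6)
        lines = ["#" * level + " " + text]
    elif kind == "bullet":
        lines = [f"{indent}- {text}"]
    elif kind == "todo":
        mark = "x" if block.get("done") else " "
        lines = [f"{indent}- [{mark}] {text}"]
    elif kind == "code":
        lines = [FENCE + block.get("language", ""), text, FENCE]
    elif kind == "divider":
        lines = ["---"]
    else:
        # Plain text, or a container such as the page root with no text.
        lines = [text] if text else []
    # Children of list items are indented one level deeper.
    child_depth = depth + 1 if kind in ("bullet", "todo") else depth
    for child in block.get("children", []):
        lines.extend(render_block(child, child_depth))
    return lines


doc = {
    "type": "page",
    "children": [
        {"type": "heading", "level": 8, "text": "Title"},
        {"type": "bullet", "text": "a", "children": [
            {"type": "todo", "text": "b", "done": True},
        ]},
        {"type": "code", "language": "python", "text": "print(1)"},
    ],
}
markdown = "\n".join(render_block(doc))
```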
# macOS / Linux
git clone https://github.com/hutiefang76/feishu-reader.git ~/feishu-reader
cd ~/feishu-reader && bash setup.sh
# CDN fallback if GitHub is unreachable (e.g. from mainland China)
curl -fSL http://dl.hutiefang.com/feishu-reader-latest.tar.gz | tar xz
cd feishu-reader && bash setup.sh
# Windows
setup.bat
The setup script detects and installs, in order: Python 3.8+ → virtual environment → websocket-client → Chrome.
It prompts before installing anything and supports China mirror fallback (Tsinghua, Aliyun, CDN).
.venv/bin/python3 extract_feishu.py login
Scan the QR code or enter your credentials in the browser. The session is saved automatically to the local cache.
# Single document
.venv/bin/python3 feishu_skill.py extract "https://xxx.feishu.cn/docx/xxx"
# Batch extract
.venv/bin/python3 feishu_skill.py batch "url1" "url2" "url3"
# Specify output path
.venv/bin/python3 feishu_skill.py extract "https://xxx.feishu.cn/docx/xxx" -o my_doc.md
Output is saved to the output/ directory by default.
.venv/bin/python3 feishu_skill.py list # List extracted docs
.venv/bin/python3 feishu_skill.py search "keyword" # Search content
.venv/bin/python3 feishu_skill.py read "file.md" # Read document
.venv/bin/python3 feishu_skill.py status # Check environment
# 1. Clone & setup
git clone https://github.com/hutiefang76/feishu-reader.git ~/feishu-reader
cd ~/feishu-reader && bash setup.sh
# 2. Install skill globally
mkdir -p ~/.claude/skills/feishu-extract
cp ~/feishu-reader/.claude/skills/feishu-extract/SKILL.md ~/.claude/skills/feishu-extract/
After installation, the skill is available in any project directory. Ask Claude to extract a Feishu document or use /feishu-extract <url>.
The Kiro skill is defined at .kiro/skills/feishu-extract.md. Type #feishu-extract in Kiro chat to use it.
Add to your AI IDE knowledge base (e.g. CLAUDE.md, .cursorrules):
## Feishu Document Extraction
Commands (run from feishu-reader directory):
- Status: .venv/bin/python3 feishu_skill.py status
- Extract: .venv/bin/python3 feishu_skill.py extract "<feishu_url>"
- Batch: .venv/bin/python3 feishu_skill.py batch "<url1>" "<url2>"
- List: .venv/bin/python3 feishu_skill.py list
- Search: .venv/bin/python3 feishu_skill.py search "<keyword>"
- Read: .venv/bin/python3 feishu_skill.py read "<file_path>"
All commands return JSON. Requires Chrome running + Feishu login.
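Because the commands return JSON, they are easy to drive from a script. The wrapper below is a minimal sketch that assumes the JSON document is the only thing printed to stdout and that the script runs from the feishu-reader directory. `build_command` and `run_skill` are hypothetical helper names, and the shape of the returned JSON is not specified here.

```python
import json
import subprocess

PYTHON = ".venv/bin/python3"  # interpreter inside the project's virtual environment


def build_command(action, *args):
    """Build the argv list for a feishu_skill.py subcommand."""
    return [PYTHON, "feishu_skill.py", action, *args]


def run_skill(action, *args, cwd="."):
    """Run a subcommand and parse its stdout as JSON (assumed format)."""
    proc = subprocess.run(
        build_command(action, *args),
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)
```

A call such as `run_skill("extract", "https://xxx.feishu.cn/docx/xxx")` passes the URL as a single argv element, so no shell quoting is involved.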
# MCP server
.venv/bin/python3 feishu_skill.py mcp
# HTTP API
.venv/bin/python3 feishu_skill.py serve --port 8900
feishu_skill.py: Skill layer (CLI + MCP Server + HTTP API)
feishu_cdp.py: CDP core (PageMain block tree → Markdown)
feishu_common.py: Shared code (CDP communication, Chrome, Cookie/Session)
extract_feishu.py: Main entry script
setup.sh / setup.bat: Environment setup (auto-installs Python/Chrome/deps)
requirements.txt: Python dependency (websocket-client only)
output/: Extracted documents
.claude/skills/: Claude Code Skill definition
.kiro/skills/: Kiro AI Skill definition
- Python 3.8+ (auto-installed by setup)
- Google Chrome (auto-installed by setup)
- macOS / Windows / Linux
- Only pip dependency: websocket-client
- Chrome runs in CDP debug mode (port 9222); setup.sh configures this automatically
- First use requires Feishu login; the session is cached at ~/.cache/feishu-reader/cookies.json
- URLs must be quoted in zsh to prevent glob expansion
- pip install supports China mirror auto-fallback (Tsinghua, Aliyun)
- setup.sh download chain: Google/PyPI → China mirrors → CDN dl.hutiefang.com
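The quoting note matters most for URLs that carry a query string, since `?` is a glob character in zsh. When a command line is assembled programmatically, Python's `shlex.quote` can apply the quoting. The sketch below uses a made-up `?from=wiki` query parameter purely as an example, and `extract_command_line` is a hypothetical helper.

```python
import shlex


def extract_command_line(url):
    """Return a shell-safe command line for extracting one document."""
    return ".venv/bin/python3 feishu_skill.py extract " + shlex.quote(url)


# A URL with a query string is wrapped in single quotes.
quoted = extract_command_line("https://xxx.feishu.cn/docx/xxx?from=wiki")
# A URL made only of shell-safe characters is left unchanged.
plain = extract_command_line("https://xxx.feishu.cn/docx/xxx")
```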
Every downloaded dependency has a CDN fallback via dl.hutiefang.com (Qiniu Cloud):
| Resource | Primary | Fallback |
|---|---|---|
| Source code | GitHub | http://dl.hutiefang.com/feishu-reader-latest.tar.gz |
| Chrome (Linux) | dl.google.com | CDN → apt install chromium-browser |
| websocket-client | PyPI → Tsinghua → Aliyun | http://dl.hutiefang.com/websocket_client-1.9.0-py3-none-any.whl |
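The primary-then-fallback pattern in the table can be sketched as a small helper that tries each source in order and returns the first one that works. This is an illustration of the idea only, not how setup.sh is implemented; `first_successful` and the `fetch` callback are hypothetical names.

```python
def first_successful(sources, fetch):
    """Try each source in order; return (source, result) for the first success."""
    errors = []
    for source in sources:
        try:
            return source, fetch(source)
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            errors.append(f"{source}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Example with a fake fetcher where only the primary source is unreachable.
def fake_fetch(source):
    if source == "primary":
        raise OSError("unreachable")
    return b"data"


winner, payload = first_successful(["primary", "mirror", "cdn"], fake_fetch)
```

In real use, `fetch` could wrap `urllib.request.urlopen` and the source list could hold the URLs from the table in priority order.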
CDN maintenance:
# Update the tarball after each release
git archive --format=tar.gz --prefix=feishu-reader/ -o /tmp/feishu-reader-latest.tar.gz HEAD
qshell fput feishu-reader feishu-reader-latest.tar.gz /tmp/feishu-reader-latest.tar.gz --overwrite
License: MIT