A configurable single-surface workbench framework for focused, long-running creative work.
Answer 7 questions → matched extension → live config preview → apply.
You answer 7 questions about your domain. The framework generates a workbench tailored to it — advisor panel, knowledge taxonomy, prompt tone, surface layout — and you can keep customizing from there. Domain content is plug-and-play via extension folders. The framework ships two extensions today (AI research, biology research); writing your own is one folder of YAML.
The reference instance — ProjectScribe — is the maintainer's AI co-founder daily-driver, used to build WorkspaceOS itself.
Phase 1 = content extensions (personas, taxonomies, prompts). Phase 2 = capability extensions (Gmail / Calendar / Slack ingest). Phase 2 schema is already reserved so manifests authored today stay forward-compatible.
A bench with six opt-in surfaces, each driven by your domain config:
| Letter | Surface | What it does |
|---|---|---|
| A | Advisor | Chat with a cofounder advisor panel. 3–4 advisors weigh in per message. |
| R | Research | Parallel critique from a research reviewer panel. 5–6 reviewers, distinct lenses. |
| D | Drafts | Blog and social drafts (per-project, paginated). |
| P | Papers | Research papers — single + portfolio. Multi-agent v2 pipeline. |
| K | Knowledge | Cross-project graph of decisions / claims / hypotheses extracted from chat. |
| W | Worklog | Weekly / monthly / quarterly progress reports. |
Plus a ⌘K command palette, slide-in project inspector, and a
right-side TUI log streaming every AI call, sync, and extraction in
real time.
Prerequisite: Docker. Install
Docker Desktop (macOS /
Windows / Linux) or OrbStack (faster on macOS).
Make sure docker compose works in your terminal before continuing.
git clone https://github.com/Chesterguan/WorkspaceOS.git
cd WorkspaceOS
cp .env.example .env
# Edit .env — minimum required is GEMINI_API_KEY. Everything else has
# defaults that work for local development.
docker compose up --build -d
# Bench: http://localhost:4000
# Backend API: http://localhost:9000/docs

First load redirects you to /login. Register an account, then
/onboarding walks you through 7 questions and generates a workbench.
You can skip the wizard and use the default config at any time.
An extension is a single folder under config/extensions/<id>/:
config/extensions/bio-research/
├── manifest.yaml          # match rules + version + path refs
├── personas/
│   ├── cofounder.yaml     # 3–4 cofounder personas
│   └── research.yaml      # 5–6 research reviewers
├── taxonomies/extra.yaml  # node types added to the base 7
└── prompts/worklog/
    ├── weekly.txt
    ├── monthly.txt
    └── quarterly.txt
manifest.yaml is just YAML — no Python, no JS, no build step:
id: bio-research
name: Bio Research
description: Persona panel + taxonomy for wet-lab biology and biofoundry.
version: 0.1.0
author: workspaceos
matches:
  domain_keywords: [bio, biotech, biofoundry, synthetic biology, strain, crispr]
  audience_any: [peer_researchers]
  outputs_any: [papers]
personas:
  cofounder: ./personas/cofounder.yaml
  research: ./personas/research.yaml
taxonomy_extra: ./taxonomies/extra.yaml
worklog_templates:
  weekly: ./prompts/worklog/weekly.txt
  monthly: ./prompts/worklog/monthly.txt
  quarterly: ./prompts/worklog/quarterly.txt

Adding a new extension is one folder drop:

cp -r config/extensions/bio-research config/extensions/your-domain

- Edit manifest.yaml: change id, name, and matches.domain_keywords
- Rewrite the persona / taxonomy / prompt files for your domain
- Restart the backend (docker compose restart backend)
The wizard's matcher scores each extension against the user's answers:
- domain_keywords substring hit = +2 each
- audience_any overlap = +1 each
- outputs_any overlap = +1 each
Threshold is 2. Highest-scoring extension above threshold wins. No match → falls back to Gemini synthesis → falls back to a deterministic bucket stub.
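The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the framework's actual matcher: `MatchRules`, `score`, and `pick_extension` are hypothetical names, keyword matching is assumed to be case-insensitive, and "above threshold" is assumed to mean a score of at least 2.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

THRESHOLD = 2  # minimum score stated in the text


@dataclass
class MatchRules:
    """Hypothetical container for a manifest's `matches` section."""
    domain_keywords: List[str] = field(default_factory=list)
    audience_any: List[str] = field(default_factory=list)
    outputs_any: List[str] = field(default_factory=list)


def score(rules: MatchRules, domain: str, audience: List[str], outputs: List[str]) -> int:
    """+2 per keyword found in the free-text domain, +1 per audience/output overlap."""
    text = domain.lower()
    keyword_hits = sum(1 for kw in rules.domain_keywords if kw.lower() in text)
    audience_hits = len(set(rules.audience_any) & set(audience))
    output_hits = len(set(rules.outputs_any) & set(outputs))
    return 2 * keyword_hits + audience_hits + output_hits


def pick_extension(
    extensions: Dict[str, MatchRules],
    domain: str,
    audience: List[str],
    outputs: List[str],
) -> Optional[Tuple[str, int]]:
    """Return (extension id, score) for the best match, or None to trigger the fallbacks."""
    best: Optional[Tuple[str, int]] = None
    for ext_id, rules in extensions.items():
        s = score(rules, domain, audience, outputs)
        # Assumption: a score equal to the threshold counts as a match.
        if s >= THRESHOLD and (best is None or s > best[1]):
            best = (ext_id, s)
    return best
```

With the bio-research rules shown earlier, a domain of "CRISPR screens" with papers as an output scores 3 (one keyword hit plus one output overlap), which clears the threshold.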
See CONTRIBUTING.md for the full extension authoring guide.
- User answers 7 questions at /onboarding: domain (free text), primary outputs, audience, dream advisor panel, what you track, cadence, stage.
- Backend matches extensions. Scores each shipped extension's matches rules against the answers.
- Generator builds the config:
  - If an extension matches → splice its bundled files verbatim, emit "Matched extension: X (score N)" event.
  - Else if GEMINI_API_KEY is set → one LLM call returns personas + taxonomy additions + tagline.
  - Else → deterministic bucket stub (CS / biology / economics).
- SSE streams progress captions to the wizard's wait animation (5-chapter SVG tutorial loops independently). The same events also flow into the bench's right-side TUI log so the user can see what ran after they navigate back.
- Preview pane shows generated personas, taxonomy chips, worklog template sample, raw YAML disclosure. Apply writes files into config/, triggers a live reload, and marks the user as onboarded. Regenerate re-rolls.
Total wall-clock: ~15s for extension match, ~10s for Gemini, instant for the bucket stub.
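The generator's three-way fallback described in the steps above could be expressed roughly as follows. This is a sketch under assumptions: `choose_generation_path` is a hypothetical helper, and only the "Matched extension" caption comes from the text; the other two captions are invented for illustration.

```python
from typing import Optional, Tuple


def choose_generation_path(
    match: Optional[Tuple[str, int]],
    gemini_api_key: Optional[str],
) -> Tuple[str, str]:
    """Return (path, progress caption) following the order described in the text."""
    if match is not None:
        ext_id, score = match
        # Caption format taken from the text above.
        return "extension", f"Matched extension: {ext_id} (score {score})"
    if gemini_api_key:
        # Illustrative caption, not from the source.
        return "gemini", "No extension matched; synthesizing config with Gemini"
    # Illustrative caption, not from the source.
    return "bucket_stub", "No extension matched and no API key; using bucket stub"
```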
- Frontend: Next.js 16 (App Router, Suspense, proxy.ts middleware), Tailwind v4, shadcn/ui, motion (Framer), React Flow + dagre for the knowledge graph. Port 4000.
- Backend: FastAPI (async), PostgreSQL 15 + pgvector (768-dim IVFFlat), Server-Sent Events for the bench log + wizard generation. Port 9000.
- AI: Hybrid. Local Ollama (nomic-embed-text) for embeddings when available; Gemini for generation + long-tail wizard fallback; OpenAI for paper roundtable reviewers (optional).
- Deployment: Docker Compose, three services (db, backend, frontend) on the workspaceos network. Auth: JWT for users, X-API-Key for scripts and SSE query-param.
| Required | Optional |
|---|---|
| Gemini API key: chat / drafts / papers / extraction / embeddings-fallback. Free tier works for testing. | OpenAI key: only used by the paper roundtable reviewers. Papers still generate without it. |
| | Ollama running locally: free local embeddings. Falls back to Gemini if absent. |
| | GitHub token: repo sync, deep repo context, release publishing. |
| | LinkedIn / Dev.to / Hashnode keys: multi-platform publishing. |
All API keys can be set at runtime through the Settings page
(Fernet-encrypted in the DB) instead of .env.
By default WorkspaceOS sends generation/review prompts to Gemini (or
whichever CLOUD_AI_PROVIDER you pick) and uses Ollama for embeddings
if you've installed it. If you want everything local — no cloud
calls for any AI operation — install Ollama and pull a chat model
sized to your hardware.
Install:
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh
# or download the installer from https://ollama.com/download
ollama serve # leave running in a terminal

Pull recommended models:
# Embeddings — small, fast, used for all semantic search (~274 MB)
ollama pull nomic-embed-text
# Chat — pick ONE based on how much RAM you have:
ollama pull qwen2.5:3b # 4–8 GB RAM, fast, decent for extraction
ollama pull qwen2.5:7b # 16 GB RAM (recommended default)
ollama pull qwen2.5:14b # 32 GB RAM, near cloud-quality for chat
ollama pull llama3.3:70b # 64 GB RAM + GPU, full cloud-replacement

qwen2.5 is recommended for structured-output tasks (extraction,
classification) — it follows JSON-shaped prompts more reliably than
similarly-sized Llama variants. For long-form writing (drafts,
papers), qwen2.5:14b and up are competitive; below that the cloud
provider noticeably wins.
Point WorkspaceOS at it — set in .env:
# macOS / Windows Docker: host.docker.internal resolves to the host.
OLLAMA_BASE_URL=http://host.docker.internal:11434
# Linux: host.docker.internal doesn't resolve by default. Either use
# the Docker bridge gateway IP (typically 172.17.0.1 for the default bridge), or
# add `extra_hosts: ["host.docker.internal:host-gateway"]` to the backend
# service in docker-compose.yml and keep the line above.
# OLLAMA_BASE_URL=http://172.17.0.1:11434
OLLAMA_CHAT_MODEL=qwen2.5:7b # match what you pulled
OLLAMA_EMBED_MODEL=nomic-embed-text
# Stay local end-to-end (otherwise only embeddings/extraction are local):
LOCAL_AI_PROVIDER=ollama
CLOUD_AI_PROVIDER=ollama

Trade-offs to know:

- Quality: small Ollama models (≤7B) write noticeably blander drafts than Gemini/Claude. Use qwen2.5:14b or larger if you care about prose quality on local-only.
- Latency: a 7B model on an M2 Mac generates ~30–60 tokens/sec; a chat reply takes seconds, paper review takes minutes. Cloud is usually faster.
- First load: each model loads into RAM on first use of a session (~10–30 seconds). Keep ollama serve running to avoid cold starts.
- GPU: not strictly required, but qwen2.5:14b and up are painfully slow without one. M-series Macs use the integrated GPU automatically.
Beyond content (personas / taxonomies / prompts), extensions ship
capabilities — runtime hooks that pull data, add palette entries,
and put context buttons on items. Capability code lives in the
framework (backend/app/capabilities/), registered by name. Manifests
declare which runners to enable.
Three capability kinds are runtime-active today:
- ingest_source: async runner polled on a schedule. Emits bench events + inserts knowledge graph nodes. Example: local-files-watcher watches a directory every 30s and creates a file_ingested node per new file.
- slash_command: palette entry (⌘K). Two handler kinds: api_call triggers a registered backend runner; navigate pushes a route. Example: "Scan local files now" (api_call) and "Open knowledge graph" (navigate).
- action_button: contextual button rendered on a target item (currently knowledge_node). visible_when filters by the item's fields. Example: "Mark as decision" shows on claim / hypothesis / insight / question nodes; "Archive" shows on all nodes.
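The visible_when filter on action buttons is described elsewhere in this README as an "AND-of-ORs" filter. One plausible reading, sketched below, is that every listed field must match (AND) and each field matches if the item's value is any of the listed options (OR). `is_visible` is a hypothetical helper, and treating a missing filter as "always visible" is an assumption based on the "Archive shows on all nodes" example.

```python
from typing import Any, Dict, List, Optional


def is_visible(visible_when: Optional[Dict[str, List[Any]]], item: Dict[str, Any]) -> bool:
    """AND across fields, OR across the allowed values of each field."""
    if not visible_when:
        # Assumption: no filter means the button shows on every item.
        return True
    return all(item.get(name) in allowed for name, allowed in visible_when.items())
```

For example, a button declared with `node_type: [claim, hypothesis]` would render on a claim node and stay hidden on a decision node.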
WorkspaceOS is the secretary, not the replacement. Benchling, Zotero, Mail, etc. stay where they are — capabilities just pull breadcrumbs into the knowledge graph so the bench has context across your tools.
Shipped extensions:
- local-files-watcher (ingest_source: local_files). Walks a directory under WORKSPACE_HOST_PATH, dedups by mtime+size, caps 100 files/tick, skips dot-dirs + node_modules + .git.
- macos-mail (ingest_source: macos_mail). Host-side AppleScript bridge via scripts/outlook_bridge/install.sh. Reads Apple Mail + Outlook for Mac, POSTs to /skills/local-ingest/items. No in-container code because Mail.app isn't accessible from Docker.
- benchling (ingest_source: benchling_import). Pulls recent notebook entries from your Benchling tenant every 6h as benchling_entry knowledge nodes. Title + author + date + link. Body content stays in Benchling. Set api_key + tenant in the extension manifest to enable.
- zotero (ingest_source: zotero_sync). Pulls top-level library items every 6h as paper_reference knowledge nodes. Title + first author + year + DOI + venue. Set api_key + library_id + library_type to enable.
- bench-extras: utility pack with 2 slash commands + 2 action buttons. Use as the working example when authoring your own.
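To make the local-files-watcher behavior concrete, here is a minimal sketch of one polling tick using only the standard library. It is not the shipped runner: `scan_once` and the in-memory `seen` dict are hypothetical, and the real runner presumably persists its dedup state and emits bench events rather than returning paths.

```python
import os
from typing import Dict, List, Tuple

MAX_FILES_PER_TICK = 100      # cap stated in the text
SKIP_DIRS = {"node_modules"}  # dot-dirs such as .git are skipped by the prefix check


def scan_once(root: str, seen: Dict[str, Tuple[float, int]]) -> List[str]:
    """Return files that are new or changed since the last tick, up to the cap."""
    changed: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if not d.startswith(".") and d not in SKIP_DIRS]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            fingerprint = (st.st_mtime, st.st_size)
            if seen.get(path) == fingerprint:
                continue  # same mtime+size: already ingested
            seen[path] = fingerprint
            changed.append(path)
            if len(changed) >= MAX_FILES_PER_TICK:
                return changed
    return changed
```

Calling `scan_once` twice on an unchanged directory returns the files on the first call and an empty list on the second, which is the dedup behavior the description implies.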
Added in v0.2.6 (bio-researcher build) — alongside a
Knowledge-surface upgrade that links Experiment nodes to the
Claim / Construct nodes they support, so the graph reflects the
actual evidence trail:
- preprints (ingest_source: preprint_ingest). Polls bioRxiv (and optionally medRxiv) daily for keyword-matching preprints and adds them as paper_reference nodes, the same type as Zotero items, so the research panel and paper pipeline pick them up automatically. Opt-in: empty keyword list = nothing imported.
- ot2-protocols (ingest_source: ot2_protocols). Walks a directory of Opentrons OT-2 Python protocols and creates one protocol node per file (name, author, API level, labware, source excerpt). Re-ingested on content change.
- methods-drafter (slash_command: draft_methods). Drafts a publication-style Methods section grounded in the project's actual knowledge graph: experiments, constructs (Benchling), strains, OT-2 protocols, custom GitHub tools. Default style plant_synbio; switch via the slash payload. Saved as a Papers-surface draft tagged paper + methods_draft.
- github-tools (ingest_source: github_user_tools). Syncs your public GitHub repos once a day as tool nodes (name, description, language, stars, truncated README). Citable by /draft_methods when writing Methods. PAT optional but lifts the rate limit from 60 to 5000 req/h.
| Where | Shows |
|---|---|
| Settings → Capabilities tab | Every declared capability grouped by kind. Each ingest source has a Configure button → modal form with field-level help, auto-fill, and a test-connection check. Encrypted at rest. No YAML editing, no restart. Link to the docs setup guide right next to each row. |
| ⌘K command palette | Slash commands appear inline with built-in entries. Type to filter; click to fire. |
| In context | Action buttons render on the item they target — e.g. an "Extension actions" row on the knowledge node detail panel. Gated by visible_when so menus stay clean. |
- Open Settings → Capabilities.
- Find the ingest source you want (e.g. Zotero). Click Configure.
- The modal shows the fields with inline help. Paste your API key.
- Click Auto-fill (where supported; e.g. Zotero introspects the key and fills library_id + library_type automatically).
- Click Test: the runner fires once and reports success or the specific error if anything's off.
- Click Save: the overlay is Fernet-encrypted in the DB. The next poll tick uses the new config; no restart.
For technical users the YAML path still works: edit
config/extensions/<id>/manifest.yaml and restart. DB overlay wins
when both are set.
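The precedence rule ("DB overlay wins when both are set") might look like the following. This is only a sketch: `effective_config` is a hypothetical helper, and it assumes a shallow, per-field merge; the source does not say whether the overlay replaces individual fields or the whole config block.

```python
from typing import Any, Dict, Optional


def effective_config(
    manifest_config: Dict[str, Any],
    db_overlay: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Start from the manifest values, then let DB overlay fields override them."""
    merged = dict(manifest_config)  # copy so the manifest dict is not mutated
    merged.update(db_overlay or {})
    return merged
```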
See CONTRIBUTING.md for the full author guide. Quick shape per kind:
# config/extensions/your-id/manifest.yaml
capabilities:
  # 1. Pull external data into the bench on a schedule.
  - kind: ingest_source
    name: my_runner # ← key in backend/app/capabilities/registry.py
    config:
      poll_interval_seconds: 60
      # your runner's config fields
  # 2. Palette entry (⌘K).
  - kind: slash_command
    name: do_thing
    config:
      label: "Do the thing"
      keywords: [thing, do]
      icon: zap
      # handler_kind: api_call → POSTs to handler_target
      # handler_kind: navigate → router.push(handler_target)
      handler_kind: api_call
      handler_target: /capabilities/runners/do_thing/trigger
  # 3. Button on a specific item kind.
  - kind: action_button
    name: tag_with_x
    config:
      label: "Tag with X"
      target: knowledge_node # which item renderer this attaches to
      handler_kind: api_call
      visible_when: # AND-of-ORs filter
        node_type: [claim, hypothesis]

Then register the Python runner / handler in
backend/app/capabilities/registry.py (or slash.py / actions.py)
and PR. Trust model = "registry as audit surface": capability code
ships with the framework, manifests reference runners by name. No
arbitrary file-drop, no eval, no extension-injected JSX.
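The "registry as audit surface" idea can be illustrated with a small name-to-callable registry. Everything here is hypothetical (the `register` decorator, `resolve`, and the runner signature are not taken from registry.py); the point is only that a manifest can name a runner but cannot supply code, so an unknown name fails loudly.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical runner signature: takes the manifest's config, returns a result dict.
Runner = Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]

_RUNNERS: Dict[str, Runner] = {}


def register(name: str) -> Callable[[Runner], Runner]:
    """Decorator that records a runner under the name manifests will reference."""
    def decorator(fn: Runner) -> Runner:
        if name in _RUNNERS:
            raise ValueError(f"runner {name!r} is already registered")
        _RUNNERS[name] = fn
        return fn
    return decorator


def resolve(name: str) -> Runner:
    """Look up a runner by manifest name; unregistered names are rejected."""
    try:
        return _RUNNERS[name]
    except KeyError:
        raise LookupError(f"manifest references unregistered runner {name!r}") from None


@register("my_runner")
async def my_runner(config: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder body: a real runner would pull data and emit bench events.
    return {"polled_every": config.get("poll_interval_seconds", 60)}
```

A caller would then run `asyncio.run(resolve("my_runner")({"poll_interval_seconds": 30}))`, while `resolve("anything_else")` raises `LookupError`.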
- More capability runners: Gmail (OAuth), Calendar (CalDAV / Google), Slack, Notion. Contributions welcome.
- surface_widget capability kind: sub-component injected into an existing surface. Manifest schema reserves it today; runtime activation arrives next. slash_command and action_button are already shipped (see bench-extras, methods-drafter).
- More content extensions: indie-founder, phd-student, engineering-manager. Contributions welcome.
- Settings → "Personalize": re-run the wizard with prefilled prior answers.
- Custom surface types: not on the roadmap. The 6 surface types cover the framework's scope. Surface code stays in core.
Research roundtable personas can declare a grounding hint in their
YAML — at chat time, WorkspaceOS looks up the persona's recent papers
via Semantic Scholar (24h-cached) and prepends the paper titles to
the system prompt. "Drew Endy says X" is anchored to actual Endy
publications instead of fabricated.
- id: drew_endy
  name: Drew Endy
  color: "#ef4444"
  system_prompt: |
    You are Drew Endy …
  grounding:
    source: semantic_scholar # other sources reserved
    query: "Drew Endy synthetic biology"
    max_papers: 5

The bio-research extension's reviewers all ship with grounding set.
Fails gracefully — if Semantic Scholar is down or rate-limited, the
persona reverts to ungrounded behavior; the chat keeps working.
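The grounding step and its graceful fallback could be sketched like this. The function name, the header wording, and the injected `fetch_titles` callable are all assumptions; the actual Semantic Scholar lookup and its 24h cache are left out so the example stays self-contained.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical fetcher signature: (query, max_papers) -> list of paper titles.
TitleFetcher = Callable[[str, int], List[str]]


def build_system_prompt(
    base_prompt: str,
    grounding: Optional[Dict[str, Any]],
    fetch_titles: TitleFetcher,
) -> str:
    """Prepend recent paper titles to the persona prompt; fall back to ungrounded on any failure."""
    if not grounding:
        return base_prompt
    try:
        titles = fetch_titles(grounding["query"], grounding.get("max_papers", 5))
    except Exception:
        # Lookup down or rate-limited: keep the chat working without grounding.
        return base_prompt
    if not titles:
        return base_prompt
    # Header wording is illustrative, not taken from the source.
    header = "Recent papers by this persona:\n" + "\n".join(f"- {t}" for t in titles)
    return header + "\n\n" + base_prompt
```

Injecting the fetcher keeps the failure path easy to exercise: a fetcher that raises leaves the prompt exactly as written in the persona YAML.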
The bench has a floating Feedback button (bottom-right). Click it,
write what broke or what you wished it did, and the backend files a
GitHub issue on Chesterguan/WorkspaceOS (configurable via
FEEDBACK_REPO in .env) with auto-captured page context — current
surface, project id, URL, last 10 bench events. Issue labeled
user-feedback + bug / enhancement / question.
Needs GITHUB_TOKEN with issues:write scope. Disabled gracefully
if the token is missing — the modal returns a clear error instead of
silently failing.
OSS-targeted, MIT licensed (see LICENSE). The bench, six surfaces, extension framework, onboarding wizard, knowledge graph, worklog generator, and paper pipeline v2 all work today. Multi-tenant deployment is not yet hardened — see Security notes in CONTRIBUTING.md.
- Next.js 16 conformance: uses proxy.ts (not deprecated middleware.ts), useSearchParams wrapped in <Suspense>, dynamic params via use().
- Knowledge dedup: per-user asyncio.Lock serializes concurrent advisor extractions so cosine-near nodes from one roundtable turn merge instead of duplicating.
- Event SSE auth: falls back to a ?api_key= query string because EventSource can't set custom headers. Fine for a single-tenant demo, not safe for shared deployment without a short-lived SSE token exchange.
- Reduced motion respected: WCAG 2.3.3 honored globally.
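The knowledge-dedup note above relies on a per-user lock around a check-then-insert sequence. A minimal sketch of that pattern follows; `merge_node`, the in-memory `store`, and the exact-match comparison are stand-ins (the real system compares embeddings by cosine similarity and writes to the database).

```python
import asyncio
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# One lock per user id, created lazily inside the running event loop.
_user_locks: DefaultDict[str, asyncio.Lock] = defaultdict(asyncio.Lock)


async def merge_node(user_id: str, label: str, store: Dict[str, List[str]]) -> bool:
    """Insert the node unless an equivalent one exists. Returns True if inserted."""
    async with _user_locks[user_id]:
        nodes = store.setdefault(user_id, [])
        key = label.lower()
        exists = key in nodes   # stand-in for a cosine-similarity lookup
        await asyncio.sleep(0)  # simulated I/O between the check and the write
        if exists:
            return False
        nodes.append(key)
        return True


async def demo() -> Tuple[List[bool], Dict[str, List[str]]]:
    """Three advisors extract the same claim in one turn; only one node should survive."""
    store: Dict[str, List[str]] = {}
    results = await asyncio.gather(
        *(merge_node("u1", "Use pgvector", store) for _ in range(3))
    )
    return list(results), store
```

Without the lock, all three coroutines could pass the existence check before any of them writes, leaving three duplicate nodes; with it, the check and the write happen as one unit per user.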
Pull requests welcome — especially new content extensions. See CONTRIBUTING.md for the authoring guide.
MIT. See LICENSE.


