ethanplusai · oguzseran-max · May 30, 2026 · May 30, 2026 · May 30, 2026 · May 31, 2026
diff --git a/.gitignore b/.gitignore
@@ -1,11 +1,16 @@
 # Environment
 .env
 .env.local
+.env.save
+
+# Runtime logs / pids (start_jarvis.sh)
+.run/
 
 # Dependencies
 node_modules/
 .venv/
 venv/
+whisper-venv/
 
 # Python
 __pycache__/
@@ -39,3 +44,7 @@ desktop-overlay/node_modules/
 context/
 *.db-shm
 *.db-wal
+
+# Gmail OAuth secrets
+gmail_credentials.json
+gmail_token.json
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -1,3 +1,7 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
 # JARVIS — Voice AI Assistant
 
 ## Overview
@@ -16,14 +20,47 @@ When a user clones this repo and starts Claude Code, help them:
 9. Open Chrome to http://localhost:5173
 10. Click to enable audio, speak to JARVIS
 
+## Commands
+Run the app in two terminals: `python server.py` (backend, secure WebSocket — needs `cert.pem`/`key.pem`) and `cd frontend && npm run dev` (frontend on http://localhost:5173, must be Chrome for the Web Speech API).
+
+Frontend build/typecheck: `cd frontend && npm run build` (runs `tsc` then `vite build`).
+
+Tests live in `tests/` in two styles, and most call the real Anthropic API, so `ANTHROPIC_API_KEY` must be set (tests self-load `.env`):
+- pytest suites: `pytest tests/`; single test by name: `pytest tests/test_e2e_pipeline.py -k <name>`
+- standalone scripts (have `__main__`): `python3 tests/test_classifier.py`
+- `pytest`/`pytest-asyncio` are NOT in `requirements.txt` — install them separately to run the pytest suites.
+
+Live quality monitor (run alongside the server): `python monitor.py` tails server logs and flags low-quality conversations.
+
 ## Architecture
-- **Backend**: FastAPI + Python (server.py, ~2300 lines)
+- **Backend**: FastAPI + Python (server.py, ~2700 lines)
 - **Frontend**: Vite + TypeScript + Three.js (audio-reactive orb)
 - **Communication**: WebSocket (JSON messages + binary audio)
 - **AI**: Claude Haiku for fast responses, Claude Opus for research
 - **TTS**: Fish Audio with JARVIS voice model
 - **System**: AppleScript for Calendar, Mail, Notes, Terminal integration
 
+### Request pipeline
+`server.py` is an intentional ~2700-line monolith (see CONTRIBUTING.md) and is the orchestrator; the `/ws/voice` handler is the core loop:
+1. Frontend captures mic audio (`audio_capture.ts`, MediaRecorder + VAD) and streams each utterance as binary over the WebSocket. The backend transcribes it via a local Whisper service (`whisper_service.py`, runs in `whisper-venv` / Python 3.12 on :8765) which also returns the language. A legacy browser-Web-Speech transcript path still exists as a fallback.
+2. `classify_intent()` calls Haiku (`claude-haiku-4-5-20251001`) to pick an intent and emit an `[ACTION:*]` tag.
+3. `execute_action` (actions.py) routes the tag to a system integration or a Claude Code spawn.
+4. Reply text → Fish Audio TTS → streamed back as binary audio while the orb reacts.
+
+Heavier paths: deep research uses the larger Opus model (`claude-opus-4-6`) to write an HTML report, then opens it in the browser and speaks a Haiku summary; rolling session summaries run on Haiku in the background. Adding a capability usually means a new action tag + a classifier prompt update + a handler.
+
+### Two ways to spawn Claude Code
+- **Build dispatch** (`actions.py` + `dispatch_registry.py`): one-shot `claude -p` builds; `dispatch_registry` persists what's building / just-finished so JARVIS knows what "it" refers to.
+- **Work mode** (`work_mode.py`): persistent sessions tied to a project dir, resumed with `--continue`. `planner.py` runs a conversational plan→clarify→confirm flow before spawning.
+
+### Self-improvement loop
+A feedback system tunes the prompts sent to Claude Code. The modules form a chain and only make sense when read together: `templates.py` (prompt templates by task type) → `ab_testing.py` (assigns template versions) → `qa.py` (spawns `claude -p` to verify output, auto-retries) → `tracking.py` (success rates) → `evolution.py` (analyzes failures, generates improved template versions) → `learning.py` (request patterns / context pre-loading) → `suggestions.py` (one heuristic follow-up per task). `conversation.py` holds multi-turn planning context.
+
+### Storage — two separate SQLite DBs
+These are NOT shared; confirm which one a module uses before touching persistence:
+- `data/jarvis.db` — `memory.py` (FTS5 full-text memory) and `dispatch_registry.py`
+- `jarvis_data.db` (repo root) — `tracking.py`, `learning.py`, `ab_testing.py`, `evolution.py`
+
 ## Key Files
 - `server.py` — Main server, WebSocket handler, LLM integration, action system
 - `frontend/src/orb.ts` — Three.js particle orb visualization
@@ -42,12 +79,18 @@ When a user clones this repo and starts Claude Code, help them:
 - `FISH_API_KEY` (required) — Fish Audio TTS
 - `FISH_VOICE_ID` (optional) — Voice model ID
 - `USER_NAME` (optional) — Your name for JARVIS to use
-- `CALENDAR_ACCOUNTS` (optional) — Comma-separated calendar emails
+- `CALENDAR_ACCOUNTS` (optional) — Comma-separated calendar emails (empty = auto-discover all)
+- `JARVIS_SKIP_PERMISSIONS` (optional) — Defaults to `true`; the voice loop can't answer interactive `claude` permission prompts (they'd hang the subprocess). Set `false` only when running in a visible Terminal.
+- Weather overrides (optional): `WEATHER_LOCATION_LABEL`, `WEATHER_LATITUDE`, `WEATHER_LONGITUDE`, `WEATHER_UNIT` — defaults to public-IP geolocation, Fahrenheit.
 
 ## Conventions
 - JARVIS personality: British butler, dry wit, economy of language
 - Max 1-2 sentences per voice response
-- Action tags: [ACTION:BUILD], [ACTION:BROWSE], [ACTION:RESEARCH], etc.
-- AppleScript for all macOS integrations (no OAuth needed)
-- Read-only for Mail (safety by design)
+- Action tags: [ACTION:BUILD], [ACTION:BROWSE], [ACTION:RESEARCH], [ACTION:SCREEN], [ACTION:CAMERA], [ACTION:SENTIMENT], etc.
+- Market sentiment ([ACTION:SENTIMENT] / `_do_sentiment_lookup`): runs the external kukapay `market-sentiment` skill analyzer as a subprocess and speaks a one-line mood score. The script lives outside the repo at `~/bybit-mcp/.agents/skills/market-sentiment/scripts/sentiment_analyzer.py` and needs `requests`, so it's invoked with `SENTIMENT_PYTHON` (defaults to the bybit-mcp venv). Override both via `SENTIMENT_PYTHON` / `SENTIMENT_SCRIPT` env vars. News-based only — never present as trading advice.
+- Multilingual voice (English/French/Turkish): a top-left EN/FR/TR toggle sends `{type:"set_lang"}`; the chosen language is FORCED for Whisper transcription, the LLM reply, and the TTS voice (auto-detect proved unreliable on short utterances). Per-language Fish voices live in `_LANG_VOICE` — French and Turkish use private cloned voices (native speakers), English uses the MCU JARVIS voice. `whisper_service.py` peak-normalizes audio and accepts `?lang=` to force a language. Start it with `WHISPER_MODEL=base` for speed or `small` (default) for accuracy.
+- Camera (`camera.py`): on-demand single-frame webcam vision. The frame lives in the browser, so the backend requests it over the WebSocket (`{"type":"capture_camera"}`) and the frontend (`frontend/src/camera.ts`) captures one JPEG, **releases the camera immediately**, and replies (`{"type":"camera_frame"}`). Privacy by design — never a continuous feed, nothing recorded. Distinct from screen vision (`screen.py`), which is captured server-side.
+- AppleScript for all macOS integrations (no OAuth needed); all user-controlled strings MUST pass through `applescript_escape()` (actions.py) — injection guard, covered by `tests/test_applescript_escape.py`
+- Read-only for Mail (safety by design) — never add write paths to connected services (Mail, Calendar, Notes)
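+- No telemetry/analytics; outbound calls are limited to Anthropic, Fish Audio, and the data sources a feature explicitly fetches from (e.g. `briefing.py` calls Google Directions, Open-Meteo, and crypto news RSS feeds)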
 - SQLite for all local data storage
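
Note on the `applescript_escape()` convention above: a minimal sketch of what such an injection guard typically does, assuming the untrusted text ends up inside an AppleScript double-quoted string literal. `escape_for_applescript` and `build_notification_script` are hypothetical names used only for illustration; the real helper in `actions.py` is not shown in this diff and may behave differently.

```python
def escape_for_applescript(text: str) -> str:
    """Escape text for use inside an AppleScript double-quoted literal.

    Hypothetical illustration, not the repository's applescript_escape().
    """
    # Escape backslashes first so the backslashes added for quotes
    # are not themselves doubled by the second replace.
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_notification_script(message: str) -> str:
    """Build an AppleScript one-liner that displays untrusted text."""
    return f'display notification "{escape_for_applescript(message)}"'


if __name__ == "__main__":
    # Without escaping, the embedded quote would close the literal early
    # and let the rest of the input run as AppleScript.
    print(build_notification_script('done" & (do shell script "id") & "'))
```

The point of the ordering is that every quote in the input stays inside the literal, so user text can never terminate the string and append its own commands.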
diff --git a/briefing.py b/briefing.py
@@ -0,0 +1,252 @@
+"""
+JARVIS Morning Briefing — gather the facts for the post-startup briefing.
+
+Pulls together the data sources that aren't already in server.py:
+  * traffic   — Google Directions API (live, traffic-aware ETA)
+  * weather   — Open-Meteo daily forecast (no key) for clothing advice
+  * portfolio — runs the user's track.py to refresh prices, parses the totals
+
+Mail and calendar reuse the existing server.py helpers; crypto sentiment has a
+faster, concurrent variant here (get_sentiment).
+Each function returns plain facts; server.py composes them into a spoken,
+language-appropriate briefing via the LLM.
+"""
+
+import asyncio
+import json
+import logging
+import os
+import re
+import urllib.parse
+import urllib.request
+from pathlib import Path
+
+log = logging.getLogger("jarvis.briefing")
+
+# Home → office, fixed for the user.
+HOME_ADDRESS = os.getenv("BRIEFING_HOME", "1146G Route des Mermes, 74140 Veigy-Foncenex, France")
+OFFICE_ADDRESS = os.getenv("BRIEFING_OFFICE", "Barclays Bank, 28-20 Chemin Grange-Canal, 1204 Geneva, Switzerland")
+
+# Veigy-Foncenex coordinates for the weather forecast.
+WEATHER_LAT = float(os.getenv("BRIEFING_LAT", "46.2755"))
+WEATHER_LON = float(os.getenv("BRIEFING_LON", "6.2925"))
+
+PORTFOLIO_DIR = Path(os.getenv(
+    "BRIEFING_PORTFOLIO_DIR",
+    str(Path.home() / "Desktop" / "research-balanced-investment-opportunities"),
+))
+
+
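+def _get(url: str, timeout: float = 15.0) -> bytes:
+    """Fetch a URL and return the raw response body."""
+    req = urllib.request.Request(url, headers={"User-Agent": "JARVIS/1.0"})
+    # Close the response deterministically instead of relying on garbage collection.
+    with urllib.request.urlopen(req, timeout=timeout) as resp:
+        return resp.read()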
+
+
+# ---- Traffic -------------------------------------------------------------
+
+async def get_traffic() -> dict:
+    """Live traffic-aware ETA home → office via Google Directions."""
+    key = os.getenv("GOOGLE_MAPS_API_KEY", "").strip()
+    if not key:
+        return {"ok": False, "reason": "no_key"}
+
+    def _call():
+        params = {
+            "origin": HOME_ADDRESS, "destination": OFFICE_ADDRESS,
+            "departure_time": "now", "traffic_model": "best_guess",
+            "mode": "driving", "key": key,
+        }
+        url = "https://maps.googleapis.com/maps/api/directions/json?" + urllib.parse.urlencode(params)
+        return json.loads(_get(url))
+
+    try:
+        data = await asyncio.to_thread(_call)
+    except Exception as e:
+        log.warning(f"traffic fetch failed: {e}")
+        return {"ok": False, "reason": str(e)}
+
+    if data.get("status") != "OK":
+        return {"ok": False, "reason": data.get("error_message") or data.get("status")}
+
+    leg = data["routes"][0]["legs"][0]
+    normal = leg["duration"]["value"] // 60
+    traffic = leg.get("duration_in_traffic", {}).get("value", leg["duration"]["value"]) // 60
+    delay = traffic - normal
+    if delay >= 8:
+        condition = "heavy traffic"
+    elif delay >= 3:
+        condition = "moderate traffic"
+    else:
+        condition = "clear roads"
+    return {
+        "ok": True,
+        "distance": leg["distance"]["text"],
+        "eta_min": traffic,
+        "normal_min": normal,
+        "delay_min": delay,
+        "condition": condition,
+        "route": data["routes"][0].get("summary", ""),
+        "warnings": data["routes"][0].get("warnings", []),
+    }
+
+
+# ---- Weather -------------------------------------------------------------
+
+_WCODE = {
+    0: "clear sky", 1: "mainly clear", 2: "partly cloudy", 3: "overcast",
+    45: "fog", 48: "freezing fog", 51: "light drizzle", 53: "drizzle",
+    55: "heavy drizzle", 61: "light rain", 63: "rain", 65: "heavy rain",
+    71: "light snow", 73: "snow", 75: "heavy snow", 80: "rain showers",
+    81: "rain showers", 82: "violent rain showers", 95: "thunderstorm",
+    96: "thunderstorm with hail", 99: "thunderstorm with heavy hail",
+}
+
+
+async def get_weather() -> dict:
+    """Today's forecast (high/low, conditions, rain chance) for the home area."""
+    def _call():
+        params = {
+            "latitude": WEATHER_LAT, "longitude": WEATHER_LON,
+            "daily": "temperature_2m_max,temperature_2m_min,precipitation_probability_max,weathercode",
+            "current": "temperature_2m,weathercode",
+            "timezone": "auto", "forecast_days": 1,
+        }
+        url = "https://api.open-meteo.com/v1/forecast?" + urllib.parse.urlencode(params)
+        return json.loads(_get(url))
+
+    try:
+        d = await asyncio.to_thread(_call)
+        daily = d["daily"]
+        code = daily["weathercode"][0]
+        return {
+            "ok": True,
+            "high_c": round(daily["temperature_2m_max"][0]),
+            "low_c": round(daily["temperature_2m_min"][0]),
+            "current_c": round(d.get("current", {}).get("temperature_2m", daily["temperature_2m_max"][0])),
+            "rain_chance": daily["precipitation_probability_max"][0],
+            "conditions": _WCODE.get(code, "mixed conditions"),
+        }
+    except Exception as e:
+        log.warning(f"weather fetch failed: {e}")
+        return {"ok": False, "reason": str(e)}
+
+
+# ---- Portfolio -----------------------------------------------------------
+
+async def get_portfolio() -> dict:
+    """Refresh prices via the user's track.py, parse totals + movers."""
+    script = PORTFOLIO_DIR / "track.py"
+    if not script.exists():
+        return {"ok": False, "reason": "no_script"}
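+    proc = None
+    try:
+        proc = await asyncio.create_subprocess_exec(
+            "python3", str(script),
+            cwd=str(PORTFOLIO_DIR),
+            stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE,
+        )
+        out, _ = await asyncio.wait_for(proc.communicate(), timeout=30)
+    except Exception as e:
+        # On timeout, wait_for() only cancels communicate(); kill the child so it doesn't linger.
+        if proc is not None and proc.returncode is None:
+            proc.kill()
+        log.warning(f"portfolio refresh failed: {e!r}")
+        # str(TimeoutError()) is empty, so fall back to the exception class name.
+        return {"ok": False, "reason": str(e) or type(e).__name__}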
+
+    text = out.decode(errors="replace")
+    positions = []
+    total_value = total_gain_pct = None
+    for line in text.splitlines():
+        # e.g. "SPCE  162.25  $7.83  $1,269.90  $267.16  +26.6%"
+        m = re.match(r"\s*([A-Z]{2,6})\s+[\d.]+\s+\$[\d,]+\.\d+\s+\$[\d,\-]+\.\d+\s+\$[\d,\-]+\.\d+\s+([+\-][\d.]+)%", line)
+        if m:
+            positions.append({"ticker": m.group(1), "gain_pct": float(m.group(2))})
+        t = re.search(r"TOTAL\s+\$([\d,]+\.\d+)\s+\$[\d,\-]+\.\d+\s+([+\-][\d.]+)%", line)
+        if t:
+            total_value = t.group(1)
+            total_gain_pct = float(t.group(2))
+
+    movers = sorted(positions, key=lambda p: p["gain_pct"], reverse=True)
+    return {
+        "ok": total_value is not None,
+        "total_value": total_value,
+        "total_gain_pct": total_gain_pct,
+        "best": movers[0] if movers else None,
+        "worst": movers[-1] if movers else None,
+        "dashboard": str(PORTFOLIO_DIR / "dashboard.html"),
+    }
+
+
+# ---- Crypto sentiment (fast, concurrent) ---------------------------------
+
+_POS = ["adoption", "launch", "partnership", "etf", "rally", "breakthrough",
+        "growth", "approval", "bullish", "surge", "adopts", "soar", "gains"]
+_NEG = ["crash", "exploit", "hack", "delay", "liquidation", "depeg", "bearish",
+        "decline", "setback", "breach", "drop", "plunge", "selloff", "lawsuit"]
+_FEEDS = [
+    "https://www.coindesk.com/arc/outboundfeeds/rss/?outputType=xml",
+    "https://cointelegraph.com/rss",
+    "https://cryptopotato.com/feed/",
+    "https://bitcoinist.com/feed/",
+    "https://www.newsbtc.com/feed/",
+    "https://cryptonews.com/news/feed/",
+]
+
+
+def _fetch_feed(url: str) -> list[str]:
+    import xml.etree.ElementTree as ET
+    try:
+        root = ET.fromstring(_get(url, timeout=8))
+        out = []
+        for it in root.findall(".//item"):
+            title = it.findtext("title") or ""
+            desc = it.findtext("description") or ""
+            out.append((title + " " + desc).lower())
+        return out
+    except Exception:
+        return []
+
+
+async def get_sentiment() -> dict:
+    """Crypto news sentiment — fetches all feeds concurrently (~4s vs ~20s)."""
+    results = await asyncio.gather(*[asyncio.to_thread(_fetch_feed, u) for u in _FEEDS])
+    texts = [t for sub in results for t in sub]
+    if not texts:
+        return {"ok": False}
+    total = pos = neg = 0
+    for txt in texts:
+        p = sum(1 for w in _POS if w in txt)
+        n = sum(1 for w in _NEG if w in txt)
+        total += 1 if p > n else -1 if n > p else 0
+    count = len(texts)
+    score = total / count if count else 0.0
+    mood = "bullish" if score > 0.1 else "bearish" if score < -0.1 else "neutral"
+    return {"ok": True, "score": round(score, 2), "mood": mood, "articles": count}
+
+
+async def open_dashboard_window() -> None:
+    """Open the portfolio dashboard in a small Chrome app window."""
+    dash = PORTFOLIO_DIR / "dashboard.html"
+    if not dash.exists():
+        return
+    url = dash.resolve().as_uri()  # percent-encodes spaces and quotes in the path
+    # Wide enough for the 8-column table + long position names, tall enough for
+    # all rows + totals + footer. Clamped to the main screen so it never exceeds it.
+    script = f'''
+tell application "Finder" to set sb to bounds of window of desktop
+set screenW to item 3 of sb
+set screenH to item 4 of sb
+set winW to 1040
+set winH to 760
+if winW > (screenW - 40) then set winW to (screenW - 40)
+if winH > (screenH - 80) then set winH to (screenH - 80)
+set x1 to 40
+set y1 to 60
+tell application "Google Chrome"
+    make new window
+    set URL of active tab of front window to "{url}"
+    set bounds of front window to {{x1, y1, x1 + winW, y1 + winH}}
+end tell
+'''
+    try:
+        await asyncio.create_subprocess_exec(
+            "osascript", "-e", script,
+            stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE,
+        )
+    except Exception as e:
+        log.warning(f"open dashboard window failed: {e}")
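
The `briefing.py` docstring says `server.py` composes the facts returned by these functions into the spoken briefing. Below is a minimal sketch of how that gathering step could look, assuming the fetchers are run concurrently with `asyncio.gather`. `fake_traffic`, `fake_weather`, and `gather_briefing_facts` are hypothetical stand-ins with made-up values; they are not code from this PR, and the real call site in `server.py` is not shown here.

```python
import asyncio


async def fake_traffic() -> dict:
    # Stand-in for briefing.get_traffic(); values are invented for the demo.
    return {"ok": True, "eta_min": 24, "condition": "clear roads"}


async def fake_weather() -> dict:
    # Stand-in that fails, to show that one bad source need not sink the rest.
    raise RuntimeError("network down")


async def gather_briefing_facts() -> dict[str, dict]:
    """Run all fetchers concurrently and normalise failures to {"ok": False}."""
    sources = {"traffic": fake_traffic(), "weather": fake_weather()}
    # return_exceptions=True hands back exceptions as results instead of
    # raising the first one out of gather().
    results = await asyncio.gather(*sources.values(), return_exceptions=True)
    facts: dict[str, dict] = {}
    for name, result in zip(sources, results):
        if isinstance(result, Exception):
            facts[name] = {"ok": False, "reason": str(result)}
        else:
            facts[name] = result
    return facts


if __name__ == "__main__":
    print(asyncio.run(gather_briefing_facts()))
```

The real fetchers already catch their own errors and return `{"ok": False, ...}`, so `return_exceptions=True` is only a second safety net in this sketch; the consumer can then skip any section whose `ok` flag is false.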