9 changes: 9 additions & 0 deletions .gitignore
@@ -1,11 +1,16 @@
# Environment
.env
.env.local
.env.save

# Runtime logs / pids (start_jarvis.sh)
.run/

# Dependencies
node_modules/
.venv/
venv/
whisper-venv/

# Python
__pycache__/
@@ -39,3 +44,7 @@ desktop-overlay/node_modules/
context/
*.db-shm
*.db-wal

# Gmail OAuth secrets
gmail_credentials.json
gmail_token.json
53 changes: 48 additions & 5 deletions CLAUDE.md
@@ -1,3 +1,7 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

# JARVIS — Voice AI Assistant

## Overview
@@ -16,14 +20,47 @@ When a user clones this repo and starts Claude Code, help them:
9. Open Chrome to http://localhost:5173
10. Click to enable audio, speak to JARVIS

## Commands
Run the app in two terminals: `python server.py` (backend, secure WebSocket — needs `cert.pem`/`key.pem`) and `cd frontend && npm run dev` (frontend on http://localhost:5173, must be Chrome for the Web Speech API).

Frontend build/typecheck: `cd frontend && npm run build` (runs `tsc` then `vite build`).

Tests live in `tests/` in two styles, and most call the real Anthropic API, so `ANTHROPIC_API_KEY` must be set (tests self-load `.env`):
- pytest suites: `pytest tests/`; single test by name: `pytest tests/test_e2e_pipeline.py -k <name>`
- standalone scripts (have `__main__`): `python3 tests/test_classifier.py`
- `pytest`/`pytest-asyncio` are NOT in `requirements.txt` — install them separately to run the pytest suites.

Live quality monitor (run alongside the server): `python monitor.py` tails server logs and flags low-quality conversations.

## Architecture
- **Backend**: FastAPI + Python (server.py, ~2300 lines)
- **Backend**: FastAPI + Python (server.py, ~2700 lines)
- **Frontend**: Vite + TypeScript + Three.js (audio-reactive orb)
- **Communication**: WebSocket (JSON messages + binary audio)
- **AI**: Claude Haiku for fast responses, Claude Opus for research
- **TTS**: Fish Audio with JARVIS voice model
- **System**: AppleScript for Calendar, Mail, Notes, Terminal integration

### Request pipeline
`server.py` is an intentional ~2700-line monolith (see CONTRIBUTING.md) and is the orchestrator; the `/ws/voice` handler is the core loop:
1. Frontend captures mic audio (`audio_capture.ts`, MediaRecorder + VAD) and streams each utterance as binary over the WebSocket. The backend transcribes it via a local Whisper service (`whisper_service.py`, runs in `whisper-venv` / Python 3.12 on :8765) which also returns the language. A legacy browser-Web-Speech transcript path still exists as a fallback.
2. `classify_intent()` calls Haiku (`claude-haiku-4-5-20251001`) to pick an intent and emit an `[ACTION:*]` tag.
3. `execute_action` (actions.py) routes the tag to a system integration or a Claude Code spawn.
4. Reply text → Fish Audio TTS → streamed back as binary audio while the orb reacts.

Heavier paths use bigger models: deep research uses Opus (`claude-opus-4-6`) to write an HTML report, open it in the browser, and speak a Haiku summary; rolling session summaries run on Haiku in the background. Adding a capability usually means a new action tag + a classifier prompt update + a handler.
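
A minimal sketch of steps 2 and 3 above: pull the `[ACTION:*]` tag out of a classifier reply and route it to a handler. The names `parse_action`, `route`, and the `handlers` mapping are hypothetical and are not taken from `server.py` or `actions.py`; the tag grammar (uppercase name, free text after the tag) is also an assumption.

```python
import re

# Assumed tag grammar: "[ACTION:NAME]" followed by optional free-text argument.
ACTION_TAG = re.compile(r"\[ACTION:([A-Z_]+)\]\s*(.*)", re.DOTALL)


def parse_action(reply: str):
    """Return (action, argument); action is None when the reply has no tag."""
    m = ACTION_TAG.search(reply)
    if not m:
        return None, reply.strip()
    # Text before the tag is dropped; only the text after it is the argument.
    return m.group(1), m.group(2).strip()


def route(reply: str, handlers: dict):
    """Dispatch to the handler registered for the tag, else treat as plain speech."""
    action, arg = parse_action(reply)
    handler = handlers.get(action)
    if handler is None:
        return f"speak: {arg}"
    return handler(arg)
```

With this shape, adding a capability means registering one more key in `handlers`, which mirrors the "new action tag + classifier prompt update + handler" rule stated above.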

### Two ways to spawn Claude Code
- **Build dispatch** (`actions.py` + `dispatch_registry.py`): one-shot `claude -p` builds; `dispatch_registry` persists what's building / just-finished so JARVIS knows what "it" refers to.
- **Work mode** (`work_mode.py`): persistent sessions tied to a project dir, resumed with `--continue`. `planner.py` runs a conversational plan→clarify→confirm flow before spawning.

### Self-improvement loop
A feedback system tunes the prompts sent to Claude Code (only makes sense read together): `templates.py` (prompt templates by task type) → `ab_testing.py` (assigns template versions) → `qa.py` (spawns `claude -p` to verify output, auto-retries) → `tracking.py` (success rates) → `evolution.py` (analyzes failures, generates improved template versions) → `learning.py` (request patterns / context pre-loading) → `suggestions.py` (one heuristic follow-up per task). `conversation.py` holds multi-turn planning context.

### Storage — two separate SQLite DBs
These are NOT shared; confirm which one a module uses before touching persistence:
- `data/jarvis.db` — `memory.py` (FTS5 full-text memory) and `dispatch_registry.py`
- `jarvis_data.db` (repo root) — `tracking.py`, `learning.py`, `ab_testing.py`, `evolution.py`

## Key Files
- `server.py` — Main server, WebSocket handler, LLM integration, action system
- `frontend/src/orb.ts` — Three.js particle orb visualization
@@ -42,12 +79,18 @@ When a user clones this repo and starts Claude Code, help them:
- `FISH_API_KEY` (required) — Fish Audio TTS
- `FISH_VOICE_ID` (optional) — Voice model ID
- `USER_NAME` (optional) — Your name for JARVIS to use
- `CALENDAR_ACCOUNTS` (optional) — Comma-separated calendar emails
- `CALENDAR_ACCOUNTS` (optional) — Comma-separated calendar emails (empty = auto-discover all)
- `JARVIS_SKIP_PERMISSIONS` (optional) — Defaults to `true`; the voice loop can't answer interactive `claude` permission prompts (they'd hang the subprocess). Set `false` only when running in a visible Terminal.
- Weather overrides (optional): `WEATHER_LOCATION_LABEL`, `WEATHER_LATITUDE`, `WEATHER_LONGITUDE`, `WEATHER_UNIT` — defaults to public-IP geolocation, Fahrenheit.
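
A small sketch of how a boolean flag such as `JARVIS_SKIP_PERMISSIONS` could be read with a default of `true`. The helper `env_flag` and the set of accepted truthy spellings are assumptions for illustration; the repository may parse the variable differently.

```python
import os


def env_flag(name: str, default: bool, environ=os.environ) -> bool:
    """Read a boolean environment variable, falling back to `default` when unset or blank."""
    raw = environ.get(name)
    if raw is None or not raw.strip():
        return default
    # Assumed truthy spellings; anything else counts as False.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Example: unset means the default (True) applies.
skip_permissions = env_flag("JARVIS_SKIP_PERMISSIONS", True)
```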

## Conventions
- JARVIS personality: British butler, dry wit, economy of language
- Max 1-2 sentences per voice response
- Action tags: [ACTION:BUILD], [ACTION:BROWSE], [ACTION:RESEARCH], etc.
- AppleScript for all macOS integrations (no OAuth needed)
- Read-only for Mail (safety by design)
- Action tags: [ACTION:BUILD], [ACTION:BROWSE], [ACTION:RESEARCH], [ACTION:SCREEN], [ACTION:CAMERA], [ACTION:SENTIMENT], etc.
- Market sentiment ([ACTION:SENTIMENT] / `_do_sentiment_lookup`): runs the external kukapay `market-sentiment` skill analyzer as a subprocess and speaks a one-line mood score. The script lives outside the repo at `~/bybit-mcp/.agents/skills/market-sentiment/scripts/sentiment_analyzer.py` and needs `requests`, so it's invoked with `SENTIMENT_PYTHON` (defaults to the bybit-mcp venv). Override both via `SENTIMENT_PYTHON` / `SENTIMENT_SCRIPT` env vars. News-based only — never present as trading advice.
- Multilingual voice (English/French/Turkish): a top-left EN/FR/TR toggle sends `{type:"set_lang"}`; the chosen language is FORCED for Whisper transcription, the LLM reply, and the TTS voice (auto-detect proved unreliable on short utterances). Per-language Fish voices live in `_LANG_VOICE` — French and Turkish use private cloned voices (native speakers), English uses the MCU JARVIS voice. `whisper_service.py` peak-normalizes audio and accepts `?lang=` to force a language. Start it with `WHISPER_MODEL=base` for speed or `small` (default) for accuracy.
- Camera (`camera.py`): on-demand single-frame webcam vision. The frame lives in the browser, so the backend requests it over the WebSocket (`{"type":"capture_camera"}`) and the frontend (`frontend/src/camera.ts`) captures one JPEG, **releases the camera immediately**, and replies (`{"type":"camera_frame"}`). Privacy by design — never a continuous feed, nothing recorded. Distinct from screen vision (`screen.py`), which is captured server-side.
- AppleScript for all macOS integrations (no OAuth needed); all user-controlled strings MUST pass through `applescript_escape()` (actions.py) — injection guard, covered by `tests/test_applescript_escape.py`
- Read-only for Mail (safety by design) — never add write paths to connected services (Mail, Calendar, Notes)
- No telemetry/analytics; no external services beyond Anthropic and Fish Audio
- SQLite for all local data storage
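
To illustrate the AppleScript injection guard named in the conventions above, here is a hedged sketch of the usual escaping technique: double every backslash first, then escape double quotes, before interpolating a value into a quoted AppleScript string. `escape_for_applescript` and `build_notification_script` are hypothetical names; this is not the repository's `applescript_escape()` implementation, which may differ.

```python
def escape_for_applescript(value: str) -> str:
    """Escape a string for use inside a double-quoted AppleScript literal."""
    # Backslashes must be doubled before quotes are escaped, otherwise the
    # backslash added in front of a quote would itself be doubled.
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_notification_script(message: str) -> str:
    """Build a one-line AppleScript that shows `message` as a notification."""
    safe = escape_for_applescript(message)
    return f'display notification "{safe}"'
```

Without the escaping step, a value containing `"` could close the string literal early and append arbitrary AppleScript to the command.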
252 changes: 252 additions & 0 deletions briefing.py
@@ -0,0 +1,252 @@
"""
JARVIS Morning Briefing — gather the facts for the post-startup briefing.

Pulls together the data sources that aren't already in server.py:
* traffic — Google Directions API (live, traffic-aware ETA)
* weather — Open-Meteo daily forecast (no key) for clothing advice
* portfolio — runs the user's track.py to refresh prices, parses the totals

Mail, calendar and crypto-sentiment reuse the existing server.py helpers.
Each function returns plain facts; server.py composes them into a spoken,
language-appropriate briefing via the LLM.
"""

import asyncio
import json
import logging
import os
import re
import urllib.parse
import urllib.request
from pathlib import Path

log = logging.getLogger("jarvis.briefing")

# Home → office, fixed for the user.
HOME_ADDRESS = os.getenv("BRIEFING_HOME", "1146G Route des Mermes, 74140 Veigy-Foncenex, France")
OFFICE_ADDRESS = os.getenv("BRIEFING_OFFICE", "Barclays Bank, 28-20 Chemin Grange-Canal, 1204 Geneva, Switzerland")

# Veigy-Foncenex coordinates for the weather forecast.
WEATHER_LAT = float(os.getenv("BRIEFING_LAT", "46.2755"))
WEATHER_LON = float(os.getenv("BRIEFING_LON", "6.2925"))

PORTFOLIO_DIR = Path(os.getenv(
    "BRIEFING_PORTFOLIO_DIR",
    str(Path.home() / "Desktop" / "research-balanced-investment-opportunities"),
))


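def _get(url: str, timeout: float = 15.0) -> bytes:
    """Fetch a URL and return the raw response body."""
    req = urllib.request.Request(url, headers={"User-Agent": "JARVIS/1.0"})
    # Close the response explicitly rather than leaving the socket to the GC.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()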


# ---- Traffic -------------------------------------------------------------

async def get_traffic() -> dict:
    """Live traffic-aware ETA home → office via Google Directions."""
    key = os.getenv("GOOGLE_MAPS_API_KEY", "").strip()
    if not key:
        return {"ok": False, "reason": "no_key"}

    def _call():
        params = {
            "origin": HOME_ADDRESS, "destination": OFFICE_ADDRESS,
            "departure_time": "now", "traffic_model": "best_guess",
            "mode": "driving", "key": key,
        }
        url = "https://maps.googleapis.com/maps/api/directions/json?" + urllib.parse.urlencode(params)
        return json.loads(_get(url))

    try:
        data = await asyncio.to_thread(_call)
    except Exception as e:
        log.warning(f"traffic fetch failed: {e}")
        return {"ok": False, "reason": str(e)}

    if data.get("status") != "OK":
        return {"ok": False, "reason": data.get("error_message") or data.get("status")}

    leg = data["routes"][0]["legs"][0]
    normal = leg["duration"]["value"] // 60
    traffic = leg.get("duration_in_traffic", {}).get("value", leg["duration"]["value"]) // 60
    delay = traffic - normal
    if delay >= 8:
        condition = "heavy traffic"
    elif delay >= 3:
        condition = "moderate traffic"
    else:
        condition = "clear roads"
    return {
        "ok": True,
        "distance": leg["distance"]["text"],
        "eta_min": traffic,
        "normal_min": normal,
        "delay_min": delay,
        "condition": condition,
        "route": data["routes"][0].get("summary", ""),
        "warnings": data["routes"][0].get("warnings", []),
    }


# ---- Weather -------------------------------------------------------------

_WCODE = {
    0: "clear sky", 1: "mainly clear", 2: "partly cloudy", 3: "overcast",
    45: "fog", 48: "freezing fog", 51: "light drizzle", 53: "drizzle",
    55: "heavy drizzle", 61: "light rain", 63: "rain", 65: "heavy rain",
    71: "light snow", 73: "snow", 75: "heavy snow", 80: "rain showers",
    81: "rain showers", 82: "violent rain showers", 95: "thunderstorm",
    96: "thunderstorm with hail", 99: "thunderstorm with heavy hail",
}


async def get_weather() -> dict:
    """Today's forecast (high/low, conditions, rain chance) for the home area."""
    def _call():
        params = {
            "latitude": WEATHER_LAT, "longitude": WEATHER_LON,
            "daily": "temperature_2m_max,temperature_2m_min,precipitation_probability_max,weathercode",
            "current": "temperature_2m,weathercode",
            "timezone": "auto", "forecast_days": 1,
        }
        url = "https://api.open-meteo.com/v1/forecast?" + urllib.parse.urlencode(params)
        return json.loads(_get(url))

    try:
        d = await asyncio.to_thread(_call)
        daily = d["daily"]
        code = daily["weathercode"][0]
        return {
            "ok": True,
            "high_c": round(daily["temperature_2m_max"][0]),
            "low_c": round(daily["temperature_2m_min"][0]),
            "current_c": round(d.get("current", {}).get("temperature_2m", daily["temperature_2m_max"][0])),
            "rain_chance": daily["precipitation_probability_max"][0],
            "conditions": _WCODE.get(code, "mixed conditions"),
        }
    except Exception as e:
        log.warning(f"weather fetch failed: {e}")
        return {"ok": False, "reason": str(e)}


# ---- Portfolio -----------------------------------------------------------

async def get_portfolio() -> dict:
    """Refresh prices via the user's track.py, parse totals + movers."""
    script = PORTFOLIO_DIR / "track.py"
    if not script.exists():
        return {"ok": False, "reason": "no_script"}
    try:
        proc = await asyncio.create_subprocess_exec(
            "python3", str(script),
            cwd=str(PORTFOLIO_DIR),
            stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE,
        )
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=30)
    except Exception as e:
        log.warning(f"portfolio refresh failed: {e}")
        return {"ok": False, "reason": str(e)}

    text = out.decode(errors="replace")
    positions = []
    total_value = total_gain_pct = None
    for line in text.splitlines():
        # e.g. "SPCE 162.25 $7.83 $1,269.90 $267.16 +26.6%"
        m = re.match(r"\s*([A-Z]{2,6})\s+[\d.]+\s+\$[\d,]+\.\d+\s+\$[\d,\-]+\.\d+\s+\$[\d,\-]+\.\d+\s+([+\-][\d.]+)%", line)
        if m:
            positions.append({"ticker": m.group(1), "gain_pct": float(m.group(2))})
        t = re.search(r"TOTAL\s+\$([\d,]+\.\d+)\s+\$[\d,\-]+\.\d+\s+([+\-][\d.]+)%", line)
        if t:
            total_value = t.group(1)
            total_gain_pct = float(t.group(2))

    movers = sorted(positions, key=lambda p: p["gain_pct"], reverse=True)
    return {
        "ok": total_value is not None,
        "total_value": total_value,
        "total_gain_pct": total_gain_pct,
        "best": movers[0] if movers else None,
        "worst": movers[-1] if movers else None,
        "dashboard": str(PORTFOLIO_DIR / "dashboard.html"),
    }


# ---- Crypto sentiment (fast, concurrent) ---------------------------------

_POS = ["adoption", "launch", "partnership", "etf", "rally", "breakthrough",
        "growth", "approval", "bullish", "surge", "adopts", "soar", "gains"]
_NEG = ["crash", "exploit", "hack", "delay", "liquidation", "depeg", "bearish",
        "decline", "setback", "breach", "drop", "plunge", "selloff", "lawsuit"]
_FEEDS = [
    "https://www.coindesk.com/arc/outboundfeeds/rss/?outputType=xml",
    "https://cointelegraph.com/rss",
    "https://cryptopotato.com/feed/",
    "https://bitcoinist.com/feed/",
    "https://www.newsbtc.com/feed/",
    "https://cryptonews.com/news/feed/",
]


def _fetch_feed(url: str) -> list[str]:
    import xml.etree.ElementTree as ET
    try:
        root = ET.fromstring(_get(url, timeout=8))
        out = []
        for it in root.findall(".//item"):
            title = it.findtext("title") or ""
            desc = it.findtext("description") or ""
            out.append((title + " " + desc).lower())
        return out
    except Exception:
        return []


async def get_sentiment() -> dict:
    """Crypto news sentiment — fetches all feeds concurrently (~4s vs ~20s)."""
    results = await asyncio.gather(*[asyncio.to_thread(_fetch_feed, u) for u in _FEEDS])
    texts = [t for sub in results for t in sub]
    if not texts:
        return {"ok": False}
    total = pos = neg = 0
    for txt in texts:
        p = sum(1 for w in _POS if w in txt)
        n = sum(1 for w in _NEG if w in txt)
        total += 1 if p > n else -1 if n > p else 0
    count = len(texts)
    score = total / count if count else 0.0
    mood = "bullish" if score > 0.1 else "bearish" if score < -0.1 else "neutral"
    return {"ok": True, "score": round(score, 2), "mood": mood, "articles": count}


async def open_dashboard_window() -> None:
    """Open the portfolio dashboard in a small Chrome app window."""
    dash = PORTFOLIO_DIR / "dashboard.html"
    if not dash.exists():
        return
    url = f"file://{dash}"
    # Wide enough for the 8-column table + long position names, tall enough for
    # all rows + totals + footer. Clamped to the main screen so it never exceeds it.
    script = f'''
    tell application "Finder" to set sb to bounds of window of desktop
    set screenW to item 3 of sb
    set screenH to item 4 of sb
    set winW to 1040
    set winH to 760
    if winW > (screenW - 40) then set winW to (screenW - 40)
    if winH > (screenH - 80) then set winH to (screenH - 80)
    set x1 to 40
    set y1 to 60
    tell application "Google Chrome"
        make new window
        set URL of active tab of front window to "{url}"
        set bounds of front window to {{x1, y1, x1 + winW, y1 + winH}}
    end tell
    '''
    try:
        await asyncio.create_subprocess_exec(
            "osascript", "-e", script,
            stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE,
        )
    except Exception as e:
        log.warning(f"open dashboard window failed: {e}")
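
The module docstring says server.py composes the facts returned here into a spoken briefing. A minimal sketch of how a caller might collect them concurrently, so that one failing source does not block the others, is given below. `gather_briefing`, `fake_weather`, and `fake_traffic` are hypothetical names used only for illustration; the real composition code in server.py is not shown in this diff and may look different.

```python
import asyncio


async def gather_briefing(sources: dict) -> dict:
    """Run every fact source concurrently; map a raised exception to an ok=False record."""
    names = list(sources)
    results = await asyncio.gather(
        *(sources[name]() for name in names),
        return_exceptions=True,  # keep going when a single source fails
    )
    facts = {}
    for name, res in zip(names, results):
        if isinstance(res, Exception):
            facts[name] = {"ok": False, "reason": str(res)}
        else:
            facts[name] = res
    return facts


# Stand-ins for get_weather() / get_traffic(), so the sketch runs without network access.
async def fake_weather() -> dict:
    return {"ok": True, "high_c": 21}


async def fake_traffic() -> dict:
    raise RuntimeError("no network")
```

In real use the mapping would hold the module's own coroutines, for example `{"traffic": get_traffic, "weather": get_weather, "portfolio": get_portfolio, "sentiment": get_sentiment}`, and the resulting dict would be handed to the LLM prompt.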