Merged
55 changes: 55 additions & 0 deletions docs/architecture.mmd
@@ -0,0 +1,55 @@
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#4285F4', 'primaryTextColor': '#fff', 'primaryBorderColor': '#3367D6', 'lineColor': '#5F6368', 'secondaryColor': '#34A853', 'tertiaryColor': '#FBBC04'}}}%%
flowchart LR
subgraph Browser["Browser — Next.js 15 PWA"]
direction TB
UI["Voice UI\n+ Scene Display\n(Pure Renderer)"]
end

subgraph Backend["Go Backend — Cloud Run"]
direction TB
WS["WebSocket\nHandler"]
SM["Session\nManager"]
LP["Live API\nProxy"]
TE["Tool Executor"]
OP["Onboarding\nPipeline"]

WS --- SM
SM --- LP
LP --- TE
OP -.->|persona| SM
Comment on lines +16 to +19

medium

The connections within the Backend subgraph could be more accurate to better reflect the component interactions seen in the code. The current chained representation WS --- SM --- LP --- TE is a bit misleading.

Based on internal/handler/websocket.go, the WebSocket Handler (WS) creates and orchestrates the other main components. Also, the Onboarding Pipeline (OP) is invoked by the Tool Executor (TE) via a tool call, so a link between them would improve clarity.

Consider restructuring this block to show these relationships more clearly, for example:

        WS --> SM
        WS --> LP
        LP --> TE
        TE --> OP
        OP -.->|persona| SM
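
To make the suggested edges concrete, here is a minimal Go sketch of the ownership they imply. Every type, method, and the `start_onboarding` tool name is hypothetical and chosen only for illustration; none of it is taken from `internal/handler/websocket.go`, so treat it as one possible reading of the diagram rather than the actual wiring.

```go
package main

import "fmt"

// All types below are hypothetical stand-ins for the diagram nodes.

// SessionManager (SM) holds per-session state such as the persona.
type SessionManager struct{ Persona string }

// OnboardingPipeline (OP) produces a persona and hands it to SM
// (the dotted "persona" edge, OP -.-> SM).
type OnboardingPipeline struct{ sessions *SessionManager }

func (op *OnboardingPipeline) Run(source string) {
	op.sessions.Persona = "persona from " + source
}

// ToolExecutor (TE) dispatches tool calls; one of them starts onboarding (TE --> OP).
type ToolExecutor struct{ onboarding *OnboardingPipeline }

func (te *ToolExecutor) Execute(tool, arg string) error {
	switch tool {
	case "start_onboarding": // assumed tool name, not from the source
		te.onboarding.Run(arg)
		return nil
	default:
		return fmt.Errorf("unknown tool %q", tool)
	}
}

// LiveProxy (LP) forwards model tool calls to TE (LP --> TE).
type LiveProxy struct{ tools *ToolExecutor }

func (lp *LiveProxy) OnToolCall(tool, arg string) error {
	return lp.tools.Execute(tool, arg)
}

// Handler (WS) creates and owns SM and LP (WS --> SM, WS --> LP).
type Handler struct {
	Sessions *SessionManager
	Proxy    *LiveProxy
}

func NewHandler() *Handler {
	sm := &SessionManager{}
	op := &OnboardingPipeline{sessions: sm}
	te := &ToolExecutor{onboarding: op}
	return &Handler{Sessions: sm, Proxy: &LiveProxy{tools: te}}
}

func main() {
	h := NewHandler()
	if err := h.Proxy.OnToolCall("start_onboarding", "video-123"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(h.Sessions.Persona)
}
```

The point of the sketch is that the handler is the root that constructs the other components, which is why directed edges out of WS describe the relationship better than an undirected chain.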

end

subgraph Gemini["Gemini API — 4 Models"]
direction TB
LIVE["Live API\nNative Audio"]
PRO["2.5 Pro\nVideo Analysis"]
FLASH["2.5 Flash\nImage (1-3s)"]
IMAGEN["Imagen 4\nHD (8-12s)"]
end

subgraph Cloud["Google Cloud Services"]
direction TB
FS[("Firestore\nSessions\nMemories")]
CS[("Storage\nBGM + Assets")]
YT["YouTube\nData API"]
end

UI <-->|"WebSocket\nPCM Audio + Events"| WS
LP <-->|"Bidi\nStreaming"| LIVE
TE -->|generate_scene| FLASH
TE -->|generate_scene| IMAGEN
Comment on lines +39 to +40

medium

To improve clarity on the two-stage progressive image generation, consider making the labels for the generate_scene tool more specific. The current diagram shows two identical arrows, which doesn't fully convey that one is a quick preview and the other is the final, high-quality image.

Updating the labels will make the progressive nature of this feature more apparent.

    TE -->|"generate_scene (preview)"| FLASH
    TE -->|"generate_scene (final)"| IMAGEN
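
For reference, the two-stage flow these labels describe could look roughly like the Go sketch below. It assumes both requests are started concurrently and each result is emitted as soon as it is ready; the `SceneImage` and `Generator` types and the `GenerateScene` function are hypothetical, and the real tool may sequence the calls differently.

```go
package main

import (
	"fmt"
	"sync"
)

// SceneImage is a hypothetical result type; Stage is "preview" or "final".
type SceneImage struct {
	Stage string
	Model string
}

// Generator is a hypothetical stand-in for a single model call.
type Generator func(prompt string) (SceneImage, error)

// GenerateScene starts both generators concurrently and sends each result
// on the returned channel as soon as it is ready, so a fast preview can be
// shown while the slower final image is still rendering. Failed calls are
// skipped; the channel is closed once both calls have finished.
func GenerateScene(prompt string, preview, final Generator) <-chan SceneImage {
	out := make(chan SceneImage, 2) // buffered so senders never block
	var wg sync.WaitGroup
	for _, gen := range []Generator{preview, final} {
		wg.Add(1)
		go func(g Generator) {
			defer wg.Done()
			if img, err := g(prompt); err == nil {
				out <- img
			}
		}(gen)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	// Fake generators stand in for the real model calls.
	preview := func(string) (SceneImage, error) {
		return SceneImage{Stage: "preview", Model: "flash"}, nil
	}
	final := func(string) (SceneImage, error) {
		return SceneImage{Stage: "final", Model: "imagen"}, nil
	}
	for img := range GenerateScene("a quiet beach at dusk", preview, final) {
		fmt.Printf("%s image from %s\n", img.Stage, img.Model)
	}
}
```

With real model latencies the preview would normally arrive first, but the sketch does not enforce an order; a caller that needs strict ordering would have to check `Stage` before replacing the displayed image.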

TE -->|recall_memory| FS
TE -->|change_atmosphere| CS
OP -->|analyze video| PRO
OP -->|fetch metadata| YT
TE -->|end_reunion| CS

classDef browser fill:#E8F0FE,stroke:#4285F4,color:#1A73E8
classDef backend fill:#E6F4EA,stroke:#34A853,color:#137333
classDef gemini fill:#FCE8E6,stroke:#EA4335,color:#C5221F
classDef cloud fill:#F3E8FD,stroke:#A142F4,color:#7627BB

class Browser browser
class Backend backend
class Gemini gemini
class Cloud cloud
Binary file added docs/architecture.png