jeziellopes/idempo

# ⚔️ 𝔦𝔡𝔢𝔪𝔭𝔬

A production-grade distributed systems reference built as a real-time tactical arena game — where the idempotency token is the game's core resource.


> 🚧 Active development — Layer 0 + Iteration 1 complete with E2E coverage. Iteration 2 in progress.


## Overview

Players join live arena matches, fight to collect resources, then trade in an async economy between rounds. The game's core resource — the idempo Stamp — is the game-layer representation of an idempotency key: spending a Stamp seals an action, guaranteeing exactly-once resolution even under network retries. Under the hood, every interaction exercises a production-grade distributed systems pattern — from idempotent command handling to distributed Saga compensation.

This project exists to demonstrate — concretely and runnably — what top-tier distributed systems engineering looks like.



## Architecture

Active development — architecture stable, features expanding per ROADMAP.md.

```mermaid
graph TB
    UI["🖥️ Next.js UI<br/>WebSocket + REST"]
    GW["🧠 API Gateway<br/>Auth · Rate Limit · Correlation ID"]

    subgraph Core Services
        GS["🎮 Game Service"]
        CS["⚔️ Combat Service"]
        RS["🎁 Reward Service"]
        WS["💰 Wallet Service"]
        IS["📦 Inventory Service"]
        MS["🏪 Marketplace Service"]
        LS["🏆 Leaderboard Service"]
        NS["📢 Notification Service"]
    end

    subgraph Kafka["📨 Apache Kafka"]
        T1["player-actions"]
        T2["match-events"]
        T3["economy-events"]
        T4["leaderboard-events"]
        DLQ["*.dlq  (Dead Letter)"]
    end

    subgraph Storage
        PG1[("game_db")]
        PG2[("wallet_db")]
        PG3[("marketplace_db")]
        PG4[("inventory_db")]
        RD[("Redis<br/>Top 100")]
    end

    UI -->|WS / REST| GW
    GW --> GS
    GW --> MS
    GW --> LS

    GS --> CS
    GS --> T1
    GS --> T2
    CS --> T2
    RS --> T3
    T2 --> RS

    MS <-->|Saga commands| WS
    MS <-->|Saga commands| IS
    MS --> T3

    T3 --> LS
    T3 --> NS
    T4 --> LS

    GS --- PG1
    WS --- PG2
    MS --- PG3
    IS --- PG4
    LS --- RD
```

### Game Loop

```mermaid
flowchart LR
    A([Player joins]) --> B["Live Arena Match<br/>2–6 players · 3–5 min"]
    B --> C{Match ends}
    C --> D["MatchFinishedEvent<br/>emitted to Kafka"]
    D --> E["Reward Service<br/>grants currency · items · Stamps"]
    E --> F["Economy Phase<br/>Sell · Buy · Trade · Craft"]
    F --> G["Leaderboard<br/>updated"]
    G --> A
```
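Ordering matters at the `MatchFinishedEvent` step: the patterns table notes that Kafka topics are keyed by `playerId`, so all events for one player land on the same partition and are consumed in publish order. A minimal sketch of key-based partition selection — the hash function and partition count here are illustrative, not the project's actual producer configuration:

```typescript
// Key-based partition selection: the same key always maps to the same
// partition, which is what preserves per-player event ordering.
function hashKey(key: string): number {
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) | 0; // simple 32-bit rolling hash
  }
  return h >>> 0; // force unsigned
}

function partitionFor(playerId: string, numPartitions: number): number {
  return hashKey(playerId) % numPartitions;
}

// Every event for the same player routes to one partition:
const p1 = partitionFor("player-42", 12);
const p2 = partitionFor("player-42", 12);
// p1 === p2 — a consumer sees this player's events in publish order
```

Events for *different* players may interleave freely across partitions; only the per-key order is guaranteed.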

## Services

| Service | Responsibility | DB | Emits |
|---|---|---|---|
| API Gateway | Auth, rate limiting, correlation ID | — | — |
| Game Service | Match lifecycle, action validation | `game_db` | `match-events` |
| Combat Service | Damage calc, death logic | stateless | `match-events` |
| Reward Service | Post-match reward grants | — | `economy-events` |
| Wallet Service | Currency debit/credit with strong consistency | `wallet_db` | `economy-events` |
| Inventory Service | Item ownership and trade locks | `inventory_db` | `economy-events` |
| Marketplace Service | Listings + Saga orchestrator | `marketplace_db` | `economy-events` |
| Leaderboard Service | CQRS read projection | `leaderboard_db` + Redis | — |
| Notification Service | Async push / email | stateless | — |
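The Wallet Service's "strong consistency" rests on optimistic locking (see the patterns table): each balance row carries a version, and an update succeeds only if the version read is still current. A pure in-memory sketch of the idea — the real service would express this in SQL as `UPDATE … SET version = version + 1 WHERE id = … AND version = …`:

```typescript
interface WalletRow { balance: number; version: number; }

// In-memory stand-in for the wallet_db balance table.
const wallets = new Map<string, WalletRow>();
wallets.set("buyer-1", { balance: 100, version: 1 });

// Succeeds only if nobody else updated the row since we read it.
function debitIfCurrent(id: string, amount: number, expectedVersion: number): boolean {
  const row = wallets.get(id);
  if (!row || row.version !== expectedVersion || row.balance < amount) {
    return false; // stale read or insufficient funds — caller must re-read and retry
  }
  row.balance -= amount;
  row.version += 1; // bump version so concurrent writers fail their check
  return true;
}

// A debit with the freshly-read version succeeds…
debitIfCurrent("buyer-1", 30, 1); // → true
// …a second writer still holding version 1 loses the race:
debitIfCurrent("buyer-1", 30, 1); // → false
```

The losing writer re-reads the row and retries, so no lock is ever held across the request.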

## Patterns Demonstrated

| Pattern | Where |
|---|---|
| Idempotent HTTP commands | API Gateway → Game Service (`X-Idempotency-Key`) |
| idempo Stamp (game mechanic) | Player spends a Stamp → `stampId` becomes `action_id` in `player_actions`; duplicate requests return the original response |
| Idempotent event consumers | All Kafka consumers (`processed_events` table) |
| Distributed Saga (orchestration) | Marketplace trade flow |
| Saga compensation | Trade rollback on any step failure |
| Circuit breaker | Marketplace → Wallet / Inventory (opossum) |
| Retry + exponential backoff + jitter | All inter-service HTTP calls |
| Dead Letter Queue | Failed Kafka messages after 3 retries |
| CQRS | Leaderboard write model vs Redis read projection |
| Optimistic locking | Wallet balance updates |
| Event sourcing (append-only ledger) | Wallet transactions table |
| Partition-based ordering | Kafka keyed by `playerId` |
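The retry policy above — exponential backoff plus jitter — can be sketched without any library. This is an illustrative stand-alone version, not the project's actual axios-retry configuration:

```typescript
// Exponential backoff with "full jitter": the delay is drawn uniformly from
// [0, base * 2^attempt], capped, which de-correlates retry storms across clients.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 10_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}

async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // Sleep before the next attempt, but not after the final failure
        await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
      }
    }
  }
  throw lastError; // exhausted — a real caller might surface this or route it to a DLQ
}
```

With a 100 ms base the jittered ceiling grows 100 → 200 → 400 ms per attempt; the Kafka consumer retries before DLQ routing follow the same shape.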

## Saga: Trade Flow

```mermaid
sequenceDiagram
    actor Buyer
    participant MP as Marketplace Service
    participant WL as Wallet Service
    participant IN as Inventory Service

    Buyer->>MP: POST /trade
    MP->>MP: INSERT saga_log (INITIATED)

    MP->>WL: ReserveFundsCommand
    WL-->>MP: FundsReservedEvent
    MP->>MP: saga_log → ITEM_LOCKING

    MP->>IN: LockItemCommand
    IN-->>MP: ItemLockedEvent
    MP->>MP: saga_log → FUNDS_TRANSFERRING

    MP->>WL: TransferFundsCommand
    MP->>IN: TransferItemCommand
    WL-->>MP: FundsTransferredEvent
    IN-->>MP: ItemTransferredEvent
    MP->>MP: saga_log → COMPLETED
    MP-->>Buyer: 200 Trade complete
```

Compensation path (if `TransferFundsCommand` fails):

```mermaid
flowchart LR
    F([Transfer fails]) --> C1["ReleaseFundsCommand<br/>→ Wallet refunds buyer"]
    F --> C2["UnlockItemCommand<br/>→ Inventory releases item"]
    C1 --> E(["trade = FAILED<br/>Buyer notified"])
    C2 --> E
```
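The orchestration boils down to: run each step, record that it completed, and on any failure run the recorded steps' compensations in reverse order. A minimal generic sketch — step names are illustrative, and the real orchestrator persists `saga_log` rows between steps rather than keeping state in memory:

```typescript
interface SagaStep {
  name: string;
  execute: () => Promise<void>;
  compensate: () => Promise<void>; // the undo — e.g. ReleaseFunds, UnlockItem
}

async function runSaga(steps: SagaStep[]): Promise<"COMPLETED" | "FAILED"> {
  const done: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.execute();
      done.push(step); // only completed steps ever need compensating
    } catch {
      // Roll back in reverse order so later effects are undone first
      for (const s of done.reverse()) {
        await s.compensate();
      }
      return "FAILED";
    }
  }
  return "COMPLETED";
}
```

Note that a step that never executed is never compensated — if `LockItemCommand` fails, only `ReleaseFundsCommand` runs, matching the flowchart above.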

## Idempotency Model

```mermaid
flowchart TD
    A["Request arrives<br/>X-Idempotency-Key: uuid"] --> B{"action_id<br/>in DB?"}
    B -- Yes --> C[Return cached response<br/>no side effects]
    B -- No --> D[Process business logic]
    D --> E[INSERT action_id<br/>atomically]
    E --> F[Return new response]
```
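In code, the check-then-insert above reduces to a lookup keyed by the idempotency key. An in-memory sketch, with a `Map` as a hypothetical stand-in for the `player_actions` table — in the real service the lookup and insert run inside one DB transaction:

```typescript
interface CachedResponse { status: number; body: unknown; }

// Stand-in for the player_actions table keyed by action_id.
const actionLog = new Map<string, CachedResponse>();

function handleCommand(
  idempotencyKey: string,
  process: () => CachedResponse,
): CachedResponse {
  const cached = actionLog.get(idempotencyKey);
  if (cached) {
    return cached; // duplicate request: replay the original response, no side effects
  }
  const response = process();              // run business logic exactly once
  actionLog.set(idempotencyKey, response); // recorded atomically with the write
  return response;
}

// A retried request with the same key never re-executes the action:
let executions = 0;
const first = handleCommand("stamp-123", () => ({ status: 201, body: { n: ++executions } }));
const retry = handleCommand("stamp-123", () => ({ status: 201, body: { n: ++executions } }));
// first === retry, executions === 1
```

This is exactly the Stamp mechanic: `"stamp-123"` plays the role of a spent Stamp's `stampId`, sealing the action against retries.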

Kafka consumers mirror this — every handler checks `processed_events` before acting, inside the same DB transaction as the business write.
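A sketch of that consumer-side check, with a toy transaction object so the dedup marker and the business write succeed or fail together — the table and event names mirror the doc, but the `Tx` shape is illustrative:

```typescript
interface Tx {
  processedEvents: Set<string>; // stand-in for the processed_events table
  writes: string[];             // stand-in for business-table writes
}

// Returns true if the event was applied, false if it was a redelivery.
function handleEvent(tx: Tx, eventId: string, apply: (tx: Tx) => void): boolean {
  if (tx.processedEvents.has(eventId)) {
    return false; // already handled — redelivery is a no-op
  }
  apply(tx);                       // business write…
  tx.processedEvents.add(eventId); // …and the dedup marker, in the SAME transaction
  return true;
}

const tx: Tx = { processedEvents: new Set(), writes: [] };
handleEvent(tx, "match-evt-1", (t) => t.writes.push("grant reward")); // → true
handleEvent(tx, "match-evt-1", (t) => t.writes.push("grant reward")); // → false, no second write
```

Because the marker commits with the write, a crash between the two can never leave the event half-applied: either both land or the event is redelivered and fully reprocessed.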


## Observability

```mermaid
graph LR
    SVC["All NestJS Services"] -->|metrics via /metrics| PROM["Prometheus"]
    SVC -->|traces| OT["OpenTelemetry Collector"]
    SVC -->|structured JSON logs| LOKI["Loki"]

    PROM --> GRAF["Grafana<br/>Dashboards"]
    OT --> JAEGER["Jaeger<br/>Trace UI"]
    LOKI --> GRAF
```

Key metrics exposed per service:

- `http_request_duration_seconds` — latency histograms
- `kafka_consumer_lag` — per topic/consumer group
- `circuit_breaker_state` — open/closed/half-open gauge
- `saga_duration_seconds` — trade completion time
- `dlq_message_count_total` — dead-letter accumulation
- `retry_count_total` — retry pressure

## Tech Stack

| Layer | Technology |
|---|---|
| Frontend | Next.js 16 (App Router) · socket.io · shadcn/ui · Tailwind CSS v4 · Zustand |
| Backend | NestJS 11 · Apache Kafka · PostgreSQL 17 · Redis 7.4 LTS |
| Resilience | opossum (circuit breaker) · axios-retry · Kafka DLQ |
| Observability | Prometheus · Grafana · Jaeger · Loki · OpenTelemetry SDK · Pino |
| Infrastructure | Docker Compose (local) · Kubernetes · Helm · KEDA · Nx monorepo · pnpm |

## Running the Stack

Every iteration has a working, runnable version. The steps below build, start, and validate any iteration end-to-end:

```sh
# 0. One-time setup — copy the env template
cp .env.example .env
# Edit .env: set JWT_SECRET (required). For shared/staging environments, also set KAFKA_CLUSTER_ID.

# 1. Build all app artifacts on the host (Nx handles caching — fast on repeat runs)
pnpm build

# 2. Start all infrastructure + app services
docker compose up -d --build

# 3. Run the E2E suite for a specific iteration (or all)
nx run e2e:e2e                              # all iterations
nx run e2e:e2e --testFile=iter1.e2e.ts      # Iteration 1 only
nx run e2e:e2e --testFile=iter2.e2e.ts      # Iteration 2 only
nx run e2e:e2e --testFile=iter3.e2e.ts      # Iteration 3 only
nx run e2e:e2e --testFile=iter4.e2e.ts      # Iteration 4 only
```

Unit + integration coverage is run separately:

```sh
pnpm coverage      # all services — enforces per-iteration coverage gates
```

An iteration is only done when both the E2E suite and the coverage run exit green. See ROADMAP.md for the per-iteration verification scenarios and `apps/e2e/` for the E2E test source.


## Project Status

| Deliverable | Status |
|---|---|
| **Documentation** | |
| PRD, SPEC, API, GAME, RUNBOOK, OBSERVABILITY, DEPLOYMENT | ✅ Active |
| Architecture diagram | ✅ Active |
| Build roadmap (ROADMAP.md) | ✅ Active |
| ADR: monorepo (`docs/adr/001-monorepo.md`) | ✅ Complete |
| **Layer 0 — Boilerplate** | |
| Monorepo scaffold (Nx + pnpm) | ✅ Complete |
| Shared packages (`@idempo/contracts`, kafka, observability, idempotency, circuit-breaker) | ✅ Complete |
| Infrastructure (docker-compose.yml, Kafka, PostgreSQL, Redis, Jaeger, Prometheus, Grafana) | ✅ Complete |
| API Gateway (auth, proxy, rate limiting, health checks) | ✅ Complete |
| E2E test framework (`apps/e2e`) | ✅ Complete |
| **Iteration 1 — Playable Arena** | |
| Game Service (match lifecycle, idempotency, Stamp mechanics) | ✅ Complete |
| Combat Service (damage calc, event-driven) | ✅ Complete |
| Leaderboard Service (CQRS, Redis cache) | ✅ Complete |
| Arena UI (Next.js, WebSocket, live leaderboard) | ✅ Complete |
| E2E tests (`iter1.e2e.ts`) | ✅ Passing |
| **Iteration 2 — Rewards & Inventory** | |
| Reward Service | 🔵 In progress |
| Wallet Service | ⬜ Not started |
| Inventory Service | ⬜ Not started |
| Wallet + Inventory UI | ⬜ Not started |
| **Iteration 3 — Marketplace & Saga** | |
| Marketplace Service (Saga orchestrator) | ⬜ Not started |
| Circuit breaker integration | ⬜ Not started |
| **Iteration 4 — Observability & Hardening** | |
| Grafana dashboards | ⬜ Not started |
| Alert rules | ⬜ Not started |
| Kubernetes manifests | ⬜ Not started |

## Documentation

| File | Contents |
|---|---|
| `docs/PRD.md` | Product requirements, user stories, feature scope |
| `docs/SPEC.md` | System architecture, event contracts, database schemas, saga, resilience patterns |
| `docs/GAME.md` | Arena mechanics: grid, combat resolution, actions, Stamp-sealed actions, scoring |
| `docs/API.md` | REST + WebSocket contracts: all endpoints, request/response DTOs, error codes |
| `ROADMAP.md` | 4-iteration build roadmap with per-iteration deliverables and task checklists |
| `docs/RUNBOOK.md` | Step-by-step failure injection scenarios demonstrating each distributed systems pattern |
| `docs/OBSERVABILITY.md` | Metrics catalogue, Grafana dashboards, tracing config, structured log schema, alerting |
| `docs/DEPLOYMENT.md` | Container strategy, Kubernetes resources, Kafka partitioning, database scaling, quick-start |
| `docs/adr/001-monorepo.md` | ADR: why monorepo with Nx was chosen over multi-repo |
