Kitsune

A distributed orchestration system for executing workflows across multiple servers using Temporal. Kitsune enables coordinated deployments, patches, and operations across server fleets with flexible rollout strategies.

Architecture

Kitsune uses a hybrid architecture with two types of workers:

Central Orchestrator Worker

- Runs the OrchestrationWorkflow that coordinates execution across servers
- Listens on the execution-orchestrator task queue
- Manages rollout strategies (Parallel, Sequential, Rolling)
- Tracks overall execution progress and handles failures

Local Workers

- One worker runs on each target server
- Each worker listens on a server-specific task queue (e.g., server-1, server-2)
- Each worker executes the ServerExecutionWorkflow with steps specific to that server
- Each worker handles step execution, retries, and rollbacks

Features

Multiple Rollout Strategies
- Parallel: Execute on all servers simultaneously
- Sequential: Execute one server at a time
- Rolling: Execute in batches with configurable batch size and delays

Step Execution Framework
- Extensible step handler system
- Built-in handlers: echo, script, sleep, file_write
- Easy to add custom step types

Error Handling
- Configurable retry policies
- Continue-on-failure support for non-critical steps
- Automatic rollback on required step failures
- Max failures threshold to stop rollouts early

Project Structure

kitsune/
├── cmd/
│   ├── local-worker/          # Worker that runs on each server
│   └── orchestration-worker/  # Central orchestration coordinator
├── pkg/
│   ├── activities/            # Activity implementations
│   │   ├── handlers/          # Step handler implementations
│   │   ├── step_activities.go
│   │   └── step_handler.go
│   ├── models/                # Data models and types
│   │   └── types.go
│   └── workflows/             # Workflow implementations
│       ├── execution.go       # Server-level workflow
│       └── orchestration.go   # Orchestration workflow
└── dev/                       # Development utilities
    ├── docker-compose.yaml
    ├── test-orchestration.sh
    └── test-plan-orchestrator.json

Prerequisites

- Go 1.25.3+
- Docker and Docker Compose (for local development)
- Temporal Server

Getting Started

Local Development with Docker Compose

The project includes a complete Docker Compose setup with Temporal, PostgreSQL, and mock servers:

cd dev
docker-compose up -d

This starts:

- PostgreSQL (port 5432)
- Temporal Server (port 7233)
- Temporal UI (port 8080)
- Central Orchestrator Worker
- 3 Mock Server Workers (server-1, server-2, server-3)

Running the Test Orchestration

cd dev
./test-orchestration.sh

This script:

- Starts all services
- Checks the health of the workers
- Triggers an orchestration workflow across 3 servers
- Shows execution results
- Provides links to view the run in the Temporal UI

Manual Execution

1. Start the Central Orchestrator Worker

export TEMPORAL_ADDRESS=localhost:7233
go run cmd/orchestration-worker/main.go

2. Start Local Workers on Each Server

On each target server:

export SERVER_ID=<server-name>
export TEMPORAL_ADDRESS=<temporal-host>:7233
go run cmd/local-worker/main.go

3. Trigger an Orchestration

Using the Temporal CLI:

temporal workflow start \
  --task-queue execution-orchestrator \
  --type OrchestrationWorkflow \
  --input '{
    "servers": ["server-1", "server-2", "server-3"],
    "steps": [
      {
        "name": "deploy",
        "type": "script",
        "params": {
          "command": "deploy.sh",
          "args": ["--version", "v1.2.3"]
        },
        "required": true
      }
    ],
    "rolloutStrategy": {
      "type": "Rolling",
      "batchSize": 1,
      "batchDelaySeconds": 30,
      "maxFailures": 1
    }
  }'

Configuration

Execution Request

{
  "servers": ["server-1", "server-2"],
  "steps": [
    {
      "name": "step-name",
      "type": "echo|script|sleep|file_write",
      "params": {},
      "required": true,
      "continueOnFailure": false
    }
  ],
  "rolloutStrategy": {
    "type": "Parallel|Sequential|Rolling",
    "batchSize": 1,
    "batchDelaySeconds": 0,
    "maxFailures": 0,
    "canaryPercentage": 10
  }
}
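As a rough illustration of the request shape above, the sketch below decodes it into Go structs. The JSON field names come from the request shown here; the Go type names, the parseRequest helper, and the sample values are assumptions made for this example and may not match the definitions in pkg/models/types.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Step mirrors one entry of the "steps" array (hypothetical type name).
type Step struct {
	Name              string                 `json:"name"`
	Type              string                 `json:"type"`
	Params            map[string]interface{} `json:"params"`
	Required          bool                   `json:"required"`
	ContinueOnFailure bool                   `json:"continueOnFailure"`
}

// RolloutStrategy mirrors the "rolloutStrategy" object (hypothetical type name).
type RolloutStrategy struct {
	Type              string `json:"type"`
	BatchSize         int    `json:"batchSize"`
	BatchDelaySeconds int    `json:"batchDelaySeconds"`
	MaxFailures       int    `json:"maxFailures"`
	CanaryPercentage  int    `json:"canaryPercentage"`
}

// ExecutionRequest mirrors the top-level request (hypothetical type name).
type ExecutionRequest struct {
	Servers         []string        `json:"servers"`
	Steps           []Step          `json:"steps"`
	RolloutStrategy RolloutStrategy `json:"rolloutStrategy"`
}

// parseRequest decodes a JSON execution request.
func parseRequest(data []byte) (ExecutionRequest, error) {
	var req ExecutionRequest
	err := json.Unmarshal(data, &req)
	return req, err
}

// sample is an illustrative request with concrete values filled in.
const sample = `{
  "servers": ["server-1", "server-2"],
  "steps": [
    {"name": "log-message", "type": "echo",
     "params": {"message": "Starting deployment"}, "required": true}
  ],
  "rolloutStrategy": {"type": "Rolling", "batchSize": 1, "batchDelaySeconds": 30, "maxFailures": 1}
}`

func main() {
	req, err := parseRequest([]byte(sample))
	if err != nil {
		fmt.Println("invalid request:", err)
		return
	}
	fmt.Printf("%d servers, %d steps, strategy %s\n",
		len(req.Servers), len(req.Steps), req.RolloutStrategy.Type)
}
```

Fields omitted from the JSON decode to their Go zero values, so a request without continueOnFailure is read as false in this sketch.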

Step Types

Echo

Simple logging step:

{
  "name": "log-message",
  "type": "echo",
  "params": {
    "message": "Starting deployment"
  }
}

Script

Execute shell scripts:

{
  "name": "run-deploy",
  "type": "script",
  "params": {
    "command": "deploy.sh",
    "args": ["--version", "v1.2.3"]
  }
}

Sleep

Add delays:

{
  "name": "wait",
  "type": "sleep",
  "params": {
    "duration": "30s"
  }
}

File Write

Write files to disk:

{
  "name": "write-config",
  "type": "file_write",
  "params": {
    "path": "/etc/app/config.json",
    "content": "{\"key\": \"value\"}"
  }
}

Rollout Strategies

Parallel

Execute on all servers at once:

{
  "type": "Parallel"
}

Sequential

Execute one server at a time:

{
  "type": "Sequential",
  "maxFailures": 1
}

Rolling

Execute in batches:

{
  "type": "Rolling",
  "batchSize": 2,
  "batchDelaySeconds": 60,
  "maxFailures": 2
}
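To make the batching concrete, here is a minimal sketch of how a server list could be split into batches of batchSize. The splitIntoBatches function is a hypothetical helper written for this example, not code from the repository, and treating a batch size below 1 as 1 is an assumption.

```go
package main

import "fmt"

// splitIntoBatches groups servers into consecutive batches of at most
// batchSize entries. A batchSize below 1 is treated as 1 (an assumption
// of this sketch, not documented behavior).
func splitIntoBatches(servers []string, batchSize int) [][]string {
	if batchSize < 1 {
		batchSize = 1
	}
	var batches [][]string
	for start := 0; start < len(servers); start += batchSize {
		end := start + batchSize
		if end > len(servers) {
			end = len(servers)
		}
		batches = append(batches, servers[start:end])
	}
	return batches
}

func main() {
	servers := []string{"server-1", "server-2", "server-3"}
	for i, batch := range splitIntoBatches(servers, 2) {
		fmt.Printf("batch %d: %v\n", i+1, batch)
	}
}
```

With three servers and a batch size of 2, this yields one batch of two servers followed by one batch of one. In a Rolling rollout, the batchDelaySeconds pause would sit between consecutive batches.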

Adding Custom Step Handlers

Create a new handler in pkg/activities/handlers/:

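package handlers

import (
    "context"

    "go.temporal.io/sdk/activity"
)

// CustomHandler implements a custom step type.
type CustomHandler struct{}

// Execute runs the step with the params from the step definition.
func (h *CustomHandler) Execute(ctx context.Context, params map[string]interface{}) error {
    logger := activity.GetLogger(ctx)
    logger.Info("Executing custom step", "params", params)
    // Your implementation here
    return nil
}

// Rollback undoes the effects of Execute.
func (h *CustomHandler) Rollback(ctx context.Context, params map[string]interface{}) error {
    logger := activity.GetLogger(ctx)
    logger.Info("Rolling back custom step", "params", params)
    // Rollback logic here
    return nil
}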

Register it in cmd/local-worker/main.go:

registry.Register("custom", &handlers.CustomHandler{})

Monitoring

Temporal UI

Access the Temporal UI at http://localhost:8080 to:

- View workflow execution history
- See step-by-step progress
- Debug failures and retries
- Inspect workflow inputs and outputs

Workflow Status

Check workflow status via CLI:

temporal workflow describe --workflow-id <workflow-id>

View Logs

For Docker Compose setup:

docker logs kitsune-orchestrator
docker logs kitsune-mock-server-1
docker logs kitsune-mock-server-2
docker logs kitsune-mock-server-3

Error Handling

Required Steps

If a step is marked as required: true and fails:

- Workflow execution stops
- Automatic rollback is triggered for already-executed steps
- The workflow returns an error

Non-Required Steps

If a step has continueOnFailure: true and fails:

- The failure is logged but execution continues
- The overall workflow can still succeed
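The two failure modes described above can be sketched as a small step loop. This is a simplified model, not the ServerExecutionWorkflow itself: the step type and runSteps function are hypothetical, rolling back in reverse order of execution is an assumption, and the behavior when neither flag is set (stop without rollback) is a guess, since this README does not specify it.

```go
package main

import (
	"errors"
	"fmt"
)

var errBoom = errors.New("boom")

// step is a simplified stand-in for a step definition.
type step struct {
	name              string
	required          bool
	continueOnFailure bool
	fails             bool // simulates a failing step in this sketch
}

// execute stands in for real step execution.
func (s step) execute() error {
	if s.fails {
		return errBoom
	}
	return nil
}

// runSteps executes steps in order. It returns the names of the steps that
// were rolled back (in rollback order) and the error that stopped execution.
func runSteps(steps []step) (rolledBack []string, err error) {
	var done []step
	for _, s := range steps {
		execErr := s.execute()
		if execErr == nil {
			done = append(done, s)
			continue
		}
		if s.continueOnFailure {
			// Failure is logged and execution continues.
			fmt.Printf("step %q failed, continuing: %v\n", s.name, execErr)
			continue
		}
		if s.required {
			// Assumption: already-executed steps roll back in reverse order.
			for i := len(done) - 1; i >= 0; i-- {
				rolledBack = append(rolledBack, done[i].name)
			}
		}
		// Assumption: a failure with neither flag set stops without rollback.
		return rolledBack, fmt.Errorf("step %q failed: %w", s.name, execErr)
	}
	return nil, nil
}

func main() {
	rolledBack, err := runSteps([]step{
		{name: "prepare"},
		{name: "warm-cache", continueOnFailure: true, fails: true},
		{name: "deploy", required: true},
		{name: "verify", required: true, fails: true},
	})
	fmt.Println("rolled back:", rolledBack)
	fmt.Println("error:", err)
}
```

In this run, warm-cache fails but is skipped, and the failure of verify rolls back deploy and then prepare. The skipped step is not rolled back because it never completed.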

Max Failures

Configure maxFailures in rollout strategy:

- 0: Stop on first failure
- N: Allow up to N server failures before stopping the rollout
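One way to read these two rules is that the rollout stops as soon as the number of failed servers exceeds maxFailures. The sketch below applies that reading to a one-server-at-a-time rollout. The rollOut and failOn functions are hypothetical helpers for this example, and the exact comparison used by the OrchestrationWorkflow is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

var errFailed = errors.New("execution failed")

// rollOut runs exec on each server in order and stops once the number of
// failed servers exceeds maxFailures. It returns the servers that were
// attempted and the number of failures seen.
func rollOut(servers []string, maxFailures int, exec func(server string) error) (attempted []string, failures int) {
	for _, s := range servers {
		attempted = append(attempted, s)
		if err := exec(s); err != nil {
			failures++
			if failures > maxFailures {
				break
			}
		}
	}
	return attempted, failures
}

// failOn returns an exec function that fails for the named servers.
// It exists only to simulate failures in this example.
func failOn(bad ...string) func(string) error {
	set := make(map[string]bool)
	for _, b := range bad {
		set[b] = true
	}
	return func(server string) error {
		if set[server] {
			return errFailed
		}
		return nil
	}
}

func main() {
	servers := []string{"server-1", "server-2", "server-3"}

	attempted, failures := rollOut(servers, 0, failOn("server-1"))
	fmt.Println("maxFailures=0:", attempted, failures)

	attempted, failures = rollOut(servers, 1, failOn("server-1"))
	fmt.Println("maxFailures=1:", attempted, failures)
}
```

With maxFailures set to 0, the first failure ends the rollout after server-1. With maxFailures set to 1, the same single failure is tolerated and all three servers are attempted.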

Development

Build

go build ./cmd/local-worker
go build ./cmd/orchestration-worker

Run Tests

go test ./...
