Production Deployment Guide

Operational learnings from Agent-OS and AgentMesh

Overview

This guide covers production deployment patterns for agentic systems, including CI/CD, observability, and operational best practices.

Repository Standards

Essential Files Checklist

Every production agent repository should include:

repository/
├── .github/
│   ├── workflows/
│   │   └── ci.yml           # Multi-stage CI pipeline
│   └── dependabot.yml       # Automated dependency updates
├── src/                     # Source code
├── tests/                   # Test suite
├── docs/                    # Documentation
├── examples/                # Working examples
├── CONTRIBUTING.md          # Contribution guidelines
├── SECURITY.md              # Security policy
├── CHANGELOG.md             # Version history
├── LICENSE                  # License file
├── README.md                # Project overview
├── pyproject.toml           # Project configuration
└── .pre-commit-config.yaml  # Code quality hooks

CI Pipeline Stages

A production-ready CI pipeline should include:

# 7-stage pipeline (from Agent-OS)
jobs:
  test:      # Unit + integration tests with coverage
  lint:      # Code quality (ruff, mypy)
  security:  # SAST (bandit) + CVE scan (pip-audit)
  benchmark: # Performance regression tests
  build:     # Package build + validation
  publish:   # PyPI release on tags
  demo:      # Example validation

Branch Protection

Enable these protections on master/main:

Require pull request before merging
Require at least 1 approval
Require status checks to pass
Require conversation resolution
Do not allow bypassing settings

Performance Guidelines

Kernel Creation Benchmark

From Agent-OS: kernel instantiation should average under 50 ms.

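# Benchmark test: average kernel instantiation time over 100 runs
import time

# KernelSpace is the Agent-OS kernel class; import it from your installed package.
iterations = 100
start = time.perf_counter()
for _ in range(iterations):
    kernel = KernelSpace()
elapsed = time.perf_counter() - start

avg_ms = (elapsed / iterations) * 1000
assert avg_ms < 50, f"Kernel creation {avg_ms:.2f}ms exceeds 50ms threshold"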

Policy Enforcement

Target: <10ms policy evaluation
Use deterministic control planes, not LLM inference
Cache policy decisions where safe
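
One way to keep evaluation deterministic and fast is an in-process lookup table. The sketch below is a minimal illustration under that assumption, not the Agent-OS implementation: `RULES`, `evaluate`, `cached_evaluate`, and `timed_ms` are hypothetical names, and the memoized variant is only safe while the rule table does not change.

```python
import time
from functools import lru_cache

# Hypothetical static rule table: (role, action) -> allowed?
RULES = {
    ("reader", "data_read"): True,
    ("reader", "data_write"): False,
    ("writer", "data_write"): True,
}


def evaluate(role: str, action: str) -> bool:
    """Deterministic lookup; unknown (role, action) pairs are denied."""
    return RULES.get((role, action), False)


@lru_cache(maxsize=1024)
def cached_evaluate(role: str, action: str) -> bool:
    """Memoized variant; safe only while RULES stays unchanged."""
    return evaluate(role, action)


def timed_ms(fn, *args):
    """Return (decision, elapsed milliseconds) for a single evaluation."""
    start = time.perf_counter()
    decision = fn(*args)
    return decision, (time.perf_counter() - start) * 1000
```

Denying by default keeps the result independent of anything outside the table, and `timed_ms` gives a number to compare against the 10 ms target.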

Memory Footprint

Minimal core dependencies
Optional feature bundles
Lazy loading for heavy modules
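
Lazy loading and optional bundles can both be handled by deferring the import until first use. This is a sketch, not how Agent-OS does it: `lazy_module`, `optional`, and `to_json` are hypothetical helpers, and `json` merely stands in for a heavy dependency.

```python
import importlib
from functools import lru_cache
from types import ModuleType


@lru_cache(maxsize=None)
def lazy_module(name: str) -> ModuleType:
    """Import `name` on first use and reuse the module afterwards."""
    return importlib.import_module(name)


def optional(name: str):
    """Return the module if its optional bundle is installed, else None."""
    try:
        return lazy_module(name)
    except ImportError:
        return None


def to_json(payload: dict) -> str:
    # `json` stands in for a heavy dependency; it is imported only
    # when this function is first called.
    return lazy_module("json").dumps(payload, sort_keys=True)
```

Callers can branch on `optional("some_extra") is None` to disable a feature cleanly when its bundle is not installed.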

Security Checklist

Dependency Management

# dependabot.yml - Weekly updates, grouped
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    groups:
      security-updates:
        # Group security fixes into one PR
        applies-to: security-updates
        patterns: ["*"]

Security Scanning

Run on every PR:

# SAST (Static Application Security Testing)
bandit -r src/ -ll

# CVE vulnerability scanning
pip-audit --strict

# Secret detection
detect-secrets scan

Credential Management

Never commit secrets
Use environment variables or secret managers
Rotate credentials regularly
Prefer short-lived tokens
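
A minimal sketch of the environment-variable approach: read the secret at startup and fail fast if it is missing. `require_secret`, `MissingSecretError`, and the variable name in the usage note are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_secret(name: str) -> str:
    """Read a secret from the environment and fail fast if it is unset."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Report only the variable name; never log the secret value.
        raise MissingSecretError(f"environment variable {name} is not set")
    return value
```

For example, `token = require_secret("AGENT_API_TOKEN")` at startup surfaces a misconfigured deployment immediately rather than on the first authenticated call.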

Observability

Structured Logging

import structlog

logger = structlog.get_logger()

logger.info(
    "agent_action",
    agent_id="did:mesh:my-agent",
    action="data_access",
    resource="customer_database",
    outcome="allowed",
    latency_ms=12,
)

Metrics to Track

Metric	Target	Alert Threshold
Policy evaluation latency	<10ms	>50ms
Trust handshake latency	<100ms	>500ms
Credential rotation success	100%	<99%
Audit log write latency	<5ms	>20ms
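
The alert thresholds in the table can be encoded as data and checked against the latest samples. The sketch below assumes hypothetical metric keys and a hypothetical `alerts` helper. Only the threshold values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    alert_at: float
    higher_is_worse: bool = True

    def breached(self, value: float) -> bool:
        """True when `value` crosses the alert threshold."""
        if self.higher_is_worse:
            return value > self.alert_at
        return value < self.alert_at


# Alert thresholds from the table above; the metric keys are hypothetical.
THRESHOLDS = {
    "policy_eval_latency_ms": Threshold(50),
    "trust_handshake_latency_ms": Threshold(500),
    "credential_rotation_success_pct": Threshold(99, higher_is_worse=False),
    "audit_log_write_latency_ms": Threshold(20),
}


def alerts(samples: dict[str, float]) -> list[str]:
    """Return the names of metrics whose sample breaches its threshold."""
    return sorted(
        name
        for name, value in samples.items()
        if name in THRESHOLDS and THRESHOLDS[name].breached(value)
    )
```

Note that credential rotation success alerts when the value drops below its threshold, while the latency metrics alert when they rise above theirs.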

Health Checks

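# Assumes a FastAPI-style `app` and components whose async health() returns a truthy value when OK
@app.get("/health")
async def health():
    checks = {
        "policy_engine": await policy_engine.health(),
        "credential_store": await cred_store.health(),
        "audit_log": await audit.health(),
    }
    # Report "healthy" only if every component check passes
    status = "healthy" if all(checks.values()) else "unhealthy"
    return {"status": status, "checks": checks}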

Scaling Patterns

Horizontal Scaling

Agent mesh components scale horizontally:

┌─────────────────────────────────────────────────────┐
│                   Load Balancer                      │
├─────────────────────────────────────────────────────┤
│  Policy     │  Policy     │  Policy     │  Policy   │
│  Engine 1   │  Engine 2   │  Engine 3   │  Engine N │
├─────────────────────────────────────────────────────┤
│              Shared State (Redis/etcd)              │
└─────────────────────────────────────────────────────┘

Caching Strategy

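# Cache policy decisions (with TTL)
# lru_cache_with_ttl is not part of the standard library; it is assumed to be a
# project-provided decorator that expires cached entries after ttl_seconds.
@lru_cache_with_ttl(ttl_seconds=60)
def evaluate_policy(agent_id: str, action: str) -> PolicyDecision:
    return policy_engine.evaluate(agent_id, action)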
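
Python's `functools.lru_cache` has no expiry, so a decorator such as `lru_cache_with_ttl` has to come from somewhere else. The sketch below shows one way such a helper might be written. It is an assumption about its behavior, not the Agent-OS or AgentMesh implementation, and the `maxsize` and `clock` parameters are additions for illustration. It handles positional, hashable arguments only.

```python
import time
from collections import OrderedDict
from functools import wraps


def lru_cache_with_ttl(ttl_seconds: float, maxsize: int = 1024, clock=time.monotonic):
    """Cache results per argument tuple, expiring each entry after ttl_seconds."""

    def decorator(fn):
        cache = OrderedDict()  # args -> (expiry time, result), least recent first

        @wraps(fn)
        def wrapper(*args):
            now = clock()
            hit = cache.get(args)
            if hit is not None and hit[0] > now:
                cache.move_to_end(args)  # mark as most recently used
                return hit[1]
            result = fn(*args)
            cache[args] = (now + ttl_seconds, result)
            cache.move_to_end(args)
            if len(cache) > maxsize:
                cache.popitem(last=False)  # evict the least recently used entry
            return result

        wrapper.cache_clear = cache.clear
        return wrapper

    return decorator
```

A short TTL bounds how long a revoked permission can keep being served from cache, which is the main risk when caching policy decisions.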

Circuit Breakers

Prevent cascade failures:

from circuitbreaker import circuit

@circuit(failure_threshold=5, recovery_timeout=30)
async def call_external_service(request):
    return await external_service.call(request)
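
To make `failure_threshold` and `recovery_timeout` concrete, here is a minimal synchronous sketch of the state machine a circuit breaker implements. It is an illustration only, not the `circuitbreaker` library's code. The `CircuitBreaker` and `CircuitOpenError` names and the `clock` parameter are hypothetical.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.recovery_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers fail immediately instead of waiting on a dependency that is already struggling, which is what stops the failure from cascading.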

Deployment Checklist

Pre-Production

Production Release

Tag with semantic version
CI publishes to package registry
Monitor error rates post-deploy
Have rollback plan ready
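
A release job can reject malformed tags before publishing. The sketch below assumes tags of the form `vMAJOR.MINOR.PATCH` with an optional pre-release suffix. The `v` prefix and the `parse_tag` helper are assumptions, and the pattern is deliberately simpler than the full Semantic Versioning grammar (no build metadata).

```python
import re

# Simplified pattern: vMAJOR.MINOR.PATCH with an optional pre-release suffix.
TAG_RE = re.compile(
    r"^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-([0-9A-Za-z.-]+))?$"
)


def parse_tag(tag: str):
    """Return (major, minor, patch, prerelease) or raise ValueError."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a semantic version tag: {tag!r}")
    major, minor, patch, prerelease = match.groups()
    return int(major), int(minor), int(patch), prerelease
```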

Post-Production

Monitor metrics dashboards
Review audit logs
Collect feedback
Plan next iteration

Troubleshooting

Common Issues

Issue	Likely Cause	Solution
Slow policy evaluation	Complex rules	Simplify or cache
Trust handshake timeout	Network/firewall	Check connectivity
Credential rotation failure	CA unavailable	Check CA health
Audit log gaps	Write failures	Check storage

Debug Mode

# Enable verbose logging
import logging
import os

logging.getLogger("agentmesh").setLevel(logging.DEBUG)

# Enable request tracing
os.environ["AGENTMESH_TRACE"] = "1"
