Production Playbooks

Comprehensive technical guides for building production-grade Claude Code plugin systems. Each playbook provides deep implementation details, production-ready code examples, and real-world patterns learned from operating large-scale AI agent deployments.

📚 Complete Playbook Collection

AI Architecture & Tool Use

11. Advanced Tool Use (~6,500 words) ⭐ NEW

Dynamic tool discovery, programmatic orchestration, and parameter guidance. Tool Search Tool (85% token reduction), Programmatic Tool Calling (37% efficiency gains), and Tool Use Examples (90% parameter accuracy). Enterprise-scale agent architecture.

Cost Management & Optimization

01. Multi-Agent Rate Limits (~2,800 words)

Prevent API throttling in concurrent multi-agent systems. Token bucket algorithms, sliding windows, priority queues, and backpressure handling for Claude API rate limits.
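As a rough illustration of the token bucket idea mentioned above, here is a minimal sketch. The class name `TokenBucket`, its parameters, and the injected `nowMs` clock are assumptions made for this example; they are not taken from the playbook, and real Claude API limits should be read from your account settings.

```typescript
// Hypothetical token bucket limiter; names and numbers are illustrative only.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Add tokens for the time elapsed since the last refill, capped at capacity.
  private refill(nowMs: number): void {
    if (nowMs <= this.lastRefillMs) return; // ignore clocks that did not advance
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }

  // Spend `cost` tokens and return true if the request may proceed now.
  tryAcquire(cost: number = 1, nowMs: number = Date.now()): boolean {
    this.refill(nowMs);
    if (cost > this.tokens) return false;
    this.tokens -= cost;
    return true;
  }

  // Milliseconds a caller should wait until `cost` tokens are available.
  msUntilAvailable(cost: number = 1, nowMs: number = Date.now()): number {
    this.refill(nowMs);
    if (cost <= this.tokens) return 0;
    return Math.ceil(((cost - this.tokens) / this.refillPerSecond) * 1000);
  }
}
```

Each agent would call `tryAcquire` before sending a request and, on `false`, wait for `msUntilAvailable` rather than retrying immediately, which is one simple form of backpressure.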

02. Cost Caps & Budget Management (~3,200 words)

Hard budget controls for AI spending. Real-time spend tracking, automatic shutoffs, team quotas, and financial safeguards to prevent runaway costs.
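A hard cap with an automatic shutoff can be sketched in a few lines. Everything here (`BudgetGuard`, `authorize`, `record`, the micro-dollar bookkeeping) is a hypothetical design for illustration, not the playbook's implementation.

```typescript
// Hypothetical hard budget cap. Amounts are tracked in integer micro-dollars
// to avoid floating-point drift when many small charges are summed.
const MICROS_PER_USD = 1000000;

function toMicros(usd: number): number {
  return Math.round(usd * MICROS_PER_USD);
}

interface SpendDecision {
  allowed: boolean;
  remainingUsd: number;
  reason?: string;
}

class BudgetGuard {
  private readonly capMicros: number;
  private spentMicros = 0;

  constructor(capUsd: number) {
    this.capMicros = toMicros(capUsd);
  }

  // Check an estimated cost before the request is sent.
  authorize(estimatedUsd: number): SpendDecision {
    const remainingMicros = Math.max(0, this.capMicros - this.spentMicros);
    const remainingUsd = remainingMicros / MICROS_PER_USD;
    if (toMicros(estimatedUsd) > remainingMicros) {
      return { allowed: false, remainingUsd, reason: "budget cap would be exceeded" };
    }
    return { allowed: true, remainingUsd };
  }

  // Record the actual cost once the request has completed.
  record(actualUsd: number): void {
    this.spentMicros += toMicros(actualUsd);
  }

  // True once spending has reached the cap; callers should stop issuing requests.
  isExhausted(): boolean {
    return this.spentMicros >= this.capMicros;
  }
}
```

Separating `authorize` (before the call) from `record` (after it) lets the guard reject work on an estimate while still accounting for the real cost.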

09. Cost Attribution System (~5,500 words)

Multi-dimensional cost tracking (team/project/user/workflow). Automatic tagging, chargeback models, budget enforcement, and usage analytics for AI operations.
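To make the four tracking dimensions concrete, the sketch below rolls tagged usage records up by any one of them. The `UsageRecord` shape and the `attributeCosts` function are assumptions for this example; the playbook's actual schema may differ.

```typescript
// Hypothetical usage record tagged with the four dimensions named above.
type Dimension = "team" | "project" | "user" | "workflow";

interface UsageRecord {
  team: string;
  project: string;
  user: string;
  workflow: string;
  costUsd: number;
}

// Sum cost per distinct value of the chosen dimension.
function attributeCosts(records: UsageRecord[], dimension: Dimension): Map<string, number> {
  const totals = new Map<string, number>();
  for (const record of records) {
    const key = record[dimension];
    totals.set(key, (totals.get(key) ?? 0) + record.costUsd);
  }
  return totals;
}
```

Calling `attributeCosts(records, "team")` gives a per-team total that could feed a chargeback report, while the same records grouped by `"workflow"` show which automation is most expensive.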

Infrastructure & Deployment

03. MCP Server Reliability (~3,500 words)

Self-healing MCP servers with circuit breakers, exponential backoff, health checks, and automatic recovery. Production-grade Model Context Protocol implementations.
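The circuit breaker and exponential backoff patterns named above can be sketched as follows. The thresholds, the `CircuitBreaker` class, and `backoffDelayMs` are hypothetical and independent of any MCP SDK; time is passed in explicitly so the behavior is easy to test.

```typescript
// Hypothetical helpers; defaults are illustrative, not recommended values.
type BreakerState = "closed" | "open" | "half-open";

// Delay doubles with each attempt (0-based) and is capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 100, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

class CircuitBreaker {
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAtMs = 0;

  constructor(failureThreshold: number = 3, cooldownMs: number = 10000) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // After the cooldown, an open breaker lets a trial request through (half-open).
  canRequest(nowMs: number = Date.now()): boolean {
    if (this.state === "open" && nowMs - this.openedAtMs >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  // Any success closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failed trial request, or too many consecutive failures, opens the breaker.
  recordFailure(nowMs: number = Date.now()): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAtMs = nowMs;
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

A client would check `canRequest` before each call to a server, report the outcome with `recordSuccess` or `recordFailure`, and use `backoffDelayMs` to space out retries of an individual request.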

04. Ollama Migration Guide (~4,500 words)

Switch from OpenAI/Anthropic to self-hosted LLMs. Complete migration path: local setup, prompt translation, performance benchmarks, and cost analysis.

06. Self-Hosted Stack Setup (~5,500 words)

Full infrastructure deployment with Docker/Kubernetes. Ollama, PostgreSQL, Redis, Prometheus, Grafana, Nginx - complete production stack with monitoring and backups.

Operations & Reliability

05. Incident Debugging Playbook (~5,000 words)

SEV-1/2/3/4 incident response protocols. Log analysis, root cause investigation (5 Whys, Fishbone), postmortem templates, and on-call procedures.

10. Progressive Enhancement Patterns (~5,500 words)

Safe AI feature rollout strategies. Feature flags (0% → 100%), A/B testing, canary deployments, graceful degradation, and automated rollback on failures.
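One common way to implement a 0% → 100% rollout is to hash each user into a stable bucket and compare it with the current percentage. The sketch below assumes that approach; the function names and the FNV-1a-style hash over UTF-16 code units are choices made for this example, not something the playbook specifies.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units (illustrative, not cryptographic).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map a (flag, user) pair to a stable bucket in the range 0..99.
function rolloutBucket(flag: string, userId: string): number {
  return fnv1a(`${flag}:${userId}`) % 100;
}

// A user is enabled when their bucket falls below the rollout percentage.
function isEnabled(flag: string, userId: string, rolloutPercent: number): boolean {
  return rolloutBucket(flag, userId) < rolloutPercent;
}
```

Because the bucket depends only on the flag name and user ID, a user who is enabled at 10% stays enabled as the percentage grows, and rolling back is just lowering the number.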

Compliance & Governance

07. Compliance & Audit Guide (~6,000 words)

SOC 2, GDPR, HIPAA, PCI DSS implementation. Audit logging with immutable signatures, RBAC, data privacy (PII redaction), and regulatory compliance.
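As a small illustration of the PII redaction step, the sketch below masks email addresses and US Social Security numbers before text is logged. The patterns and the `redactPii` name are assumptions for this example and are deliberately simplistic; regex matching alone is unlikely to satisfy a real compliance review.

```typescript
// Hypothetical redaction rules; real deployments need far broader coverage.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Replace every match of every pattern with a labeled placeholder.
function redactPii(text: string): string {
  return PII_PATTERNS.reduce(
    (acc, { label, pattern }) => acc.replace(pattern, `[REDACTED_${label}]`),
    text,
  );
}
```

Running redaction before a message reaches the audit log keeps the log useful for investigation without storing the raw identifiers.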

08. Team Presets & Workflows (~5,000 words)

Team standardization and collaboration. Plugin bundles, workflow templates, automated onboarding, and multi-layer configuration hierarchy (org/team/project/individual).
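The org/team/project/individual hierarchy can be thought of as a layered merge in which more specific layers override broader ones. The sketch below assumes that precedence and a deep merge of nested objects; `mergeLayers`, `resolveConfig`, and the config keys in the usage note are hypothetical.

```typescript
// Hypothetical layered configuration; later layers override earlier ones.
type ConfigLayer = Record<string, unknown>;

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Deep-merge layers left to right without mutating any input layer.
function mergeLayers(...layers: ConfigLayer[]): ConfigLayer {
  const result: ConfigLayer = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      const existing = result[key];
      result[key] =
        isPlainObject(existing) && isPlainObject(value) ? mergeLayers(existing, value) : value;
    }
  }
  return result;
}

interface ConfigHierarchy {
  org: ConfigLayer;
  team: ConfigLayer;
  project: ConfigLayer;
  individual: ConfigLayer;
}

// Resolve the effective config: org is the base, individual wins last.
function resolveConfig(hierarchy: ConfigHierarchy): ConfigLayer {
  return mergeLayers(hierarchy.org, hierarchy.team, hierarchy.project, hierarchy.individual);
}
```

For example, an org layer of `{ limits: { budgetUsd: 50, maxAgents: 4 } }` combined with a team layer of `{ limits: { budgetUsd: 20 } }` resolves to a budget of 20 while keeping `maxAgents` at 4. Whether individuals should be allowed to override org-level settings at all is a policy decision this sketch does not address.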

📊 Statistics

Total Content: ~53,500 words across 11 playbooks
Average Length: 4,900 words per playbook (range: 2,800 - 6,500 words)
Code Examples: 120+ production-ready TypeScript implementations
Topics Covered: 60+ production patterns
Coverage Areas: AI Architecture, Cost, Infrastructure, Operations, Compliance

🎯 Use Cases

For Plugin Developers

Learn production patterns for MCP servers
Implement cost controls and monitoring
Build self-hosted AI infrastructure
Create team-ready plugin bundles

For Engineering Teams

Standardize Claude Code workflows
Control AI spending with budget systems
Deploy compliant self-hosted stacks
Respond to production incidents

For Technical Leaders

Understand total cost of ownership
Plan migration to self-hosted LLMs
Meet compliance requirements (SOC 2, GDPR, HIPAA)
Roll out features safely with progressive enhancement

🛠 Technologies

All playbooks use production-grade tools and frameworks:

Languages: TypeScript (primary), Python, Bash
Infrastructure: Docker, Kubernetes, Prometheus, Grafana
Databases: PostgreSQL, Redis, ClickHouse
AI/ML: Ollama, llama.cpp, vLLM
Protocols: MCP (Model Context Protocol)
Compliance: GDPR, HIPAA, SOC 2, PCI DSS

📖 Reading Guide

New to Claude Code Plugins?

Start with:

Team Presets (08) - Understand collaboration patterns
MCP Reliability (03) - Core plugin architecture
Progressive Enhancement (10) - Safe rollout strategies

Building Large-Scale Agent Systems?

Essential path:

Advanced Tool Use (11) - Dynamic discovery, programmatic orchestration ⭐ NEW
Multi-Agent Rate Limits (01) - Prevent API throttling
Cost Attribution (09) - Track usage across features

Building Production Systems?

Focus on:

Self-Hosted Stack (06) - Infrastructure foundation
Cost Attribution (09) - Financial visibility
Incident Debugging (05) - Operations readiness

Enterprise Compliance?

Essential reads:

Compliance & Audit (07) - Regulatory requirements
Cost Caps (02) - Budget governance
Team Presets (08) - Access controls

Migrating from Cloud to Self-Hosted?

Migration path:

Ollama Migration (04) - Local LLM setup
Self-Hosted Stack (06) - Full infrastructure
Cost Attribution (09) - Compare cloud vs self-hosted costs

🔗 Related Resources

Learning Lab - Hands-on tutorials for agent workflow patterns
Plugin Marketplace - 258 plugins across 18 categories
MCP Plugins - Production MCP server implementations
Templates - Starter templates for new plugins

📝 Playbook Format

Each playbook follows a consistent structure:

Introduction - Problem statement and overview
Core Concepts - Fundamental principles
Architecture - System design patterns
Implementation - Production-ready code with TypeScript
Configuration - Setup and deployment guides
Monitoring - Observability and metrics
Best Practices - DO/DON'T guidelines
Troubleshooting - Common issues and solutions
Tools & Resources - Recommended tools and further reading
Summary - Key takeaways and checklist

🚀 Quick Start

```bash
# Clone repository
git clone https://github.com/jeremylongshore/claude-code-plugins.git
cd claude-code-plugins/docs/playbooks/

# Read a playbook
cat 01-multi-agent-rate-limits.md

# Or browse online
# https://github.com/jeremylongshore/claude-code-plugins/tree/main/docs/playbooks
```

🤝 Contributing

Found an issue or want to improve a playbook? Contributions welcome!

Open an issue describing the problem or improvement
Submit a pull request with detailed changes
Include code examples and real-world evidence

📄 License

All playbooks are released under the MIT License. Use them freely in your commercial and open-source projects.

Last Updated: December 24, 2025
Version: 1.0.0
Author: Jeremy Longshore (jeremy@intentsolutions.io)
Repository: https://github.com/jeremylongshore/claude-code-plugins
