This directory contains production-ready deployment configurations and scripts for deploying LLM Shield API to AWS, GCP, and Azure.
# Set environment variables
export AWS_REGION=us-east-1
export IMAGE_TAG=v1.0.0
# Run deployment script
chmod +x deploy-aws.sh
./deploy-aws.sh

# Set environment variables
export GCP_PROJECT=llm-shield-prod
export GCP_REGION=us-central1
export DEPLOY_TARGET=cloud-run # or 'gke'
export IMAGE_TAG=v1.0.0
# Run deployment script
chmod +x deploy-gcp.sh
./deploy-gcp.sh

# Set environment variables
export AZURE_RESOURCE_GROUP=llm-shield-rg
export AZURE_LOCATION=eastus
export ACR_NAME=llmshieldacr
export DEPLOY_TARGET=container-apps # or 'aks'
export IMAGE_TAG=v1.0.0
# Run deployment script
chmod +x deploy-azure.sh
./deploy-azure.sh

┌─────────────────────────────────────────────────────────┐
│                    Internet Gateway                     │
└─────────────────────────────────────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────┐
│                Application Load Balancer                │
└─────────────────────────────────────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────┐
│                   ECS Fargate Service                   │
│    ┌──────────┐      ┌──────────┐      ┌──────────┐     │
│    │  Task 1  │      │  Task 2  │      │  Task 3  │     │
│    │ (2 vCPU) │      │ (2 vCPU) │      │ (2 vCPU) │     │
│    └──────────┘      └──────────┘      └──────────┘     │
└─────────────────────────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
 ┌────────▼────────┐ ┌───────▼──────┐ ┌─────────▼───────┐
 │ Secrets Manager │ │  S3 Bucket   │ │   CloudWatch    │
 │ - JWT Secret    │ │ - ML Models  │ │ - Metrics/Logs  │
 └─────────────────┘ └──────────────┘ └─────────────────┘
Components:
- ECS Fargate: Serverless container orchestration
- Application Load Balancer: Traffic distribution with SSL termination
- Secrets Manager: Secure storage for API keys and JWT secrets
- S3: ML model storage
- CloudWatch: Metrics and centralized logging
- IAM Roles: Least-privilege access control
Cost Estimate: ~$150-300/month (3 tasks, moderate traffic)
┌─────────────────────────────────────────────────────────┐
│                   Cloud Load Balancer                   │
└─────────────────────────────────────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────┐
│                    Cloud Run Service                    │
│       Auto-scales 1-10 instances based on traffic       │
│    ┌──────────┐      ┌──────────┐      ┌──────────┐     │
│    │Instance 1│      │Instance 2│      │Instance 3│     │
│    │ (2 vCPU) │      │ (2 vCPU) │      │ (2 vCPU) │     │
│    └──────────┘      └──────────┘      └──────────┘     │
└─────────────────────────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
 ┌────────▼────────┐ ┌───────▼──────┐ ┌─────────▼───────┐
 │ Secret Manager  │ │Cloud Storage │ │Cloud Monitoring │
 │ - JWT Secret    │ │ - ML Models  │ │ - Metrics/Logs  │
 └─────────────────┘ └──────────────┘ └─────────────────┘
Components:
- Cloud Run: Fully managed serverless platform
- Cloud Load Balancer: Global load balancing with SSL
- Secret Manager: Managed secret storage
- Cloud Storage: Object storage for models
- Cloud Monitoring: Unified observability
- Workload Identity: Secure service-to-service authentication
Cost Estimate: ~$100-200/month (pay-per-use; scales to zero only when the minimum instance count is set to 0)
┌─────────────────────────────────────────────────────────┐
│                   Cloud Load Balancer                   │
└─────────────────────────────────────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────┐
│                GKE Cluster (3-10 nodes)                 │
│     ┌────────────────────────────────────────────┐      │
│     │         llm-shield-api Deployment          │      │
│     │  ┌──────────┐  ┌──────────┐  ┌──────────┐  │      │
│     │  │  Pod 1   │  │  Pod 2   │  │  Pod 3   │  │      │
│     │  └──────────┘  └──────────┘  └──────────┘  │      │
│     │   HPA: 3-10 replicas based on CPU/memory   │      │
│     └────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
 ┌────────▼────────┐ ┌───────▼──────┐ ┌─────────▼───────┐
 │ Secret Manager  │ │Cloud Storage │ │Cloud Monitoring │
 │ - JWT Secret    │ │ - ML Models  │ │ - Metrics/Logs  │
 └─────────────────┘ └──────────────┘ └─────────────────┘
Cost Estimate: ~$200-400/month (3-node cluster)
┌─────────────────────────────────────────────────────────┐
│                Azure Application Gateway                │
└─────────────────────────────────────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────┐
│              Azure Container Apps Service               │
│    Auto-scales 1-10 instances based on HTTP traffic     │
│    ┌──────────┐      ┌──────────┐      ┌──────────┐     │
│    │Instance 1│      │Instance 2│      │Instance 3│     │
│    │ (2 vCPU) │      │ (2 vCPU) │      │ (2 vCPU) │     │
│    └──────────┘      └──────────┘      └──────────┘     │
└─────────────────────────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
 ┌────────▼────────┐ ┌───────▼──────┐ ┌─────────▼───────┐
 │    Key Vault    │ │ Blob Storage │ │  Azure Monitor  │
 │ - JWT Secret    │ │ - ML Models  │ │ - Metrics/Logs  │
 └─────────────────┘ └──────────────┘ └─────────────────┘
Components:
- Azure Container Apps: Serverless Kubernetes-based platform
- Application Gateway: Application-level load balancing
- Key Vault: Managed secret and key storage
- Blob Storage: Object storage for models
- Azure Monitor: Comprehensive monitoring solution
- Managed Identity: Azure AD-based authentication
Cost Estimate: ~$120-250/month (1-10 instances, moderate traffic)
┌─────────────────────────────────────────────────────────┐
│                   Azure Load Balancer                   │
└─────────────────────────────────────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────┐
│                AKS Cluster (3-10 nodes)                 │
│     ┌────────────────────────────────────────────┐      │
│     │         llm-shield-api Deployment          │      │
│     │  ┌──────────┐  ┌──────────┐  ┌──────────┐  │      │
│     │  │  Pod 1   │  │  Pod 2   │  │  Pod 3   │  │      │
│     │  └──────────┘  └──────────┘  └──────────┘  │      │
│     │   HPA: 3-10 replicas based on CPU/memory   │      │
│     └────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
 ┌────────▼────────┐ ┌───────▼──────┐ ┌─────────▼───────┐
 │    Key Vault    │ │ Blob Storage │ │  Azure Monitor  │
 │ - JWT Secret    │ │ - ML Models  │ │ - Metrics/Logs  │
 └─────────────────┘ └──────────────┘ └─────────────────┘
Cost Estimate: ~$250-500/month (3-node cluster)
All deployments support configuration via environment variables:
Common:
RUST_LOG=info # Logging level
LLM_SHIELD_API__SERVER__PORT=8080
LLM_SHIELD_API__SERVER__HOST=0.0.0.0

AWS:
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=... # For local testing only
AWS_SECRET_ACCESS_KEY=... # For local testing only

GCP:
GCP_PROJECT=llm-shield-prod
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json # For local testing

Azure:
AZURE_TENANT_ID=... # For service principal auth
AZURE_CLIENT_ID=...
AZURE_CLIENT_SECRET=...

Each cloud provider has a dedicated TOML configuration file:
- config-aws.toml: AWS configuration
- config-gcp.toml: GCP configuration
- config-azure.toml: Azure configuration

Configuration precedence (highest to lowest):
- Environment variables (prefixed with LLM_SHIELD_API__)
- TOML configuration file
- Built-in defaults
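
The precedence order above can be illustrated with a small shell sketch. This is not how LLM Shield loads its configuration internally; `resolve_setting` is a hypothetical helper, and the TOML lookup is deliberately naive (it ignores sections and trailing comments and simply takes the first matching `key = value` line).

```shell
# Sketch only: resolve one setting using the documented precedence.
# Usage: resolve_setting ENV_NAME TOML_FILE TOML_KEY DEFAULT  (hypothetical helper)
resolve_setting() {
  local env_name="$1" toml_file="$2" toml_key="$3" default="$4"
  local value=""

  # 1. Environment variable wins if it is set and non-empty.
  #    eval is used for portable indirect lookup of the variable name.
  eval "value=\${$env_name:-}"
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
    return 0
  fi

  # 2. Otherwise take the first `key = value` line from the TOML file.
  #    Naive parsing: sections are ignored and double quotes are stripped.
  if [ -f "$toml_file" ]; then
    value="$(sed -n "s/^[[:space:]]*${toml_key}[[:space:]]*=[[:space:]]*//p" "$toml_file" \
      | head -n 1 | tr -d '"')"
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  fi

  # 3. Fall back to the built-in default.
  printf '%s\n' "$default"
}
```

For example, `resolve_setting LLM_SHIELD_API__SERVER__PORT config-aws.toml port 8080` prints the environment value when the variable is exported, the file value when only the TOML file defines `port`, and `8080` otherwise.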
# Install AWS CLI
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
# Configure credentials
aws configure
# Create ECR repository
aws ecr create-repository --repository-name llm-shield-api --region us-east-1
# Create ECS cluster
aws ecs create-cluster --cluster-name llm-shield-cluster --region us-east-1
# Create IAM role (see AWS documentation)
# - ecsTaskExecutionRole (for ECS)
# - LLMShieldAPIRole (for application permissions)

# Install gcloud CLI
curl https://sdk.cloud.google.com | bash
exec -l $SHELL
# Initialize and authenticate
gcloud init
gcloud auth login
# Set project
gcloud config set project llm-shield-prod
# Enable required APIs
gcloud services enable \
run.googleapis.com \
container.googleapis.com \
secretmanager.googleapis.com \
storage.googleapis.com \
monitoring.googleapis.com \
logging.googleapis.com
# Create service account
gcloud iam service-accounts create llm-shield-api \
--display-name="LLM Shield API Service Account"
# Grant roles
gcloud projects add-iam-policy-binding llm-shield-prod \
--member="serviceAccount:llm-shield-api@llm-shield-prod.iam.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"
gcloud projects add-iam-policy-binding llm-shield-prod \
--member="serviceAccount:llm-shield-api@llm-shield-prod.iam.gserviceaccount.com" \
--role="roles/storage.objectViewer"
gcloud projects add-iam-policy-binding llm-shield-prod \
--member="serviceAccount:llm-shield-api@llm-shield-prod.iam.gserviceaccount.com" \
--role="roles/monitoring.metricWriter"

# Install Azure CLI
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
# Login
az login
# Create resource group
az group create \
--name llm-shield-rg \
--location eastus
# Create container registry
az acr create \
--resource-group llm-shield-rg \
--name llmshieldacr \
--sku Standard
# Create custom RBAC role
az role definition create \
--role-definition @../../crates/llm-shield-cloud-azure/rbac-roles/llm-shield-full-role.json
# For Container Apps: Create environment
az containerapp env create \
--name llm-shield-env \
--resource-group llm-shield-rg \
--location eastus
# For AKS: Create cluster
az aks create \
--resource-group llm-shield-rg \
--name llm-shield-aks \
--node-count 3 \
--node-vm-size Standard_D4s_v3 \
--enable-managed-identity \
--generate-ssh-keys

Never hardcode secrets in configuration files or environment variables; reference them from the cloud provider's secret store instead.
✅ Good:
jwt_secret = "${AWS_SECRET:llm-shield/jwt-secret}"
❌ Bad:
jwt_secret = "my-secret-key-123"

- Use private networking where possible (VPC, VNET)
- Enable TLS/SSL for all external endpoints
- Restrict ingress to known IP ranges
- Use security groups/firewall rules
AWS:
- Use IAM roles (not access keys)
- Enable MFA for root account
- Apply least-privilege permissions
GCP:
- Use Workload Identity (not service account keys)
- Enable Cloud IAM Recommender
- Use organization policies
Azure:
- Use Managed Identity (not service principals with secrets)
- Enable Azure AD PIM
- Apply Azure Policy
- Enable CloudTrail/Cloud Audit Logs/Activity Log
- Set up alerting for anomalous behavior
- Review access logs regularly
- Enable GuardDuty/Security Command Center/Defender
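
The rule against hardcoded secrets at the start of these practices can be spot-checked before a deployment. The following is a minimal sketch, not part of the shipped scripts: `find_hardcoded_secrets` is a hypothetical helper, it assumes secret references always use the `"${...}"` form shown in the good example, and it assumes secret-bearing keys contain `secret`, `password`, or `api_key` in their name.

```shell
# Sketch only: flag TOML lines whose key looks secret-bearing but whose
# value is a literal rather than a "${...}" reference.
# Prints offending lines as "<line number>:<line>" and returns 0 if any are found.
find_hardcoded_secrets() {
  # Key names containing secret/password/api_key are an assumed convention.
  grep -nEi '^[[:space:]]*[A-Za-z0-9_]*(secret|password|api_key)[A-Za-z0-9_]*[[:space:]]*=' "$1" \
    | grep -v '=[[:space:]]*"\${'
}
```

A pre-deploy step could then fail fast, for example: `if find_hardcoded_secrets config-aws.toml; then echo "hardcoded secret found" >&2; exit 1; fi`.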
All Kubernetes deployments (GKE, AKS) include HPA configuration:
minReplicas: 3
maxReplicas: 10
metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Cloud Run:
--min-instances 1 --max-instances 10

Azure Container Apps:
--min-replicas 1 --max-replicas 10

All deployments expose health check endpoints:
# Liveness probe
curl http://localhost:8080/health
# Readiness probe (same endpoint)
curl http://localhost:8080/health

Prometheus metrics are exposed on port 9090:
curl http://localhost:9090/metrics

Key metrics:
- http_requests_total: Total HTTP requests
- http_request_duration_seconds: Request latency
- scan_duration_seconds: Scanner execution time
- cache_hits_total: Total cache hits (used to derive the cache hit rate)
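
For a quick number without a dashboard, the counters listed above can be summed straight from the metrics endpoint. This is a rough sketch: `sum_metric` is a hypothetical helper, it assumes the standard Prometheus text exposition format without trailing timestamps, and it is meant for integer counters rather than histograms.

```shell
# Sketch only: sum every sample of one metric from Prometheus text
# exposition format read on stdin (labels, if any, are ignored).
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)           # drop the {label="..."} part
      if (metric == name) total += $NF   # last field is assumed to be the value
    }
    END { printf "%d\n", total }
  '
}
```

Usage against a running instance would look like `curl -s http://localhost:9090/metrics | sum_metric http_requests_total`.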
Structured JSON logs are sent to:
- AWS: CloudWatch Logs
- GCP: Cloud Logging
- Azure: Log Analytics
Query examples:
AWS CloudWatch Insights:
fields @timestamp, level, message
| filter level = "ERROR"
| sort @timestamp desc
GCP:
resource.type="cloud_run_revision"
severity="ERROR"
Azure (KQL):
LLMShieldAPI_CL
| where Level == "ERROR"
| order by TimeGenerated desc

1. Authentication Errors
# AWS
aws sts get-caller-identity
# GCP
gcloud auth list
gcloud auth application-default print-access-token
# Azure
az account show

2. Image Pull Errors
# AWS
aws ecr get-login-password --region us-east-1
# GCP
gcloud auth configure-docker
# Azure
az acr login --name llmshieldacr

3. Permission Errors
Check IAM roles/RBAC assignments for the compute service account.
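
As a first troubleshooting step, the identity checks listed under authentication errors can be run in one pass. The sketch below is illustrative only: `check_cloud_auth` is a hypothetical helper, it skips any CLI that is not installed, and it treats a non-zero exit code from the identity command as a failure.

```shell
# Sketch only: run one identity check per installed cloud CLI and
# report which ones are missing or failing.
check_cloud_auth() {
  local status=0 entry cli cmd
  # Each entry is "<cli>|<identity command from the troubleshooting list>".
  for entry in \
    "aws|aws sts get-caller-identity" \
    "gcloud|gcloud auth application-default print-access-token" \
    "az|az account show"; do
    cli="${entry%%|*}"
    cmd="${entry#*|}"
    if ! command -v "$cli" >/dev/null 2>&1; then
      echo "SKIP $cli: CLI not installed"
      continue
    fi
    # $cmd is intentionally unquoted so it splits into command and arguments.
    if $cmd >/dev/null 2>&1; then
      echo "OK   $cli"
    else
      echo "FAIL $cli: '$cmd' returned an error"
      status=1
    fi
  done
  return "$status"
}
```

Running `check_cloud_auth` prints one line per provider and returns non-zero if any installed CLI fails its identity check, which makes it usable as a guard at the top of a deployment script.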
- Use spot/preemptible instances for non-production workloads
- Enable autoscaling to scale down during low traffic
- Use reserved/committed use discounts for production
- Optimize container images (multi-stage builds, alpine base)
- Enable compression for API responses
- Use caching aggressively
- Monitor and rightsize resource requests/limits
- Set up CI/CD pipelines (GitHub Actions, Cloud Build, Azure Pipelines)
- Configure custom domains and SSL certificates
- Set up monitoring dashboards
- Implement backup and disaster recovery
- Configure WAF rules
- Set up staging environments
- Run performance tests and optimize based on the results
- Run security scans and compliance checks
For issues and questions:
- GitHub Issues: https://github.com/llm-shield/llm-shield-rs/issues
- Documentation: https://docs.llmshield.dev
- Email: support@llmshield.dev