A composable AI agent framework for Go that makes it easy to build production-ready AI applications.
Developed by Calque AI
Building AI apps in Go means wrestling with:
- Provider lock-in - Switching between OpenAI, Gemini, or local models requires rewriting code
- Conversation state - Managing chat history and context windows across requests
- Tool calling - Connecting AI to your Go functions with proper error handling
- Structured outputs - Getting reliable JSON responses that match your types
- RAG pipelines - Coordinating document retrieval, embedding, and generation
Go-Calque solves these with a simple, composable middleware pattern that feels native to Go.
```bash
go get github.com/calque-ai/go-calque
```

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/calque-ai/go-calque/pkg/calque"
	"github.com/calque-ai/go-calque/pkg/middleware/ai"
	"github.com/calque-ai/go-calque/pkg/middleware/ai/ollama"
)

func main() {
	// Initialize AI client
	client, err := ollama.New("llama3.2:3b")
	if err != nil {
		log.Fatal(err)
	}

	// Create flow and run
	flow := calque.NewFlow().Use(ai.Agent(client))

	var result string
	err = flow.Run(context.Background(), "What's the capital of France?", &result)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(result)
}
```

That's it. Three lines to set up, one line to run.
Go-Calque grows with your needs. Start simple, add capabilities as required.
```go
client, _ := ollama.New("llama3.2:1b")
convMem := memory.NewConversation()

flow := calque.NewFlow().
	Use(convMem.Input("user123")). // Store user input
	Use(ai.Agent(client)).
	Use(convMem.Output("user123")) // Store AI response

// First message
flow.Run(ctx, "My name is Alice", &response)

// Second message - AI remembers the conversation
flow.Run(ctx, "What's my name?", &response)
// Response: "Your name is Alice"
```

```go
// Create tools
calculator := tools.Simple("calculator", "Performs math calculations",
	func(jsonArgs string) string {
		var args struct {
			Expression string `json:"expression"`
		}
		json.Unmarshal([]byte(jsonArgs), &args)
		// Calculate and return result
		return evaluate(args.Expression)
	})

weather := tools.Simple("get_weather", "Gets current weather for a city",
	func(jsonArgs string) string {
		var args struct {
			City string `json:"city"`
		}
		json.Unmarshal([]byte(jsonArgs), &args)
		return fetchWeather(args.City)
	})

// Agent automatically calls tools when needed
flow := calque.NewFlow().
	Use(ai.Agent(client, ai.WithTools(calculator, weather)))

flow.Run(ctx, "What's the weather in Tokyo and what's 15% of 340?", &result)
// AI calls both tools and synthesizes the response
```

```go
type TaskAnalysis struct {
	TaskType string `json:"task_type" jsonschema:"required,description=Type of task"`
	Priority string `json:"priority" jsonschema:"required,enum=low;medium;high"`
	Hours    int    `json:"hours" jsonschema:"description=Estimated hours"`
}

flow := calque.NewFlow().
	Use(ai.Agent(client, ai.WithSchema(&TaskAnalysis{})))

var analysis TaskAnalysis
flow.Run(ctx, "Build a user authentication system",
	convert.FromJSONSchema[TaskAnalysis](&analysis))

fmt.Printf("Type: %s, Priority: %s, Hours: %d\n",
	analysis.TaskType, analysis.Priority, analysis.Hours)
```

```go
// Initialize vector store (Weaviate, Qdrant, or PGVector)
store := weaviate.New("http://localhost:8080", "Documents")

// Configure retrieval with diversity strategy
strategy := retrieval.StrategyDiverse
searchOpts := &retrieval.SearchOptions{
	Threshold: 0.7,
	Limit:     5,
	Strategy:  &strategy,
	MaxTokens: 2000,
}

// Build RAG pipeline
flow := calque.NewFlow().
	Use(retrieval.VectorSearch(store, searchOpts)). // Retrieve context
	Use(prompt.Template(`Answer based on this context:
{{.Input}}

Question: {{.Query}}`)).
	Use(ai.Agent(client))

flow.Run(ctx, "How do I configure authentication?", &result)
```

| Challenge | Raw SDK Approach | Go-Calque Approach |
|---|---|---|
| Provider switching | Rewrite API calls, handle different response formats | Change one line: ollama.New() → openai.New() |
| Conversation memory | Manual state management, serialize/deserialize | convMem.Input() / convMem.Output() middleware |
| Tool calling | Parse responses, match functions, handle errors | ai.WithTools(...) - automatic discovery & execution |
| Retries & fallbacks | Custom retry loops, fallback logic | ctrl.Retry(handler, 3), ctrl.Fallback(primary, backup) |
| Structured output | Hope the AI follows instructions, validate manually | ai.WithSchema() - guaranteed valid JSON matching your types |
| RAG pipelines | Coordinate embeddings, search, prompt building | Chain middleware: VectorSearch → Template → Agent |
| Observability | Manual metrics, logging, tracing setup | Built-in: Metrics(), Tracing(), HealthCheck() |
| Testing | Mock HTTP clients, parse responses | Test each middleware independently |
Raw OpenAI SDK:

```go
// 50+ lines: create client, build messages array, handle streaming,
// parse tool calls, execute functions, rebuild messages, retry on error,
// extract final response, handle rate limits...
```

Go-Calque:

```go
flow := calque.NewFlow().
	Use(convMem.Input(userID)).
	Use(ctrl.Retry(ai.Agent(client, ai.WithTools(myTools...)), 3)).
	Use(convMem.Output(userID))

flow.Run(ctx, userMessage, &response)
```

```go
func main() {
	client, _ := openai.New("gpt-4")
	convMem := memory.NewConversation()

	// Define tools
	searchDocs := tools.Simple("search_docs", "Search documentation", searchHandler)
	createTicket := tools.Simple("create_ticket", "Create support ticket", ticketHandler)

	// Build chatbot flow
	chatbot := calque.NewFlow().
		Use(convMem.Input("session")).
		Use(ctrl.Retry(
			ai.Agent(client, ai.WithTools(searchDocs, createTicket)),
			3,
		)).
		Use(convMem.Output("session"))

	// Handle messages
	http.HandleFunc("/chat", func(w http.ResponseWriter, r *http.Request) {
		var req ChatRequest
		json.NewDecoder(r.Body).Decode(&req)

		var response string
		if err := chatbot.Run(r.Context(), req.Message, &response); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}

		json.NewEncoder(w).Encode(ChatResponse{Message: response})
	})
}
```

```go
func main() {
	client, _ := ollama.New("llama3.2:3b")
	store := qdrant.New("localhost:6334", "knowledge_base")

	// Retrieval configuration
	strategy := retrieval.StrategyRelevant
	searchOpts := &retrieval.SearchOptions{
		Threshold: 0.75,
		Limit:     5,
		Strategy:  &strategy,
		MaxTokens: 3000,
	}

	// RAG pipeline
	ragFlow := calque.NewFlow().
		Use(retrieval.VectorSearch(store, searchOpts)).
		Use(prompt.Template(`You are a helpful assistant. Use the following context to answer questions.

Context:
{{.Input}}

Question: {{.Query}}

Answer based only on the provided context. If the answer isn't in the context, say so.`)).
		Use(ai.Agent(client))

	// API endpoint
	http.HandleFunc("/ask", func(w http.ResponseWriter, r *http.Request) {
		question := r.URL.Query().Get("q")

		var answer string
		if err := ragFlow.Run(r.Context(), question, &answer); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}

		fmt.Fprint(w, answer)
	})
}
```

```go
// Create specialized agents
mathAgent := multiagent.Route(
	ai.Agent(mathClient),
	"math",
	"Solve mathematical problems and calculations",
	"calculate,solve,math,equation")

codeAgent := multiagent.Route(
	ai.Agent(codeClient),
	"code",
	"Programming, debugging, code review",
	"code,program,debug,function")

// Router automatically selects best agent
flow := calque.NewFlow().
	Use(multiagent.Router(routerClient, mathAgent, codeAgent))

flow.Run(ctx, "What's the factorial of 10?", &result)  // Routes to mathAgent
flow.Run(ctx, "Write a bubble sort in Go", &result)    // Routes to codeAgent
```

- AI Agents: `ai.Agent(client)` - Connect to OpenAI, Gemini, Ollama, or custom providers
- Prompt Templates: `prompt.Template("Question: {{.Input}}")` - Dynamic prompt formatting
- Structured Output: `ai.WithSchema(&MyType{})` - Guaranteed JSON matching your types
- Tool Calling: `ai.WithTools(tools...)` - Automatic function discovery and execution
- Vector Search: `retrieval.VectorSearch(store, opts)` - Semantic similarity search with context building
  - Multiple context strategies: Relevant, Recent, Diverse (MMR), Summary
  - Token-limited context assembly with custom separators
  - Adaptive similarity algorithms (Cosine, Jaccard, Jaro-Winkler, Hybrid)
- Document Loading: `retrieval.DocumentLoader(sources...)` - Load documents from files and URLs
  - Glob pattern support for file paths
  - Concurrent loading with worker pools
  - Automatic metadata extraction
- Vector Store Interface: Provider-agnostic interface for multiple backends
  - Weaviate, Qdrant, and PGVector client implementations
  - Auto-embedding and external embedding provider support
  - Native diversification (MMR) and reranking capabilities
- Conversation Memory: Track chat history with configurable limits
- Context Windows: Sliding window memory management for long conversations
- Storage Backends: In-memory, Badger, or add a custom storage adapter
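The sliding-window behavior described above can be illustrated with a small, self-contained sketch; the `windowMemory` type and its methods are hypothetical names for illustration, not Go-Calque's API:

```go
package main

import "fmt"

// windowMemory keeps only the last N messages per conversation key,
// mimicking a sliding context window over chat history.
type windowMemory struct {
	limit    int
	messages map[string][]string
}

func newWindowMemory(limit int) *windowMemory {
	return &windowMemory{limit: limit, messages: make(map[string][]string)}
}

// Append stores a message and evicts the oldest once the window is full.
func (m *windowMemory) Append(key, msg string) {
	msgs := append(m.messages[key], msg)
	if len(msgs) > m.limit {
		msgs = msgs[len(msgs)-m.limit:]
	}
	m.messages[key] = msgs
}

// History returns the retained window for a conversation.
func (m *windowMemory) History(key string) []string {
	return m.messages[key]
}

func main() {
	mem := newWindowMemory(2)
	mem.Append("user123", "My name is Alice")
	mem.Append("user123", "I like Go")
	mem.Append("user123", "What's my name?")
	fmt.Println(mem.History("user123")) // only the last 2 messages remain
}
```

The same idea generalizes from message counts to token budgets: evict from the front until the window fits the model's context limit.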
- Timeouts: `ctrl.Timeout(handler, duration)` - Prevent hanging operations
- Retries: `ctrl.Retry(handler, attempts)` - Handle transient failures
- Fallbacks: `ctrl.Fallback(primary, backup)` - Graceful degradation
- Parallel Processing: `ctrl.Parallel(handlers...)` - Concurrent execution
- Chain Composition: `ctrl.Chain(handlers...)` - Sequential middleware chains
- Function Calling: Execute Go functions from AI agents
- Tool Registry: Manage and discover available functions
- Concurrent Execution: Run multiple tools in parallel
- Error Handling: Configurable behavior when tools fail
- Agent Routing: Route requests to specialized agents based on content
- Load Balancing: Distribute load across multiple agent instances
- MCP Client: Connect to MCP servers to access tools, resources, and prompts
- Multiple Transports: Stdio, SSE, and StreamableHTTP support
- Native LLM Tool Calling: MCP tools converted to native LLM format for better accuracy
- Natural Language Usage: AI-powered tool discovery and execution
  - `mcp.RegisterTools(client)` - Register available MCP tools
  - `mcp.DetectTools(client, llmClient)` - AI-powered tool selection
  - `mcp.ExtractToolParams(client, llmClient)` - Extract parameters from user input
  - `mcp.ExecuteTools(client)` - Execute detected tools
- Response Caching: `cache.Cache(handler, ttl)` - Cache handler responses with TTL
- Pluggable Backends: In-memory store or custom storage adapters
- Context Management (`calque/`): Request tracking and metadata propagation
  - MetadataBus: Thread-safe metadata sharing between concurrent middleware
    - Channel-based communication for concurrent flows
    - Set/Get operations for immutable values (trace ID, request ID)
    - Send/Receive patterns for streaming metadata between handlers
  - Context Helpers: `calque.WithTraceID`, `calque.WithRequestID` for request tracking
  - Context Propagation: Automatic metadata extraction and propagation through middleware chains
- Error Handling (`calque/`): Context-aware structured errors
  - Context-Aware Errors: `calque.WrapErr(ctx, err, msg)` and `calque.NewErr(ctx, msg)`
    - Automatic trace ID and request ID propagation
    - Full compatibility with Go's error wrapping (`errors.Is`, `errors.As`, `errors.Unwrap`)
  - Structured Error Logging: Automatic metadata enrichment with slog integration
    - Errors carry trace ID, request ID, and custom tags
    - Chainable tag methods for adding metadata
- Metrics (`observability/`): Performance and usage metrics
  - Metrics Collection: `observability.Metrics(provider, labels)` - Collect performance metrics
    - Counters: total requests, errors
    - Gauges: in-flight requests, active connections
    - Histograms: request latencies, response sizes
  - Prometheus Integration: `observability.NewPrometheusProvider()` - Export metrics to Prometheus
    - Automatic recording: request counts, latencies, error rates, in-flight requests
    - Custom labels: service name, version, environment for filtering in dashboards
    - HTTP handler: `provider.Handler()` for the `/metrics` endpoint
- Distributed Tracing (`observability/`): Track requests across services
  - Tracing Middleware: `observability.Tracing(provider, "operation-name")` - Create trace spans
    - Automatic timing, error tracking, and context propagation
    - Custom attributes: add user IDs, order IDs, or any metadata to spans
  - OTLP Support: `observability.NewOTLPTracerProvider()` - Export to OTLP backends
    - Jaeger: Popular open-source tracing backend
    - Grafana Tempo: Scalable tracing backend from Grafana
    - Any OTLP-compatible collector (Honeycomb, Datadog, New Relic)
    - Configurable sampling, batching, and TLS support
- Health Checks (`observability/`): Monitor application dependencies
  - Health Check Middleware: `observability.HealthCheck(checks...)` - Run dependency checks
    - Concurrent execution for fast response times
    - JSON reports with overall status, individual check results, uptime
  - Check Types:
    - TCP checks: `observability.TCPHealthCheck` - Verify database/cache connectivity
    - HTTP checks: `observability.HTTPHealthCheck` - Verify API endpoints
    - Custom checks: `observability.FuncHealthCheck` - Implement any health check logic
  - Health Check Registry: Dynamic registration and management of health checks
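The concurrent-execution behavior described above — run every check in parallel, aggregate into one JSON report — can be sketched with the standard library alone. `runChecks` and `checkResult` are illustrative names, not Go-Calque's types:

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// checkResult is one entry in the JSON health report.
type checkResult struct {
	Name    string `json:"name"`
	Healthy bool   `json:"healthy"`
}

// runChecks executes all checks concurrently and aggregates a report;
// the overall status is healthy only if every individual check passes.
func runChecks(checks map[string]func() bool) (bool, []checkResult) {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []checkResult
		overall = true
	)
	for name, check := range checks {
		wg.Add(1)
		go func(name string, check func() bool) {
			defer wg.Done()
			ok := check()
			mu.Lock()
			defer mu.Unlock()
			results = append(results, checkResult{Name: name, Healthy: ok})
			if !ok {
				overall = false
			}
		}(name, check)
	}
	wg.Wait()
	return overall, results
}

func main() {
	overall, results := runChecks(map[string]func() bool{
		"database": func() bool { return true },
		"cache":    func() bool { return false },
	})
	report, _ := json.Marshal(results)
	fmt.Println(overall, string(report))
}
```

Running checks concurrently means a `/health` endpoint responds in the time of the slowest check rather than the sum of all of them.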
- Logging (`calque/`, `inspect/`): Structured logging with context
  - Context-Aware Logging (`calque/`): Primary logging API with automatic metadata injection
    - `calque.LogInfo`, `calque.LogDebug`, `calque.LogWarn`, `calque.LogError` - Level-based logging helpers
    - `calque.LogWith` - Create logger with pre-attached context fields
    - `calque.LogAttr`, `calque.LogInfoAttr`, `calque.LogDebugAttr`, etc. - Type-safe slog.Attr logging
    - Automatic trace_id and request_id appending from context
    - Integration with slog for structured output
    - `calque.WithLogger` for custom logger configuration
    - Level-enabled checks for performance optimization
  - Data Flow Inspection (`inspect/`): Middleware for inspecting data streams in flows
    - `inspect.Print(prefix)` - Log complete input content
    - `inspect.Head(prefix, bytes)` - Log first N bytes for streaming preview
    - `inspect.Chunks(prefix, size)` - Log streaming data in fixed-size chunks
    - `inspect.HeadTail(prefix, headBytes, tailBytes)` - Log beginning and end of streams
    - `inspect.Timing(prefix, handler)` - Measure handler execution time and throughput
    - `inspect.Sampling(prefix, numSamples, sampleSize)` - Distributed sampling across streams
Transform structured data at flow boundaries:
Input Converters (prepare data for processing):
```go
convert.ToJSON(struct)       // Struct → JSON stream
convert.ToYAML(struct)       // Struct → YAML stream
convert.ToJSONSchema(struct) // Struct + schema → stream (for AI context)
convert.ToProtobuf(msg)      // Proto message → binary stream
convert.ToSSE(data)          // Data → Server-Sent Events stream
```

Output Converters (parse results):

```go
convert.FromJSON(&result)       // JSON stream → struct
convert.FromYAML(&result)       // YAML stream → struct
convert.FromJSONSchema(&result) // JSON stream → struct (validates against schema)
convert.FromProtobuf(&result)   // Binary stream → proto message
```

Go-Calque brings HTTP middleware patterns to AI and data processing. Instead of handling HTTP requests, you compose flows where each middleware processes data through io.Pipe connections.
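The io.Pipe wiring can be sketched with only the standard library. The `upperPipeline` helper below is hypothetical — it simply shows how stages connected by pipes and goroutines stream data to each other the way flow middleware does:

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// upperPipeline streams input lines through two io.Pipe-connected stages,
// each running in its own goroutine, and returns the final output.
func upperPipeline(lines []string) string {
	pr1, pw1 := io.Pipe()
	pr2, pw2 := io.Pipe()

	// Stage 1: source writes lines into the first pipe.
	go func() {
		defer pw1.Close()
		for _, line := range lines {
			fmt.Fprintln(pw1, line)
		}
	}()

	// Stage 2: transform each line as it arrives and forward it,
	// never holding the whole input in memory.
	go func() {
		defer pw2.Close()
		scanner := bufio.NewScanner(pr1)
		for scanner.Scan() {
			fmt.Fprintln(pw2, strings.ToUpper(scanner.Text()))
		}
	}()

	// Stage 3: sink reads the final stream.
	out, _ := io.ReadAll(pr2)
	return string(out)
}

func main() {
	fmt.Print(upperPipeline([]string{"hello", "world"})) // HELLO\nWORLD\n
}
```

Because io.Pipe blocks the writer until the reader consumes, backpressure falls out of the design for free: no stage can outrun its downstream consumer.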
```mermaid
flowchart TB
    subgraph subGraph0["Streaming Pipeline<br>io.Pipe<br>Goroutines"]
        D["Middleware 1<br>goroutine"]
        C["Input"]
        E["Middleware 2<br>AI Agent<br>goroutine"]
        F["Middleware 3<br>goroutine"]
        G["Output"]
    end
    subgraph subGraph1["AI Agent Processing"]
        H{"Tool Calling<br>Required?"}
        I["Tools Execute"]
        J["Direct Response"]
        K["Response Synthesis"]
        L["Stream Output"]
    end
    subgraph subGraph2["Middleware Processing Modes"]
        M(["Streaming<br>Real-time processing<br>as data arrives"])
        N(["Buffered<br>Read all data first<br>for complex processing"])
    end
    subgraph subGraph3["LLM Providers"]
        O["OpenAI"]
        P["Ollama"]
        Q["Gemini"]
    end
    A["User Application"] --> B["Calque Flow"]
    B --> C
    C --> D
    D -- "io.Pipe" --> E
    E -- "io.Pipe" --> F
    F --> G
    E --> H & O & P & Q
    H -- Yes --> I
    H -- No --> J
    I --> K
    J --> L
    K --> L
    D -.-> M
    E -.-> N
    F -.-> M
    A:::inputOutput
    B:::inputOutput
    C:::inputOutput
    D:::middleware
    E:::aiCore
    F:::middleware
    G:::inputOutput
    H:::decision
    I:::decision
    J:::decision
    K:::decision
    L:::decision
    M:::modes
    N:::modes
    O:::llmProvider
    P:::llmProvider
    Q:::llmProvider
    %% Dark Mode Styles
    classDef inputOutput fill:#1e293b,stroke:#93c5fd,stroke-width:2px,color:#f9fafb
    classDef middleware fill:#0f172a,stroke:#38bdf8,stroke-width:2px,color:#e0f2fe
    classDef aiCore fill:#14532d,stroke:#22c55e,stroke-width:2px,color:#bbf7d0
    classDef llmProvider fill:#7c2d12,stroke:#f97316,stroke-width:2px,color:#ffedd5
    classDef modes fill:#312e81,stroke:#8b5cf6,stroke-width:2px,color:#ede9fe
    classDef decision fill:#4c0519,stroke:#f43f5e,stroke-width:2px,color:#ffe4e6
```
🔄 Streaming Pipeline: Input → Middleware1 → Middleware2 → Middleware3 → Output connected by io.Pipe with each middleware running in its own goroutine
⚡ Concurrent Execution: Each middleware runs in its own goroutine with automatic backpressure handling
📊 Middleware Processing Modes:
- Streaming: Real-time processing as data arrives (no buffering)
- Buffered: Reads all data first for complex processing when needed
🔗 Context Propagation: Cancellation and timeouts flow through the entire chain
- Memory Efficient: Constant memory usage regardless of input size
- Real-time Processing: Responses begin immediately, no waiting for full datasets
- True Concurrency: Each middleware runs in its own goroutine
- Go-Idiomatic: Built with Go conventions using `io.Reader`/`io.Writer`
Create your own middleware by implementing `calque.HandlerFunc`:
```go
// Custom middleware that adds timestamps (BUFFERED - reads all input first)
func AddTimestamp(prefix string) calque.HandlerFunc {
	return func(req *calque.Request, res *calque.Response) error {
		// Read input using the Read helper
		var input string
		if err := calque.Read(req, &input); err != nil {
			return err
		}

		// Transform data
		timestamp := time.Now().Format("2006-01-02 15:04:05")
		output := fmt.Sprintf("[%s %s] %s", prefix, timestamp, input)

		// Write output using the Write helper
		return calque.Write(res, output)
	}
}

// Usage
flow := calque.NewFlow().
	Use(AddTimestamp("LOG")).
	Use(text.Transform(strings.ToUpper))
```

For processing large data streams without buffering:
```go
func StreamingProcessor() calque.HandlerFunc {
	return func(req *calque.Request, res *calque.Response) error {
		// Process data line by line
		scanner := bufio.NewScanner(req.Data)
		for scanner.Scan() {
			line := scanner.Text()
			processed := fmt.Sprintf("PROCESSED: %s\n", line)
			if _, err := res.Data.Write([]byte(processed)); err != nil {
				return err
			}
		}
		return scanner.Err()
	}
}
```

```go
// For high-traffic scenarios, limit goroutine creation
config := calque.FlowConfig{
	MaxConcurrent: calque.ConcurrencyAuto, // Auto-scales with CPU cores
	CPUMultiplier: 10,                     // 10x GOMAXPROCS
}

flow := calque.NewFlow(config).
	Use(ai.Agent(client))
```

```go
flow := calque.NewFlow().
	Use(ctrl.Retry(ai.Agent(client), 3)).
	Use(ctrl.Fallback(
		ai.Agent(primaryClient),
		ai.Agent(backupClient),
	))
```

```go
// Build reusable sub-flows
textPreprocessor := calque.NewFlow().
	Use(text.Transform(strings.TrimSpace)).
	Use(text.Transform(strings.ToLower))

textAnalyzer := calque.NewFlow().
	Use(text.Transform(func(s string) string {
		wordCount := len(strings.Fields(s))
		return fmt.Sprintf("TEXT: %s\nWORDS: %d", s, wordCount)
	}))

// Compose sub-flows into main flow
mainFlow := calque.NewFlow().
	Use(textPreprocessor).
	Use(text.Branch(
		func(s string) bool { return len(s) > 50 },
		textAnalyzer,
		text.Transform(func(s string) string { return s + " [SHORT]" }),
	))
```

```go
// Expose your flow as an HTTP endpoint
http.HandleFunc("/chat", func(w http.ResponseWriter, r *http.Request) {
	var req ChatRequest
	json.NewDecoder(r.Body).Decode(&req)

	var response string
	flow.Run(r.Context(), req.Message, &response)

	json.NewEncoder(w).Encode(ChatResponse{Message: response})
})
```

```go
http.HandleFunc("/stream", func(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")

	sseConverter := convert.ToSSE(w, userID).
		WithChunkMode(convert.SSEChunkByWord)

	flow.Run(r.Context(), message, sseConverter)
})
```

Go-Calque's optimized middleware composition delivers both performance and memory efficiency. Benchmarks from our anagram processing example show:
| Configuration | Dataset | Algorithm | Time (ns/op) | Memory (B/op) | Allocations | Time Improvement | Memory Improvement |
|---|---|---|---|---|---|---|---|
| VirtualApple @ 2.50GHz, darwin/amd64 | Small (29 words) | Baseline | 69,377 | 76,736 | 685 | - | - |
| | | Go-Calque | 51,964 | 32,343 | 479 | 25% faster | 58% less |
| | Large (1000 words) | Baseline | 4,232,972 | 4,011,708 | 33,990 | - | - |
| | | Go-Calque | 523,240 | 469,156 | 9,574 | 88% faster | 88% less |
| linux/amd64 x86_64 | Small (29 words) | Baseline | 51,617 | 76,736 | 685 | - | - |
| | | Go-Calque | 59,473 | 32,361 | 430 | 15% slower | 58% less |
| | Large (1000 words) | Baseline | 3,105,624 | 4,011,673 | 33,990 | - | - |
| | | Go-Calque | 537,898 | 469,359 | 5,489 | 83% faster | 88% less |
Performance Principle: Well-designed middleware composition outperforms hand-coded algorithms while remaining maintainable and composable.
Run the benchmarks: `cd examples/anagram && go test -bench=.`
- Tool Calling - ✅ Function execution for AI agents
- Information Retrieval - ✅ Vector search, ✅ context building, ✅ semantic filtering
- Multi-Agent Collaboration - 🔲 Agent selection, ✅ load balancing, ✅ conditional routing
- Guardrails & Safety - 🔲 Input filtering, 🔲 output validation, ✅ schema compliance
- HTTP/API Integration - ✅ streaming responses
- Model Context Protocol - ✅ MCP client, ✅ natural language tools, ✅ StreamableHTTP
- Observability - ✅ Context management (MetadataBus), ✅ Error handling (context-aware errors), ✅ Metrics (Prometheus), ✅ Distributed Tracing (OTLP), ✅ Health Checks, ✅ Structured logging

- Enhanced Memory - 🔲 Vector-based semantic memory retrieval
- Advanced Agents - 🔲 Planning, 🔲 reflection, 🔲 self-evaluation capabilities
- Additional Providers - 🔲 Anthropic/Claude support

- Core Framework: ✅ basics, ✅ converters, ✅ converters-jsonschema, ✅ streaming-chats
- Data Processing: ✅ memory, ✅ batch-processing, ✅ flow-composition
- AI Agents: ✅ tool-calling, ✅ retrieval, 🔲 multi-agent-workflow, 🔲 guardrails-validation
- Advanced: ✅ web-api-agent, 🔲 human-in-the-loop

- Batch Processing - 🔲 Splitters, 🔲 aggregators, ✅ parallel processors
- State Management - 🔲 State machines, 🔲 checkpoints, ✅ conditional flows
- Agent2Agent Protocol - 🔲 A2A server, 🔲 examples
- Fork the repository
- Create a feature branch
- Add tests for new middleware
- Submit a pull request
Thanks to all contributors who are helping to make Go-Calque better.
Mozilla Public License 2.0 - see LICENSE file for details.
