This document defines code style guidelines, review criteria, project-specific rules, and preferred patterns for the VoidRunner distributed task execution platform.
VoidRunner is a distributed task execution platform designed for secure, scalable code execution. The project follows an incremental development approach through well-defined Epic milestones.
- Backend: Go + Gin framework + PostgreSQL (pgx driver)
- API: RESTful API with JWT authentication and comprehensive task management
- Database: PostgreSQL with optimized schema and cursor pagination
- Container Execution: Docker executor with comprehensive security controls
- Queue System: Redis-based task queuing with retry logic and dead letter handling
- Worker Management: Embedded worker pool with concurrency controls and health monitoring
- Testing: 80%+ code coverage with unit and integration tests
- Documentation: OpenAPI/Swagger specs with comprehensive examples
- Distributed Services: Separate API and worker services for horizontal scaling
- Frontend: Svelte + SvelteKit + TypeScript web interface
- Infrastructure: Kubernetes (GKE) deployment with microservices
- Log Streaming: Real-time log collection and streaming
- Monitoring: Real-time metrics, logging, and alerting systems
voidrunner/
├── cmd/ # Application entrypoints
│ ├── api/ # ✅ API server main (implemented)
│ ├── migrate/ # ✅ Database migration tool (implemented)
│ └── scheduler/ # ✅ Scheduler service main (implemented - for future distributed mode)
├── internal/ # Private application code
│ ├── api/ # ✅ API handlers and routes (implemented)
│ ├── auth/ # ✅ Authentication logic (implemented)
│ ├── config/ # ✅ Configuration management (implemented)
│ ├── database/ # ✅ Database layer (implemented)
│ ├── models/ # ✅ Data models (implemented)
│ ├── services/ # ✅ Business logic services (implemented)
│ ├── executor/ # ✅ Task execution engine (implemented)
│ ├── queue/ # ✅ Redis queue integration (implemented)
│ └── worker/ # ✅ Worker management (implemented)
├── pkg/ # Public libraries
│ ├── logger/ # ✅ Structured logging (implemented)
│ ├── utils/ # ✅ Shared utilities (implemented)
│ └── metrics/ # 📋 Prometheus metrics (planned - Epic 4)
├── api/ # ✅ API specifications (OpenAPI) (implemented)
├── migrations/ # ✅ Database migrations (implemented)
├── tests/ # ✅ Integration tests (implemented)
├── scripts/ # ✅ Build and deployment scripts (implemented)
├── docs/ # ✅ Documentation (implemented)
├── deployments/ # 📋 Kubernetes manifests (planned - Epic 3)
└── frontend/ # 📋 Svelte web interface (planned - Epic 3)
Epic 1: Core API Infrastructure ✅ Complete
- JWT authentication system
- Task management CRUD operations
- PostgreSQL database with pgx
- Comprehensive testing suite
- OpenAPI documentation
Epic 2: Container Execution Engine ✅ Complete
- Docker client integration with security controls
- Task execution workflow and state management
- Embedded worker pool with concurrency management
- Redis-based queue system with retry logic
- Health monitoring and cleanup mechanisms
Epic 3: Frontend Interface 📋 Planned
- Svelte project setup and architecture
- Authentication UI and user management
- Task creation and management interface
- Real-time task status updates
Epic 4: Advanced Features 📋 Planned
- Distributed services architecture (Issue #46)
- Real-time log collection and streaming (Issue #11)
- Enhanced error handling mechanisms (Issue #12)
- Collaborative features and sharing
- Advanced search and filtering
- Real-time dashboard and system metrics
- Advanced notifications and alerting
- Issue #3: PostgreSQL Database Setup and Schema Design ✅ Closed
- Issue #4: JWT Authentication System Implementation ✅ Closed
- Issue #5: Task Management API Endpoints ✅ Closed
- Issue #6: API Documentation and Testing Infrastructure ✅ Closed
- Issue #9: Docker Client Integration and Security Configuration ✅ Closed
- Issue #10: Task Execution Workflow and State Management ✅ Closed
Epic 2 Enhancements (Non-blocking improvements)
- Issue #11: Log Collection and Real-time Streaming 📋 Open (Priority 1)
- Issue #12: Error Handling and Cleanup Mechanisms 📋 Open (Priority 2)
Note: Issues #11-12 are enhancements to the completed Epic 2 functionality, not blockers. The core container execution engine with embedded workers is fully operational.
- Issue #22: Frontend Interface 📋 Open
- Issue #23: Svelte Project Setup and Architecture 📋 Open (Priority 0)
- Issue #24: Authentication UI and User Management 📋 Open (Priority 0)
- Issue #25: Task Creation and Management Interface 📋 Open (Priority 0)
- Issue #26: Real-time Task Status Updates 📋 Open (Priority 0)
- Issue #27: Real-time Features 📋 Open
- Issue #28: Real-time Dashboard and System Metrics 📋 Open (Priority 0)
- Issue #29: Advanced Notifications and Alerting 📋 Open (Priority 1)
- Issue #30: Advanced Search and Filtering 📋 Open (Priority 1)
- Issue #31: Collaborative Features and Sharing 📋 Open (Priority 2)
- Issue #46: Separate API and Worker Services for Horizontal Scaling 📋 Open
- This epic will transition from embedded workers to distributed services
- Currently tracked for future implementation when scaling requirements emerge
With Epics 1 and 2 complete, the project has a fully functional task execution platform with embedded workers. The next step is Epic 3 (Frontend Interface), which provides a web-based user interface, followed by the Epic 4 advanced features and an eventual transition to distributed services (Issue #46).
- Packages: lowercase, single words when possible (auth, database, executor)
- Functions: CamelCase for exported, camelCase for private
- Constants: ALL_CAPS for package-level constants
- Interfaces: Add "er" suffix (TaskExecutor, LogStreamer)
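The naming rules above can be seen together in one small file. This is only a sketch: `MAX_RETRY_COUNT`, `memoryStreamer`, and `newMemoryStreamer` are hypothetical names invented for illustration, and the `StreamLogs` method signature is an assumption, not the project's actual `LogStreamer` interface. Note that general Go style prefers MixedCaps for constants; ALL_CAPS here follows this document's rule.

```go
package main

import "fmt"

// MAX_RETRY_COUNT is a package-level constant, so it uses ALL_CAPS
// (hypothetical name, per this document's convention).
const MAX_RETRY_COUNT = 3

// LogStreamer is an interface, so its name carries the "er" suffix.
// The method set is an assumption made for this sketch.
type LogStreamer interface {
	StreamLogs(taskID string) []string
}

// memoryStreamer is unexported, so it uses camelCase.
type memoryStreamer struct {
	lines map[string][]string
}

// StreamLogs is exported, so it uses CamelCase.
func (m *memoryStreamer) StreamLogs(taskID string) []string {
	return m.lines[taskID]
}

// newMemoryStreamer is a private constructor, so it uses camelCase.
func newMemoryStreamer() *memoryStreamer {
	return &memoryStreamer{
		lines: map[string][]string{"task-1": {"started", "finished"}},
	}
}

func main() {
	var streamer LogStreamer = newMemoryStreamer()
	fmt.Println(len(streamer.StreamLogs("task-1")), MAX_RETRY_COUNT)
}
```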
// PREFERRED: Structured error handling with context
func (s *TaskService) CreateTask(ctx context.Context, req CreateTaskRequest) (*Task, error) {
if err := s.validateTaskRequest(req); err != nil {
return nil, fmt.Errorf("validation failed: %w", err)
}
task, err := s.repo.CreateTask(ctx, req)
if err != nil {
return nil, fmt.Errorf("failed to create task: %w", err)
}
return task, nil
}
// AVOID: Generic error messages without context
func (s *TaskService) CreateTask(req CreateTaskRequest) (*Task, error) {
task, err := s.repo.CreateTask(req)
if err != nil {
return nil, err // Too generic
}
return task, nil
}
// PREFERRED: Use pgx with prepared statements and proper error handling
func (r *TaskRepository) GetTaskByID(ctx context.Context, taskID string) (*Task, error) {
query := `
SELECT id, name, description, status, created_at, updated_at
FROM tasks
WHERE id = $1 AND deleted_at IS NULL
`
var task Task
err := r.pool.QueryRow(ctx, query, taskID).Scan(
&task.ID, &task.Name, &task.Description,
&task.Status, &task.CreatedAt, &task.UpdatedAt,
)
if err != nil {
if errors.Is(err, pgx.ErrNoRows) {
return nil, ErrTaskNotFound
}
return nil, fmt.Errorf("failed to get task %s: %w", taskID, err)
}
return &task, nil
}
// PREFERRED: Constructor pattern with interfaces
type TaskService struct {
repo TaskRepository
executor TaskExecutor
logger *slog.Logger
metrics *prometheus.Registry
}
func NewTaskService(
repo TaskRepository,
executor TaskExecutor,
logger *slog.Logger,
metrics *prometheus.Registry,
) *TaskService {
return &TaskService{
repo: repo,
executor: executor,
logger: logger,
metrics: metrics,
}
}
// PREFERRED: Always pass context as first parameter
func (s *TaskService) ExecuteTask(ctx context.Context, taskID string) error {
// Check context cancellation
select {
case <-ctx.Done():
return ctx.Err()
default:
}
// Use context in downstream calls
task, err := s.repo.GetTaskByID(ctx, taskID)
if err != nil {
return fmt.Errorf("failed to load task %s: %w", taskID, err)
}
return s.executor.Execute(ctx, task)
}
// REQUIRED: All container executions must use security constraints
func (e *DockerExecutor) Execute(ctx context.Context, task *Task) error {
containerConfig := &container.Config{
Image: e.getExecutorImage(task.Language),
User: "1000:1000", // REQUIRED: Non-root execution
WorkingDir: "/tmp/workspace",
Env: e.sanitizeEnvironment(task.Environment),
}
hostConfig := &container.HostConfig{
Resources: container.Resources{
Memory: task.MemoryLimit,
CPUQuota: task.CPUQuota,
PidsLimit: ptr(int64(128)), // REQUIRED: Limit processes
},
SecurityOpt: []string{
"no-new-privileges",
"seccomp=/opt/voidrunner/seccomp-profile.json",
},
NetworkMode: "none", // REQUIRED: No network access
ReadonlyRootfs: true, // REQUIRED: Read-only filesystem
AutoRemove: true, // REQUIRED: Automatic cleanup
}
return e.executeWithTimeout(ctx, containerConfig, hostConfig, task.Timeout)
}
// REQUIRED: Validate all user inputs
func validateTaskRequest(req CreateTaskRequest) error {
if strings.TrimSpace(req.Name) == "" {
return ErrTaskNameRequired
}
if len(req.Name) > 255 {
return ErrTaskNameTooLong
}
if !isValidLanguage(req.Language) {
return ErrUnsupportedLanguage
}
if len(req.Code) > MaxCodeSize {
return ErrCodeTooLarge
}
// Sanitize code content
if containsDangerousPatterns(req.Code) {
return ErrDangerousCodePattern
}
return nil
}
// PREFERRED: Use structured logging with context
func (s *TaskService) CreateTask(ctx context.Context, req CreateTaskRequest) (*Task, error) {
logger := s.logger.With(
"operation", "create_task",
"user_id", getUserID(ctx),
"task_name", req.Name,
)
logger.Info("creating new task")
task, err := s.repo.CreateTask(ctx, req)
if err != nil {
logger.Error("failed to create task", "error", err)
return nil, fmt.Errorf("failed to create task: %w", err)
}
logger.Info("task created successfully", "task_id", task.ID)
return task, nil
}
- DEBUG: Detailed flow information for troubleshooting
- INFO: General operational information
- WARN: Something unexpected happened but system continues
- ERROR: Error condition that needs attention
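To make the level definitions concrete, the following sketch shows how a minimum level filters records with the standard `log/slog` package. It assumes the level name arrives as a string such as `debug` or `info`; `parseLevel`, `logAtAllLevels`, and `countRecords` are hypothetical helpers, and the messages and attributes are invented.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// parseLevel converts a configured level name ("debug", "info", "warn",
// "error") into a slog.Level, falling back to INFO on unknown input.
func parseLevel(name string) slog.Level {
	var level slog.Level
	if err := level.UnmarshalText([]byte(name)); err != nil {
		return slog.LevelInfo
	}
	return level
}

// logAtAllLevels emits one record per level through a JSON handler whose
// minimum level is levelName, and returns the captured output.
func logAtAllLevels(levelName string) string {
	var buf bytes.Buffer
	handler := slog.NewJSONHandler(&buf, &slog.HandlerOptions{Level: parseLevel(levelName)})
	logger := slog.New(handler).With("operation", "execute_task")

	logger.Debug("container config resolved", "image", "example-image")
	logger.Info("task started", "task_id", "123")
	logger.Warn("retrying queue publish", "attempt", 2)
	logger.Error("task failed", "error", "exit status 1")
	return buf.String()
}

// countRecords counts newline-terminated log records in the output.
func countRecords(output string) int {
	return strings.Count(output, "\n")
}

func main() {
	fmt.Println(countRecords(logAtAllLevels("info")))  // 3: the DEBUG record is filtered out
	fmt.Println(countRecords(logAtAllLevels("debug"))) // 4: every record is written
}
```

Running at `info` drops only the DEBUG record, which is why DEBUG is suited to verbose troubleshooting detail that production output should not carry.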
// File: internal/api/task_handler_test.go
package api
import (
"context"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/mock"
"github.com/stretchr/testify/require"
)
func TestTaskHandler_CreateTask(t *testing.T) {
tests := []struct {
name string
request CreateTaskRequest
mockSetup func(*MockTaskService)
expectedStatus int
expectedError string
}{
{
name: "successful task creation",
request: CreateTaskRequest{
Name: "test-task",
Language: "python",
Code: "print('hello')",
},
mockSetup: func(m *MockTaskService) {
m.On("CreateTask", mock.Anything, mock.Anything).
Return(&Task{ID: "123"}, nil)
},
expectedStatus: 201,
},
// More test cases...
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Test implementation
})
}
}
Unit Tests (Located in internal/package/*_test.go)
- Test individual functions and methods in isolation
- Mock external dependencies (databases, Redis, HTTP clients)
- Test validation logic, business rules, and error handling
- Should run fast (< 1 second total)
- No external service dependencies
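One way to meet these criteria without a mocking library is a hand-written in-memory fake behind a narrow interface. The sketch below is illustrative only: `TaskGetter`, `fakeTaskGetter`, `taskName`, and `lookupWithFake` are hypothetical names, and the `Task` type is reduced to two fields. It is written as a runnable program rather than a `_test.go` file so that it stands alone.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrTaskNotFound mirrors the sentinel error used elsewhere in this document.
var ErrTaskNotFound = errors.New("task not found")

// Task is a reduced stand-in for the project's task model.
type Task struct {
	ID   string
	Name string
}

// TaskGetter is a hypothetical narrow interface covering only what the
// unit under test needs from the repository.
type TaskGetter interface {
	GetTaskByID(ctx context.Context, id string) (*Task, error)
}

// fakeTaskGetter replaces the database-backed repository with a map.
type fakeTaskGetter struct {
	tasks map[string]*Task
}

func (f *fakeTaskGetter) GetTaskByID(_ context.Context, id string) (*Task, error) {
	task, ok := f.tasks[id]
	if !ok {
		return nil, ErrTaskNotFound
	}
	return task, nil
}

// taskName is the unit under test: it depends only on the interface.
func taskName(ctx context.Context, repo TaskGetter, id string) (string, error) {
	task, err := repo.GetTaskByID(ctx, id)
	if err != nil {
		return "", fmt.Errorf("failed to get task %s: %w", id, err)
	}
	return task.Name, nil
}

// lookupWithFake wires the fake into the unit under test.
func lookupWithFake(id string) (string, error) {
	repo := &fakeTaskGetter{tasks: map[string]*Task{"123": {ID: "123", Name: "test-task"}}}
	return taskName(context.Background(), repo, id)
}

// isNotFound reports whether err wraps ErrTaskNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrTaskNotFound)
}

func main() {
	name, err := lookupWithFake("123")
	fmt.Println(name, err)

	_, err = lookupWithFake("missing")
	fmt.Println(isNotFound(err))
}
```

Because the fake lives entirely in memory, both the success and the not-found path run without PostgreSQL, Redis, or Docker.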
// UNIT TEST: Tests validation logic only
func TestLogConfigValidation(t *testing.T) {
invalidConfig := &LogConfig{
BufferSize: -1, // Invalid
}
err := invalidConfig.Validate()
assert.Error(t, err)
assert.Contains(t, err.Error(), "buffer_size must be positive")
}
Integration Tests (Located in tests/integration/*_test.go)
- Test interactions between multiple components
- Test with real external dependencies (PostgreSQL, Redis, Docker)
- Test system behavior under failure conditions
- Use build tag //go:build integration
- Package declaration: package integration_test
//go:build integration

package integration_test
// INTEGRATION TEST: Tests Redis dependency interaction
func TestLoggingServiceDependencies(t *testing.T) {
service, err := logging.NewRedisStreamingService(nil, config, logger)
assert.Nil(t, service)
assert.Error(t, err)
assert.Contains(t, err.Error(), "redis client is required")
}
Test Organization Rules
- Unit tests stay co-located with the package they test
- Integration tests go in tests/integration/
- Use descriptive test names: TestComponentName_Functionality
- Group related tests in the same file
- Use build tags to separate unit from integration tests
// REQUIRED: Integration tests for critical paths
func TestTaskExecution_Integration(t *testing.T) {
if testing.Short() {
t.Skip("skipping integration test")
}
// Setup test database
db := setupTestDB(t)
defer cleanupTestDB(t, db)
// Setup test containers
executor := setupTestExecutor(t)
defer cleanupTestExecutor(t, executor)
// Test execution flow
service := NewTaskService(db, executor, logger)
task, err := service.CreateTask(context.Background(), CreateTaskRequest{
Name: "integration-test",
Language: "python",
Code: "print('integration test')",
})
require.NoError(t, err)
err = service.ExecuteTask(context.Background(), task.ID)
require.NoError(t, err)
// Verify execution results
result, err := service.GetTaskResult(context.Background(), task.ID)
require.NoError(t, err)
assert.Equal(t, "completed", result.Status)
}
# REQUIRED: All deployments must specify resource limits
apiVersion: apps/v1
kind: Deployment
metadata:
  name: voidrunner-api
spec:
  template:
    spec:
      containers:
        - name: api
          image: voidrunner/api:latest
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          # REQUIRED: Security context
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 1000
            readOnlyRootFilesystem: true
# REQUIRED: All services must have health checks
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 2
- Security: No hardcoded secrets, proper input validation
- Error Handling: All errors properly wrapped with context
- Testing: Unit tests for new functionality, integration tests for critical paths
- Performance: Database queries optimized, no N+1 problems
- Logging: Structured logging with appropriate levels
- Documentation: Public functions and complex logic documented
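The error-handling criterion matters in review because only `%w` keeps the error chain intact for callers. The sketch below assumes a hypothetical three-layer call path (`repoLookup`, `serviceLookup`, `statusFor`) and an illustrative mapping to HTTP status codes; none of these names are taken from the codebase.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrTaskNotFound is the sentinel the repository layer returns.
var ErrTaskNotFound = errors.New("task not found")

// repoLookup simulates a repository miss, wrapped with context via %w.
func repoLookup(id string) error {
	return fmt.Errorf("failed to get task %s: %w", id, ErrTaskNotFound)
}

// serviceLookup adds another layer of context and preserves the chain.
func serviceLookup(id string) error {
	if err := repoLookup(id); err != nil {
		return fmt.Errorf("execute task: %w", err)
	}
	return nil
}

// flattenedLookup formats with %v, which keeps the message text but
// discards the wrapped error, so errors.Is can no longer match it.
func flattenedLookup(id string) error {
	if err := repoLookup(id); err != nil {
		return fmt.Errorf("execute task: %v", err)
	}
	return nil
}

// statusFor maps an error to an HTTP status by inspecting the chain.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrTaskNotFound):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(serviceLookup("123"))
	fmt.Println(statusFor(serviceLookup("123")), statusFor(flattenedLookup("123")))
}
```

Both lookups produce the same message text, but only the `%w` version lets the handler answer 404 instead of 500.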
- API response times: < 200ms for 95% of requests
- Database queries: < 50ms median response time
- Container startup: < 5 seconds for cold starts
- Memory usage: < 1GB per API instance
- All user inputs validated and sanitized
- Container execution with security constraints
- Secrets managed through Kubernetes secrets
- No privilege escalation in containers
- Network policies enforced
feature/issue-number-short-description
bugfix/issue-number-short-description
hotfix/issue-number-short-description
type(scope): short description
Longer description if needed
Fixes #123
Types: feat, fix, docs, style, refactor, test, chore
Scopes: api, frontend, executor, scheduler, k8s, security
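The subject-line format can be checked mechanically, for example in a commit hook. The following is a minimal sketch, assuming the scope is mandatory and that only the types and scopes listed above are allowed; `isValidCommitSubject` is a hypothetical helper, not an existing project tool.

```go
package main

import (
	"fmt"
	"regexp"
)

// commitSubject matches "type(scope): short description" using the types
// and scopes listed in this document. Treating the scope as required is
// an assumption of this sketch.
var commitSubject = regexp.MustCompile(
	`^(feat|fix|docs|style|refactor|test|chore)` +
		`\((api|frontend|executor|scheduler|k8s|security)\): \S.*$`,
)

// isValidCommitSubject reports whether the first line of a commit message
// follows the format.
func isValidCommitSubject(subject string) bool {
	return commitSubject.MatchString(subject)
}

func main() {
	fmt.Println(isValidCommitSubject("feat(api): add cursor pagination to task list")) // true
	fmt.Println(isValidCommitSubject("update stuff"))                                  // false
}
```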
- All CI checks passing
- Code coverage remains at or above 80%
- Security scan passes
- Documentation updated
- Breaking changes documented
# config/development.yaml
database:
  host: localhost
  port: 5432
  ssl_mode: disable
executor:
  timeout: 30s
  memory_limit: 512Mi
logging:
  level: debug
  format: console
# config/production.yaml
database:
  host: ${DB_HOST}
  port: 5432
  ssl_mode: require
executor:
  timeout: 3600s
  memory_limit: 1Gi
logging:
  level: info
  format: json
Testing configuration is unified between CI and local environments for consistency:
# Integration test environment variables (used by both CI and local)
TEST_DB_HOST=localhost
TEST_DB_PORT=5432
TEST_DB_USER=testuser
TEST_DB_PASSWORD=testpassword
TEST_DB_NAME=voidrunner_test
TEST_DB_SSLMODE=disable
JWT_SECRET_KEY=test-secret-key-for-integration
Key Principles:
- Unified Configuration: Same database and JWT settings for CI and local testing
- Environment Detection: CI=true used only for output formats (SARIF, coverage)
- Database Independence: Tests automatically skip when database unavailable
- Consistent Behavior: Integration tests behave identically in both environments
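A test helper following these principles might read the `TEST_DB_*` variables with the defaults shown above and probe the database before running. The sketch below is an assumption about how that could look: `envOr`, `testDatabaseAddr`, `testDatabaseURL`, and `databaseReachable` are hypothetical names, and the TCP probe is one possible availability check, not necessarily what the project's tests do.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"os"
	"time"
)

// envOr returns the value of key, or fallback when it is unset or empty.
func envOr(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// testDatabaseAddr returns host:port for the test database.
func testDatabaseAddr() string {
	return net.JoinHostPort(envOr("TEST_DB_HOST", "localhost"), envOr("TEST_DB_PORT", "5432"))
}

// testDatabaseURL builds a PostgreSQL connection URL from the TEST_DB_*
// variables, so CI and local runs resolve the same settings.
func testDatabaseURL() string {
	u := url.URL{
		Scheme:   "postgres",
		User:     url.UserPassword(envOr("TEST_DB_USER", "testuser"), envOr("TEST_DB_PASSWORD", "testpassword")),
		Host:     testDatabaseAddr(),
		Path:     "/" + envOr("TEST_DB_NAME", "voidrunner_test"),
		RawQuery: "sslmode=" + url.QueryEscape(envOr("TEST_DB_SSLMODE", "disable")),
	}
	return u.String()
}

// databaseReachable reports whether a TCP connection to the test database
// can be opened within timeout.
func databaseReachable(timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", testDatabaseAddr(), timeout)
	if err != nil {
		return false
	}
	_ = conn.Close()
	return true
}

func main() {
	fmt.Println(testDatabaseURL())
	if !databaseReachable(500 * time.Millisecond) {
		// In a real test this branch would call t.Skip instead of printing.
		fmt.Println("database unavailable: integration tests would be skipped")
	}
}
```

In an actual integration test, the reachability check would sit at the top of the test or in a shared setup helper and call `t.Skip` with a short reason.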
// Repository pattern with interfaces
type TaskRepository interface {
CreateTask(ctx context.Context, task *Task) error
GetTask(ctx context.Context, id string) (*Task, error)
UpdateTaskStatus(ctx context.Context, id string, status TaskStatus) error
}
// Service layer with dependency injection
type TaskService struct {
repo TaskRepository
exec TaskExecutor
}
// Proper context cancellation handling
func (s *Service) LongRunningOperation(ctx context.Context) error {
for {
select {
case <-ctx.Done():
return ctx.Err()
default:
// Continue processing
}
}
}
// DON'T: Global variables
var GlobalDB *sql.DB
// DON'T: Panic in library code
func ProcessTask(task *Task) {
if task == nil {
panic("task is nil") // Use error returns instead
}
}
// DON'T: Ignoring errors
result, _ := dangerousOperation() // Always handle errors
// DON'T: Magic numbers
time.Sleep(300 * time.Second) // Use named constants
// REQUIRED: Add metrics for all critical operations
var (
taskExecutionDuration = prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Name: "voidrunner_task_execution_duration_seconds",
Help: "Time spent executing tasks",
},
[]string{"task_type", "status"},
)
)
func (s *TaskService) ExecuteTask(ctx context.Context, task *Task) error {
start := time.Now()
defer func() {
duration := time.Since(start)
taskExecutionDuration.WithLabelValues(task.Language, task.Status).Observe(duration.Seconds())
}()
return s.executor.Execute(ctx, task)
}
// REQUIRED: Add tracing for complex operations
func (s *TaskService) ExecuteTask(ctx context.Context, taskID string) error {
ctx, span := tracer.Start(ctx, "TaskService.ExecuteTask")
defer span.End()
span.SetAttributes(attribute.String("task.id", taskID))
// Implementation...
return nil
}
# Setup development environment
make setup
# Start development server with auto-reload
make dev
# Run tests
make test # Unit tests (with coverage in CI)
make test-fast # Fast unit tests (short mode)
make test-integration # Integration tests
make test-all # Both unit and integration tests
# Coverage analysis
make coverage # Generate coverage report
make coverage-check # Check coverage meets 80% threshold
# Code quality
make fmt # Format code
make vet # Run go vet
make lint # Run linting (with format check in CI)
make security # Security scan
# Build and run
make build # Build API server
make run # Run API server locally
# Documentation
make docs # Generate API docs
make docs-serve # Serve docs locally
# Development tools
make install-tools # Install development tools
make clean # Clean build artifacts
# Environment Management (Docker Compose)
make dev-up # Start development environment (DB + Redis + API)
make dev-down # Stop development environment
make dev-logs # Show development logs
make dev-restart # Restart development environment
make dev-status # Show development environment status
make prod-up # Start production environment
make prod-down # Stop production environment
make prod-logs # Show production logs
make prod-restart # Restart production environment
make prod-status # Show production environment status
make env-status # Show all environment status
make docker-clean # Clean Docker resources
# Test services management (PostgreSQL + Redis)
make services-start # Start test services (Docker)
make services-stop # Stop test services
make services-reset # Reset test services (clean slate)
# Database migrations
make migrate-up # Run database migrations up
make migrate-down # Run database migrations down (rollback one)
make migrate-reset # Reset database (rollback all migrations)
make migration name=X # Create new migration file
# Dependencies and setup
make deps # Download and tidy dependencies
make deps-update # Update dependencies
make setup # Setup complete development environment
# Test services management (implemented)
make services-start # Start test services containers (PostgreSQL + Redis)
make services-stop # Stop test services containers
make services-reset # Reset test services to clean state
# Migration management (implemented)
make migrate-up # Apply all pending migrations
make migrate-down # Rollback last migration
make migrate-reset # Rollback all migrations
make migration name=add_feature # Create new migration files
# Legacy scripts (planned for Epic 2)
./scripts/backup-db.sh production # Database backup utility
./scripts/restore-db.sh backup.sql # Database restore utility
- OpenAPI specifications for all endpoints
- Include request/response examples
- Document error codes and meanings
- Rate limiting information
// TaskExecutor handles the execution of user-submitted code in secure containers.
// It manages the complete lifecycle from container creation to cleanup.
//
// Example usage:
// executor := NewDockerExecutor(client, logger)
// result, err := executor.Execute(ctx, task)
// if err != nil {
// return fmt.Errorf("execution failed: %w", err)
// }
type TaskExecutor interface {
// Execute runs the given task in a secure container environment.
// It returns the execution result or an error if execution fails.
Execute(ctx context.Context, task *Task) (*ExecutionResult, error)
}
- Use semantic versioning: v1.2.3
- Tag format: git tag -a v1.2.3 -m "Release v1.2.3"
- All tests passing
- Security scan completed
- Database migrations tested
- Rollback plan prepared
- Monitoring dashboards updated
- Documentation updated
Document Version: 1.1
Last Updated: 2025-07-10
Next Review: 2025-08-10
For questions about these guidelines, please reach out to the technical lead or create an issue in the repository.