Everything you commit here is public on GitHub. Treat the repo as a public artifact at all times — feature flags don't hide source, "internal" naming doesn't hide source, and removing files later doesn't unpublish git history. Don't commit anything you wouldn't post publicly.
All [dependencies] in Cargo.toml and all imports in source must resolve from public sources (crates.io, this repo, or other open-source repos).
Thanks for contributing! If you're not sure whether something fits, open a discussion before submitting.
Always follow the coding guidelines in AGENTS.md — it covers error handling, testing, imports, and all Rust conventions for this project.
- First principles thinking — ask "why" until you truly understand the problem
- Root cause analysis — don't patch symptoms, find and fix the real bug
- No hacks — implement things properly, even if it takes longer
- Quality over speed — write code your future self will thank you for
- Keep it simple — the best code is the code you don't write
- Leave code better than you found it — improve something small in every change
- Explicit over implicit — if it's not obvious, make it obvious
- Fail fast, fail loud — surface errors immediately, don't swallow them
Use workspace = true for all dependencies defined in the workspace Cargo.toml. Never duplicate versions.
Place all use statements at the top of the file or module. Never scatter imports throughout the code.
// Good: All imports at the top
use std::collections::HashMap;
use tokio::sync::RwLock;
use tracing::{info, warn};
async fn do_work() {
// ...
}
// Bad: Import in the middle of the code
async fn do_work() {
use std::collections::HashMap; // Don't do this
// ...
}Avoid repeating the same module path multiple times. Group related imports together.
// Good
use alien_core::{Platform, ResourceType, StackConfig};
// Bad
use alien_core::Platform;
use alien_core::ResourceType;
use alien_core::StackConfig;Always use camelCase for JSON serialization and document fields:
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct DeploymentConfig {
/// Unique identifier for the deployment
pub deployment_id: String,
/// Target environment (production, staging, etc.)
pub target_environment: String,
}Use the bon crate's builder pattern for functions or structs with more than a few parameters instead of long parameter lists.
| Purpose | Crate | Notes |
|---|---|---|
| Time & Date | chrono |
Never use time crate |
| Logging | tracing |
Use info!, warn!, error!, debug! macros |
| Serialization | serde |
Always with rename_all = "camelCase" |
| Async Runtime | tokio |
Our standard async runtime |
| Error Handling | alien-error |
Our custom error library (see below) |
Use Conventional Commits: <type>: <short summary>
Types: feat, fix, refactor, chore, docs, test, ci, perf
- Imperative mood, lowercase, no period, max ~72 chars
- Body (optional): explain why, not what; wrap at 72 chars
- Breaking changes: add
!after type (e.g.feat!: remove v1 API)
- Branch from
main; use<type>/<short-description>(e.g.feat/azure-containers,fix/duplicate-cleanup) - Keep branches short-lived; rebase on
mainbefore merging
- Never force-push to
main - Push feature branches and open a PR for review
- Squash-merge PRs into
main
We use our custom alien-error library instead of the popular alternatives because:
-
Structured metadata: Every error carries machine-readable
code,retryable,internal, andhttp_status_codefields—essential for API responses and automated retry logic. -
Error chaining with context: Like
anyhow, we support chaining errors, but each layer preserves structured metadata, not just string messages. -
API-ready: Errors serialize to JSON with consistent schema, and the
internalflag controls whether sensitive details are exposed to external clients. -
Inheritance: The
retryableandinternalflags can be inherited from source errors using"inherit", so wrapper errors don't need to guess.
There are two distinct aspects to working with errors:
- Designing errors - Defining error enums for your crate/module
- Using errors - Propagating and creating errors in your code
Each crate should have an error.rs file that defines its error types.
-
Name variants after the problem, not dependencies Variants should reflect domain-level issues, not internal implementation details or external libraries.
-
Embed actionable, self-contained data Include sufficient context directly in variants so callers don't need to downcast or guess. Almost always include a human-readable
messagefield. -
Provide useful context for recovery Include fields like paths, URLs, IDs, or parameters to assist in debugging and handling errors effectively.
-
Use a generic catch-all variant when needed If there are too many possible errors, provide a generic variant (e.g.,
OtherorInfrastructureError) for unexpected or uncommon cases. -
Always document each variant concisely Include brief, one-line doc comments to help maintainers understand each error.
-
Set appropriate metadata Use the
code,message,retryable,internal, and optionallyhttp_status_codeattributes.
Every variant must specify:
| Attribute | Description |
|---|---|
code |
Machine-readable identifier (e.g., "RESOURCE_NOT_FOUND") |
message |
Human-readable template with field interpolation |
retryable |
"true", "false", or "inherit" |
internal |
"true", "false", or "inherit" |
http_status_code |
(Optional) HTTP status code, defaults to 500 |
use alien_error::AlienErrorData;
use serde::{Deserialize, Serialize};
/// Errors related to deployment operations.
#[derive(Debug, Clone, AlienErrorData, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub enum ErrorData {
/// Deployment target configuration is invalid.
#[error(
code = "DEPLOYMENT_TARGET_INVALID",
message = "Deployment target '{target_id}' is invalid: {reason}",
retryable = "false",
internal = "false",
http_status_code = 400
)]
DeploymentTargetInvalid {
/// The target ID that failed validation
target_id: String,
/// Specific reason why the target is invalid
reason: String,
},
/// Cloud provider API call failed.
#[error(
code = "CLOUD_API_FAILED",
message = "Cloud API call failed: {message}",
retryable = "inherit", // Inherit from source error
internal = "inherit"
)]
CloudApiFailed {
/// Human-readable description of the failure
message: String,
/// The cloud provider involved
provider: String,
},
/// Resource was not found.
#[error(
code = "RESOURCE_NOT_FOUND",
message = "Resource '{resource_id}' not found",
retryable = "false",
internal = "false",
http_status_code = 404
)]
ResourceNotFound {
/// ID of the missing resource
resource_id: String,
},
/// Generic catch-all for uncommon errors.
#[error(
code = "DEPLOYMENT_ERROR",
message = "Deployment operation failed: {message}",
retryable = "true",
internal = "true"
)]
Other {
/// Human-readable description
message: String,
},
}
/// Convenient type alias for this module's Result type.
pub type Result<T> = alien_error::Result<T, ErrorData>;Use retryable = "inherit" or internal = "inherit" when your error wraps another error and should preserve its characteristics:
/// Wraps cloud platform errors without losing their retry/internal semantics.
#[error(
code = "CLOUD_PLATFORM_ERROR",
message = "Cloud platform operation failed: {message}",
retryable = "inherit",
internal = "inherit"
)]
CloudPlatformError {
message: String,
},When there's no source error to wrap:
use alien_error::AlienError;
fn validate_input(input: &str) -> Result<()> {
if input.is_empty() {
return Err(AlienError::new(ErrorData::InvalidInput {
message: "Input cannot be empty".to_string(),
field_name: Some("input".to_string()),
}));
}
Ok(())
}When propagating errors from other Alien crates, use .context():
use alien_error::Context;
async fn deploy_function(&self, config: &FunctionConfig) -> Result<()> {
self.cloud_client
.create_function(config)
.await
.context(ErrorData::CloudApiFailed {
message: format!("Failed to create function '{}'", config.name),
provider: "AWS".to_string(),
})?;
Ok(())
}For errors from non-Alien crates (std, tokio, serde, etc.), first convert with .into_alien_error():
use alien_error::{Context, IntoAlienError};
async fn read_config(path: &Path) -> Result<Config> {
let contents = tokio::fs::read_to_string(path)
.await
.into_alien_error() // Convert std::io::Error to AlienError
.context(ErrorData::FileOperationFailed {
operation: "read".to_string(),
file_path: path.display().to_string(),
reason: "Failed to read configuration file".to_string(),
})?;
serde_json::from_str(&contents)
.into_alien_error() // Convert serde_json::Error to AlienError
.context(ErrorData::ConfigParseError {
message: "Invalid JSON in configuration file".to_string(),
})
}Important: Only use
.into_alien_error()for third-party errors. Errors from Alien crates already implement the correct traits—just use.context()directly.
Don't repeat what the error template already says:
// Bad: Redundant message
.context(ErrorData::CloudApiFailed {
message: "Cloud API call failed".to_string(), // Template already says this!
provider: "AWS".to_string(),
})
// Good: Specific details
.context(ErrorData::CloudApiFailed {
message: "CreateFunction returned 403 Forbidden".to_string(),
provider: "AWS".to_string(),
})When a required condition cannot be met, return an error immediately.
// Bad: Silent fallback hides failures
fn get_config_value(key: &str) -> String {
match config.get(key) {
Some(value) => value.clone(),
None => {
warn!("Config key '{}' not found, using default", key);
"default".to_string() // Silently proceeding with wrong value
}
}
}
// Bad: Logging and continuing
async fn sync_resource(&self) -> Result<()> {
if let Err(e) = self.cloud_client.update().await {
warn!("Failed to update resource: {}", e);
// Continues execution as if nothing happened!
}
Ok(())
}// Good: Fail immediately when required value is missing
fn get_config_value(key: &str) -> Result<String> {
config.get(key)
.cloned()
.ok_or_else(|| AlienError::new(ErrorData::ConfigKeyMissing {
key: key.to_string(),
}))
}
// Good: Propagate errors properly
async fn sync_resource(&self) -> Result<()> {
self.cloud_client
.update()
.await
.context(ErrorData::SyncFailed {
message: "Failed to update resource".to_string(),
})?;
Ok(())
}Use warn! only for non-critical issues that don't affect correctness:
// Appropriate: Optional optimization failed, core functionality works
if let Err(e) = cache.invalidate(&key).await {
warn!("Failed to invalidate cache for '{}': {}. Proceeding without cache.", key, e);
}
// Core operation continues and will work correctly, just slowerDon't implement retry logic inside functions that can fail. Let the caller decide:
// Bad: Hidden retry logic
async fn fetch_data(&self) -> Result<Data> {
for attempt in 1..=3 {
match self.client.get().await {
Ok(data) => return Ok(data),
Err(e) => {
warn!("Attempt {} failed: {}", attempt, e);
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
}
Err(AlienError::new(ErrorData::FetchFailed { ... }))
}
// Good: Return error, let caller handle retries
async fn fetch_data(&self) -> Result<Data> {
self.client
.get()
.await
.context(ErrorData::FetchFailed { ... })
}The retry logic belongs at a higher level (e.g., the executor, job runner, or API handler) where retry policies can be configured consistently.
Tests must be strict. Assert as much as possible—the test should fail if the entire flow didn't pass.
// Bad: Using warn! without failing
#[tokio::test]
async fn test_deployment() {
let result = deploy().await;
if let Err(e) = result {
warn!("Deployment failed: {}", e); // Test passes even on failure!
}
}
// Bad: Conditional skipping
#[tokio::test]
async fn test_resource_creation() {
let resource = create_resource().await.unwrap();
if resource.status == "ready" { // What if it's not ready?
assert_eq!(resource.name, "expected");
}
// Test passes even if status wasn't "ready"!
}
// Bad: Swallowing errors silently
#[tokio::test]
async fn test_cleanup() {
let _ = delete_resource().await.ok(); // Silently ignores failures
// or
let result = operation().await.unwrap_or_default(); // Hides errors
}// Good: Assert everything
#[tokio::test]
async fn test_deployment() {
let result = deploy().await.expect("deployment should succeed");
assert_eq!(result.status, DeploymentStatus::Complete);
assert_eq!(result.resources.len(), 3);
assert!(result.url.starts_with("https://"));
}
// Good: Explicit assertions for all branches
#[tokio::test]
async fn test_resource_creation() {
let resource = create_resource().await.expect("resource creation should succeed");
assert_eq!(resource.status, "ready", "resource should be ready immediately");
assert_eq!(resource.name, "expected");
}
// Good: Test error cases explicitly
#[tokio::test]
async fn test_invalid_input_fails() {
let result = create_resource_with_invalid_input().await;
let error = result.expect_err("should fail with invalid input");
assert_eq!(error.code, "INVALID_INPUT");
}- Arrange: Set up test fixtures and mock services
- Act: Perform the operation being tested
- Assert: Verify ALL expected outcomes
#[tokio::test]
async fn test_function_deployment() {
// Arrange
let mock_provider = MockServiceProvider::new();
mock_provider.expect_create_function().returning(|_| Ok(created_function()));
let executor = SingleControllerExecutor::builder()
.resource(function_resource())
.controller(AwsFunctionController::default())
.service_provider(mock_provider)
.build()
.await
.expect("executor should build");
// Act
executor.run_until_terminal().await.expect("execution should complete");
// Assert
assert_eq!(executor.status(), ResourceStatus::Running);
assert!(executor.controller().function_arn.is_some());
assert!(executor.controller().role_arn.is_some());
}When a test fails or behavior is unexpected, find the real root cause. Don't patch symptoms.
// Bad: Workaround that hides the real issue
#[tokio::test]
async fn test_sync() {
let result = sync().await;
// "Sometimes this fails, just retry"
let result = if result.is_err() { sync().await } else { result };
assert!(result.is_ok());
}
// Good: Investigate why it fails sometimes
// - Is there a race condition?
// - Is the mock not set up correctly?
// - Is the timeout too short?
// Fix the actual issue, don't hide it.- Reproduce consistently: If a test is flaky, make it fail reliably first
- Check assumptions: Verify mocks are configured correctly
- Add logging: Use
tracingto understand the flow - Isolate the issue: Create a minimal reproduction
- Fix the root cause: Don't add retries or sleeps as fixes