An Observability Data Lake engine designed to be fast, easy-to-use, cost-effective, scalable, and fault-tolerant.
IceGate supports open protocols and APIs compatible with standard ingestion and query tools. All data is persisted in Object Storage, including WAL for ingested data, catalog metadata, and the data layer.
- Highly Scalable: Scale compute resources independently based on workload demands of specific components
- ACID Transactions: Full transaction support without requiring a dedicated OLTP database
- Exactly-Once Delivery: Reliable data ingestion with no data loss or duplication
- Real-Time Queries: Access live data through WAL while maintaining historical query capabilities
- Open Standards: Built on Apache Iceberg, Arrow, and Parquet with OpenTelemetry protocol support
- Cost-Effective: Object storage-based architecture minimizes infrastructure costs
- Fault-Tolerant: Designed for resilience and high availability
IceGate employs a compute-storage separation architecture, allowing independent scaling of processing and storage resources. This design enables cost-effective scaling where compute resources (Ingest, Query, Maintain, Alert) can be scaled independently based on workload demands, while all data resides in object storage.
The system consists of five core components for handling observability data (metrics, traces, logs, and events):
- Technology: Apache Iceberg
- Purpose: Organizes the data lake with ACID transaction support
- Key Feature: Custom catalog implementation that doesn't require a dedicated OLTP database while still supporting transactions
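
This README does not describe how the custom catalog commits transactions. One common way to get atomic commits on object storage without an OLTP database is optimistic concurrency: a writer replaces the metadata pointer only if it is still at the version the writer read. The sketch below illustrates that idea with an in-memory stand-in; `ObjectStore`, `put_if_version`, and the key name are hypothetical and are not IceGate's API.

```rust
use std::collections::HashMap;

/// In-memory stand-in for an object store with conditional writes.
/// Hypothetical: real stores expose this through version or ETag preconditions.
#[derive(Default)]
struct ObjectStore {
    // key -> (version, contents)
    objects: HashMap<String, (u64, String)>,
}

#[derive(Debug, PartialEq)]
enum CommitError {
    /// Another writer committed first; the caller should re-read and retry.
    Conflict { current_version: u64 },
}

impl ObjectStore {
    /// Returns the current version and contents of `key`, if present.
    fn get(&self, key: &str) -> Option<(u64, String)> {
        self.objects.get(key).cloned()
    }

    /// Writes `value` only if the stored version equals `expected`
    /// (0 means "the key must not exist yet"). Returns the new version.
    fn put_if_version(
        &mut self,
        key: &str,
        expected: u64,
        value: &str,
    ) -> Result<u64, CommitError> {
        let current = self.objects.get(key).map(|(v, _)| *v).unwrap_or(0);
        if current != expected {
            return Err(CommitError::Conflict { current_version: current });
        }
        let next = current + 1;
        self.objects.insert(key.to_string(), (next, value.to_string()));
        Ok(next)
    }
}

fn main() {
    let mut store = ObjectStore::default();
    let key = "catalog/logs/metadata.json"; // hypothetical key

    // The first commit creates the metadata pointer.
    let v1 = store.put_if_version(key, 0, "snapshot-1").unwrap();

    // Two writers both read version v1, then race to commit.
    let a = store.put_if_version(key, v1, "snapshot-2a").unwrap();
    println!("writer A committed version {a}");
    match store.put_if_version(key, v1, "snapshot-2b") {
        Ok(v) => println!("writer B committed version {v}"),
        Err(CommitError::Conflict { current_version }) => {
            println!("writer B must retry; catalog is at version {current_version}")
        }
    }
    println!("final state: {:?}", store.get(key));
}
```

Exactly one of the two racing writers succeeds; the other observes a conflict and can rebase its changes on the new snapshot before retrying.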
- Protocol: OpenTelemetry
- Purpose: Accept observability data and persist it in Object Storage
- Implementation: WAL using Parquet files organized in a special way to be compatible with the Storage data layer
- Delivery Guarantee: Exactly-once delivery
- Note: WAL files can be used by the Query layer to provide real-time data access
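
Exactly-once delivery is usually built from at-least-once retries plus idempotent writes. The following is a minimal sketch of that pattern, assuming each client tags its batches with a producer ID and a monotonically increasing sequence number. These fields, and the `Wal` type itself, are hypothetical illustrations rather than IceGate's actual design.

```rust
use std::collections::HashMap;

/// One batch of records sent by a client. `producer_id` and `sequence`
/// are hypothetical fields used only to detect retried batches.
#[derive(Debug, Clone)]
struct Batch {
    producer_id: String,
    sequence: u64,
    records: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum AppendOutcome {
    Appended,
    Duplicate,
}

#[derive(Default)]
struct Wal {
    /// Batches in commit order (stand-in for Parquet files in object storage).
    segments: Vec<Vec<String>>,
    /// Highest sequence number committed per producer.
    last_committed: HashMap<String, u64>,
}

impl Wal {
    /// Appends a batch unless its sequence number was already committed.
    fn append(&mut self, batch: Batch) -> AppendOutcome {
        if let Some(&last) = self.last_committed.get(&batch.producer_id) {
            if batch.sequence <= last {
                // A retry of a batch that was already persisted: acknowledge, do not store.
                return AppendOutcome::Duplicate;
            }
        }
        self.last_committed.insert(batch.producer_id, batch.sequence);
        self.segments.push(batch.records);
        AppendOutcome::Appended
    }

    fn record_count(&self) -> usize {
        self.segments.iter().map(Vec::len).sum()
    }
}

fn main() {
    let mut wal = Wal::default();
    let batch = Batch {
        producer_id: "collector-1".into(),
        sequence: 1,
        records: vec!["log a".into(), "log b".into()],
    };

    // The client sends the batch, misses the acknowledgement, and retries.
    println!("first attempt: {:?}", wal.append(batch.clone()));
    println!("retry: {:?}", wal.append(batch));
    println!("records stored: {}", wal.record_count());
}
```

The retry is acknowledged but not stored again, so the client sees success while the WAL holds each record once.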
- Technology: Apache DataFusion and Apache Arrow
- Purpose: Query engine for processing logs, metrics, traces, and events
- Implementation: Rust-native query engine built on Apache Arrow, providing a foundation to build query engines using various protocols
- Data Format: Apache Parquet
- Features:
  - Statistics and bloom filters for efficient querying
  - Additional custom statistics and filters for optimization
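
To illustrate why statistics make queries efficient, here is a small sketch of min/max pruning: files whose recorded value range cannot overlap the query range are skipped without being read. Bloom filters play a similar role for equality predicates. The `FileStats` struct and its field names are assumptions made for this example, not IceGate types.

```rust
/// Per-file column statistics, similar to what a Parquet footer records.
/// Field names are hypothetical.
struct FileStats {
    path: &'static str,
    min_ts: i64,
    max_ts: i64,
}

/// Returns the files whose [min_ts, max_ts] range overlaps [from, to].
fn prune(files: &[FileStats], from: i64, to: i64) -> Vec<&FileStats> {
    files
        .iter()
        .filter(|f| f.max_ts >= from && f.min_ts <= to)
        .collect()
}

fn main() {
    let files = [
        FileStats { path: "part-0.parquet", min_ts: 0, max_ts: 9 },
        FileStats { path: "part-1.parquet", min_ts: 10, max_ts: 19 },
        FileStats { path: "part-2.parquet", min_ts: 20, max_ts: 29 },
    ];

    // Only files that may contain timestamps in 12..=21 need to be scanned.
    for f in prune(&files, 12, 21) {
        println!("scan {} (ts {}..={})", f.path, f.min_ts, f.max_ts);
    }
}
```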
- Data optimization operations: merge, TTL support, manifest optimization, and orphan resource cleanup
- Purpose: Maximize query efficiency and maintain data lake health
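
As a rough illustration of two of these operations, the sketch below plans a maintenance pass: files whose newest record is older than the TTL are marked for expiry, and remaining small files are grouped for a merge. The types, thresholds, and selection policy are all assumptions for the example; the actual IceGate policy is not specified here.

```rust
#[derive(Debug, Clone, PartialEq)]
struct DataFile {
    path: String,
    size_bytes: u64,
    /// Timestamp (seconds) of the newest record in the file.
    max_ts_secs: u64,
}

#[derive(Debug, Default, PartialEq)]
struct MaintenancePlan {
    expire: Vec<String>,
    merge: Vec<String>,
}

/// Decides which files to expire (TTL) and which to merge (small files).
fn plan_maintenance(
    files: &[DataFile],
    now_secs: u64,
    ttl_secs: u64,
    small_file_bytes: u64,
) -> MaintenancePlan {
    let mut out = MaintenancePlan::default();
    for f in files {
        if f.max_ts_secs.saturating_add(ttl_secs) < now_secs {
            // Every record in the file is past its TTL.
            out.expire.push(f.path.clone());
        } else if f.size_bytes < small_file_bytes {
            out.merge.push(f.path.clone());
        }
    }
    // Merging a single file gains nothing.
    if out.merge.len() < 2 {
        out.merge.clear();
    }
    out
}

fn main() {
    let file = |path: &str, size_bytes, max_ts_secs| DataFile {
        path: path.to_string(),
        size_bytes,
        max_ts_secs,
    };
    let files = vec![
        file("old.parquet", 10, 100),
        file("small-1.parquet", 10, 900),
        file("small-2.parquet", 20, 950),
        file("big.parquet", 1_000, 990),
    ];

    let plan = plan_maintenance(&files, 1_000, 500, 100);
    println!("expire: {:?}", plan.expire);
    println!("merge:  {:?}", plan.merge);
}
```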
- Purpose: Manages alerting rules, analyzes observability data, and generates alert events
- Features:
  - Rule management for defining alert conditions
  - Real-time analysis of observability data (logs, metrics, traces)
  - Event generation and delivery based on rule evaluation
- Data Type: Events are treated as a dedicated data type alongside logs, metrics, and traces
- Convention: Follows OpenTelemetry Events Semantic Conventions
- Implementation: Leverages the Query layer for querying observability data to evaluate alert rules
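
The rule-evaluation flow can be pictured with a minimal threshold rule: count the records in a time window that match a condition and emit an event when the count exceeds a limit. In this sketch the window is a plain slice standing in for a Query layer result, and `AlertRule`, `AlertEvent`, and `evaluate` are hypothetical names. The severity numbers follow the OpenTelemetry log data model (9 = INFO, 17 and above = ERROR).

```rust
struct LogRecord {
    /// OpenTelemetry severity number (9 = INFO, 17 = ERROR).
    severity: u8,
}

struct AlertRule {
    name: &'static str,
    min_severity: u8,
    /// The rule fires when more than this many records match.
    max_count: usize,
}

#[derive(Debug, PartialEq)]
struct AlertEvent {
    rule: &'static str,
    observed: usize,
}

/// Evaluates one rule against the records of a single time window.
fn evaluate(rule: &AlertRule, window: &[LogRecord]) -> Option<AlertEvent> {
    let observed = window
        .iter()
        .filter(|r| r.severity >= rule.min_severity)
        .count();
    (observed > rule.max_count).then(|| AlertEvent { rule: rule.name, observed })
}

fn main() {
    let rule = AlertRule { name: "too-many-errors", min_severity: 17, max_count: 2 };
    let window = [
        LogRecord { severity: 9 },
        LogRecord { severity: 17 },
        LogRecord { severity: 18 },
        LogRecord { severity: 17 },
    ];

    match evaluate(&rule, &window) {
        Some(event) => println!("alert {} fired: {} matching records", event.rule, event.observed),
        None => println!("no alert"),
    }
}
```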
- Rust >= 1.92.0 (for Rust 2024 edition support)
- Cargo (Rust's package manager and build tool, included with Rust)
- Git
- rustfmt - for code formatting (included with Rust)
- clippy - for linting and static analysis (included with Rust)
- rust-analyzer - for IDE support
Check if Rust is installed with the correct version:
# Check Rust version
rustc --version
# Check Cargo version
cargo --version
# Check rustfmt (optional)
rustfmt --version
# Check clippy (optional)
cargo clippy --version
You should have Rust 1.92.0 or later installed.
If you don't have Rust installed, use rustup (the recommended Rust toolchain installer):
# Install Rust via rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Follow the prompts to complete installation
# Then reload your shell or run:
source "$HOME/.cargo/env"
# Verify installation
rustc --version
cargo --version
Dependencies are managed via Cargo and specified in Cargo.toml. They will be automatically downloaded and compiled when you build the project.
cargo build # Build the project
cargo test # Run tests
make dev # Start full development stack with Docker
For detailed development setup, build commands, and code quality guidelines, see CONTRIBUTING.md.
Contributions are welcome! See CONTRIBUTING.md for guidelines.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Built with:
- Apache Iceberg - Table format for data lakes
- Apache Arrow - Columnar memory format
- Apache Parquet - Columnar storage format
- OpenTelemetry - Observability framework
IceGate is currently in prototype development. APIs and features are subject to change.