This page is the entry point to Floe documentation. It groups the most important references so you can quickly find the right guide.
- Install and run: see the repository README for a minimal example.
- Guided onboarding and examples: GitHub Wiki.
- CLI usage: docs/cli.md
- Installation options (Homebrew, Cargo, Docker): docs/installation.md
- How a run works (file → row → entity checks): docs/how-it-works.md
- Checks and policy behavior: docs/checks.md
- Run reports and JSON schema: docs/report.md
- Logging for orchestrators (
--log-format): docs/logging.md
- CLI manifest generation (
floe manifest generate): docs/cli.md - Common orchestrator manifest schema: orchestrators/schemas/floe.manifest.v1.json
- Dagster integration package + examples: orchestrators/dagster-floe/README.md
- Airflow integration package + examples: orchestrators/airflow-floe/README.md
- Local integration dev notes: orchestrators/LOCAL_DEV.md
- Full config reference (all keys + defaults): docs/config.md
- Storage registry and cloud paths:
- S3: docs/storages/s3.md
- ADLS: docs/storages/adls.md
- GCS: docs/storages/gcs.md
- Supported inputs/outputs across storages: docs/support-matrix.md
- Parquet/Delta sink details:
- Delta sink: docs/sinks/delta.md
- Sink options (parquet settings): docs/sinks/options.md
- Iceberg (filesystem catalog on local/S3/GCS, plus AWS Glue catalog on S3): docs/sinks/iceberg.md
- CLI usage (
validate,run,manifest generate,add-entity): docs/cli.md floe add-entitycan bootstrap a missing config file and infer entity name/format from input path extensions (CSV/JSON/Parquet).
- Bench setup and results: docs/benchmarking.md
- Floe supports partitioned Delta (
partition_by) and partitioned Iceberg (partition_spec) writes. - Floe reports write-time metrics/metadata for accepted outputs (format-dependent), including Delta/Iceberg table versioning metadata and file sizing metrics where available.
- Floe does not perform table optimization/maintenance (for example Delta optimize/vacuum or Iceberg compaction/maintenance); run those as separate jobs.
- Iceberg schema evolution and merge/upsert workflows are out of scope in current releases.
- Target-aware uniqueness checks against existing sink tables in append mode are future work (current uniqueness checks operate within Floe's processed data scope for the run).
If you are missing a document or a section feels out of date, please open an issue or PR so we can keep this summary aligned with current behavior.