Implement Shifter Job for WAL to Iceberg Data Movement #29

@s-prosvirnin

We need to implement a job type called Shifter that moves data from the Write-Ahead Log (WAL) to Iceberg tables.

This job will run within the JobManager infrastructure (defined in issue #28) and is a critical part of the system's data ingestion pipeline.

Key Objectives:

  1. Job Implementation:

    • Create a new job definition (e.g., with code SHIFTER) that utilizes the existing JobDefinition structure.
    • Implement the core logic to read batched data from the WAL (S3/storage).
    • Transform and write this data into the target Iceberg tables.
  2. Task Executors:

    • Define specific TaskExecutorFns required for the Shifter workflow (e.g., read_wal_segment, write_to_iceberg, commit_transaction).
    • Ensure tasks are idempotent where possible to handle retries gracefully.
  3. Integration:

    • Register the Shifter job with the JobRegistry so it can be scheduled by the JobsManager.
    • Expose configuration options for the job (e.g., batch sizes, target table configurations).
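
To make the idempotency requirement in objective 2 concrete, here is a minimal in-memory sketch of the three example tasks. Everything in it is an assumption for illustration: `Wal`, `Table`, `Batch`, and `shift_segment` are hypothetical stand-ins, not types from icegate-jobmanager or icegate-storage, and the real TaskExecutorFn signature may look quite different. The idea shown is that staging is keyed by WAL segment and the commit step records which segments it has already applied, so a retried task does not duplicate rows.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical identifier of a WAL segment (e.g. an object key in storage).
type SegmentId = String;

/// A batch of rows read from one WAL segment. Rows are plain strings here.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    segment: SegmentId,
    rows: Vec<String>,
}

/// In-memory stand-in for the WAL: segment id -> rows.
struct Wal {
    segments: BTreeMap<SegmentId, Vec<String>>,
}

/// In-memory stand-in for a target table with staged and committed data.
#[derive(Default)]
struct Table {
    staged: BTreeMap<SegmentId, Vec<String>>,
    committed_rows: Vec<String>,
    committed_segments: HashSet<SegmentId>,
}

/// Reading has no side effects, so it is trivially safe to retry.
fn read_wal_segment(wal: &Wal, segment: &str) -> Result<Batch, String> {
    wal.segments
        .get(segment)
        .map(|rows| Batch { segment: segment.to_string(), rows: rows.clone() })
        .ok_or_else(|| format!("unknown WAL segment: {}", segment))
}

/// Staging is keyed by segment, so a retry overwrites instead of appending.
fn write_to_iceberg(table: &mut Table, batch: Batch) {
    table.staged.insert(batch.segment, batch.rows);
}

/// Commits the staged rows of a segment at most once.
/// Returns true only if this call actually committed new rows.
fn commit_transaction(table: &mut Table, segment: &str) -> bool {
    if table.committed_segments.contains(segment) {
        // Already applied: drop any re-staged copy and report a no-op.
        table.staged.remove(segment);
        return false;
    }
    match table.staged.remove(segment) {
        Some(rows) => {
            table.committed_rows.extend(rows);
            table.committed_segments.insert(segment.to_string());
            true
        }
        None => false,
    }
}

/// Runs the three tasks in order for a single segment.
fn shift_segment(wal: &Wal, table: &mut Table, segment: &str) -> Result<bool, String> {
    let batch = read_wal_segment(wal, segment)?;
    write_to_iceberg(table, batch);
    Ok(commit_transaction(table, segment))
}
```

Calling `shift_segment` twice for the same segment commits its rows once; the second call returns `Ok(false)`. In a real implementation the "already committed" marker would have to live somewhere durable (for example alongside the table's commit metadata) rather than in process memory.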

Technical Context:

  • This implementation will likely reside in a new module or crate (e.g., icegate-shifter or part of icegate-ingest) but must implement the traits defined in icegate-jobmanager.
  • It should leverage the icegate-common schemas and icegate-storage abstractions.
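
As a rough illustration of the integration objective (registration plus configuration), the sketch below shows one possible shape. The `JobDefinition` and `JobRegistry` structs here are simplified local stand-ins that only borrow the names used in this issue; they are not the actual icegate-jobmanager types, and `ShifterConfig`, `validate`, and `shifter_job` are hypothetical names introduced for the example.

```rust
use std::collections::HashMap;

/// Hypothetical configuration for the Shifter job.
#[derive(Debug, Clone, PartialEq)]
struct ShifterConfig {
    batch_size: usize,
    target_table: String,
}

impl ShifterConfig {
    /// Rejects configurations that could never produce useful work.
    fn validate(&self) -> Result<(), String> {
        if self.batch_size == 0 {
            return Err("batch_size must be greater than zero".to_string());
        }
        if self.target_table.is_empty() {
            return Err("target_table must not be empty".to_string());
        }
        Ok(())
    }
}

/// Simplified stand-in for a job definition: a code plus ordered task names.
#[derive(Debug, Clone, PartialEq)]
struct JobDefinition {
    code: String,
    tasks: Vec<String>,
}

/// Simplified stand-in for a registry keyed by job code.
#[derive(Default)]
struct JobRegistry {
    jobs: HashMap<String, JobDefinition>,
}

impl JobRegistry {
    /// Registers a definition, refusing duplicates for the same code.
    fn register(&mut self, def: JobDefinition) -> Result<(), String> {
        if self.jobs.contains_key(&def.code) {
            return Err(format!("job {} is already registered", def.code));
        }
        self.jobs.insert(def.code.clone(), def);
        Ok(())
    }

    fn get(&self, code: &str) -> Option<&JobDefinition> {
        self.jobs.get(code)
    }
}

/// Builds the Shifter definition after validating its configuration.
fn shifter_job(config: &ShifterConfig) -> Result<JobDefinition, String> {
    config.validate()?;
    Ok(JobDefinition {
        code: "SHIFTER".to_string(),
        tasks: vec![
            "read_wal_segment".to_string(),
            "write_to_iceberg".to_string(),
            "commit_transaction".to_string(),
        ],
    })
}
```

Validating the configuration before building the definition means a misconfigured Shifter fails at registration time rather than on its first scheduled run.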
