Summary
Implement a scalable target-aware uniqueness check for append mode so duplicates can be detected against existing accepted data, not only within the incoming batch.
Context
Append and overwrite write modes already exist. Unique checks currently lack a robust architecture for comparing incoming keys against target-side data at scale.
Scope
- Define an abstraction/hook for target uniqueness lookup per sink format (Parquet/Delta/Iceberg)
- Read only key columns and apply partition pruning where possible
- Keep overwrite mode behavior unchanged
- Report source-vs-target duplicates clearly, separately from duplicates within the incoming batch
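
One possible shape for the lookup hook in the scope items above, as a minimal sketch rather than a proposed final design. All names here (`TargetKeyLookup`, `InMemoryKeyLookup`, `UniquenessReport`, `check_unique`) are hypothetical and not taken from the existing codebase. The in-memory class stands in for a real sink; a Parquet/Delta/Iceberg implementation would presumably read only the key columns of the relevant partitions.

```python
from dataclasses import dataclass, field
from typing import Hashable, Protocol

Key = tuple[Hashable, ...]


class TargetKeyLookup(Protocol):
    """Hypothetical per-sink hook: which candidate keys already exist in the target?"""

    def existing_keys(self, candidates: set[Key]) -> set[Key]: ...


@dataclass
class InMemoryKeyLookup:
    """Stand-in sink used only for illustration; holds previously accepted keys."""

    accepted: set[Key] = field(default_factory=set)

    def existing_keys(self, candidates: set[Key]) -> set[Key]:
        return self.accepted & candidates


@dataclass
class UniquenessReport:
    within_batch: list[Key]    # keys repeated inside the incoming batch
    against_target: list[Key]  # incoming keys already present in the target

    @property
    def ok(self) -> bool:
        return not self.within_batch and not self.against_target


def check_unique(rows, key_columns, write_mode, lookup: TargetKeyLookup) -> UniquenessReport:
    seen: set[Key] = set()
    within: set[Key] = set()
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key in seen:
            within.add(key)
        seen.add(key)
    # Only append mode consults the target; overwrite keeps the batch-only check.
    against = lookup.existing_keys(seen) if write_mode == "append" else set()
    return UniquenessReport(sorted(within, key=repr), sorted(against, key=repr))


lookup = InMemoryKeyLookup(accepted={(1,), (2,)})
batch = [{"id": 2, "v": "a"}, {"id": 3, "v": "b"}, {"id": 3, "v": "c"}]
append_report = check_unique(batch, ["id"], "append", lookup)
overwrite_report = check_unique(batch, ["id"], "overwrite", lookup)
```

Keeping the two duplicate categories as separate fields lets the report say which rows conflict with existing data and which conflict only with each other.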
Acceptance criteria
- Design/spec + incremental implementation plan
- Initial implementation for at least one sink (future PRs)
Related: #159