MLWM v1 Architecture

MLWM v1 adds a neural image watermark engine for short payloads alongside the legacy DCT engine.

Core design

The embedded payload is fixed at 256 bits after framing, Reed-Solomon (RS) encoding, and deterministic bit whitening.
Text payload is limited to 16 UTF-8 bytes.
Legacy frequency-domain synchronization remains the geometric anchor.
The neural embed/extract path handles only payload recovery, not geometric synchronization.
Runtime uses ONNX for desktop inference and PyTorch for training/export.
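
The payload budget above can be illustrated with a minimal Python sketch. The frame layout (one length byte, 16 zero-padded text bytes, a big-endian CRC32), the SHA-256-derived whitening mask, the seed string, and all function names are assumptions made for illustration; this document does not specify them. RS parity is not implemented here: under the assumed layout, the 21-byte frame would leave 11 bytes for parity to reach 256 bits.

```python
import hashlib
import zlib
from typing import Optional

MAX_TEXT_BYTES = 16   # text payload limit stated in the design
CODEWORD_BYTES = 32   # 256 bits after framing and RS encoding


def build_frame(text: str) -> bytes:
    """Assumed layout: 1 length byte + 16 zero-padded text bytes + CRC32."""
    raw = text.encode("utf-8")
    if len(raw) > MAX_TEXT_BYTES:
        raise ValueError("text payload exceeds 16 UTF-8 bytes")
    body = bytes([len(raw)]) + raw.ljust(MAX_TEXT_BYTES, b"\x00")
    crc = zlib.crc32(body).to_bytes(4, "big")
    return body + crc  # 21 bytes before RS parity


def parse_frame(frame: bytes) -> Optional[str]:
    """Return the text if the CRC32 matches, otherwise None."""
    body, crc = frame[:-4], frame[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != crc:
        return None
    length = body[0]
    if length > MAX_TEXT_BYTES:
        return None
    return body[1:1 + length].decode("utf-8")


def whitening_mask(seed: bytes = b"mlwm-v1-whiten") -> bytes:
    """Hypothetical deterministic mask: SHA-256 yields exactly 32 bytes."""
    return hashlib.sha256(seed).digest()


def whiten(codeword: bytes) -> bytes:
    """XOR the 256-bit codeword with the fixed mask (self-inverse)."""
    if len(codeword) != CODEWORD_BYTES:
        raise ValueError("codeword must be exactly 32 bytes")
    return bytes(a ^ b for a, b in zip(codeword, whitening_mask()))
```

Because whitening is a plain XOR with a fixed mask, applying `whiten` a second time restores the original codeword, so the same function can serve for unwhitening at extraction time.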

Encode the short text into a fixed frame with CRC32 and RS parity, then whiten the resulting 256 bits.
Run the encoder network to predict a residual map.
Apply the residual to the image and inject the classical sync template.
At extraction time, detect the sync template, rectify the image, and score several candidate views.
Aggregate decoder logits, unwhiten the bits, RS-decode, and CRC-check the payload.
If neural decoding fails in auto mode, fall back to the legacy engine.
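
The last two extraction steps can be sketched as follows. This is a minimal illustration, not the engine's actual code: summing per-bit logits across candidate views and thresholding at zero is an assumed aggregation rule, and the function names and the decoder callables are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def aggregate_logits(views: Sequence[Sequence[float]]) -> List[int]:
    """Sum per-bit logits over all candidate views, then threshold at zero."""
    n_bits = len(views[0])
    sums = [0.0] * n_bits
    for view in views:
        if len(view) != n_bits:
            raise ValueError("all views must have the same number of logits")
        for i, logit in enumerate(view):
            sums[i] += logit
    return [1 if s > 0 else 0 for s in sums]


def extract_auto(
    neural_decode: Callable[[], Optional[str]],
    legacy_decode: Callable[[], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Try the neural path first; fall back to the legacy engine on failure.

    Each callable returns the recovered text, or None when its own checks
    (for the neural path: RS decode and CRC) fail.
    """
    text = neural_decode()
    if text is not None:
        return text, "neural"
    return legacy_decode(), "legacy"
```

Summing logits before thresholding lets a confident view outweigh several weakly wrong ones, which a per-view majority vote would not do. Reporting which engine produced the result lets the caller distinguish a neural decode from a legacy fallback.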