Investigate whether the OPD teacher endpoint can return per-token logprobs, which are required for training (e.g. distillation / KL-style objectives against the teacher distribution).
Scope
- Probe the current OPD teacher inference path to see if logprobs are exposed in responses.
- If supported: document the request shape, top-k limits, and any perf implications of enabling logprobs.
- If unsupported: identify what changes are needed in the teacher serving stack to surface them.
- Confirm the orchestrator client can plumb logprobs through to the trainer.
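
A minimal sketch of the probe step, assuming the teacher endpoint returns an OpenAI-style chat-completions payload (`choices[0].logprobs.content[i].logprob`). That shape is an assumption, not a confirmed property of the OPD teacher; the helper name `extract_token_logprobs` and the sample payloads are hypothetical and should be adjusted once the real response format is known.

```python
from __future__ import annotations

from typing import Any


def extract_token_logprobs(response: dict[str, Any]) -> list[tuple[str, float]] | None:
    """Return (token, logprob) pairs, or None if the response carries no logprobs.

    Assumes an OpenAI-style chat-completions payload; the real OPD teacher
    response shape may differ.
    """
    choices = response.get("choices") or []
    if not choices:
        return None
    logprobs = choices[0].get("logprobs")
    if not logprobs:
        return None
    content = logprobs.get("content")
    if not content:
        return None
    return [(entry["token"], entry["logprob"]) for entry in content]


# Hypothetical payloads for illustration only, not captured from the real endpoint.
with_logprobs = {
    "choices": [
        {
            "logprobs": {
                "content": [
                    {"token": "Hello", "logprob": -0.25, "top_logprobs": []},
                    {"token": "!", "logprob": -1.5, "top_logprobs": []},
                ]
            }
        }
    ]
}
without_logprobs = {"choices": [{"logprobs": None}]}
```

Returning `None` rather than an empty list keeps "logprobs not exposed" distinguishable from "logprobs exposed but the completion was empty", which matters when deciding between the supported and unsupported branches of the scope.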