
feat: make max_logprobs configurable for inference engines #1357

Open

penfever wants to merge 1 commit into NovaSky-AI:main from penfever:penfever/configurable-max-logprobs

Conversation


@penfever penfever commented Mar 20, 2026

Summary

  • The max_logprobs parameter in create_ray_wrapped_inference_engines() was hardcoded to 1 (chosen-token only)
  • Makes it configurable so callers like teacher distillation can request top-K logprobs from vLLM
  • Default remains 1 for backward compatibility — no behavior change for existing users

Test plan

  • Verify default behavior unchanged (max_logprobs=1)
  • Verify that passing max_logprobs > 1 propagates to the vLLM engine (see the sketch below)
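A minimal sketch of the intended shape of the change, assuming a simplified wrapper that builds a bare vLLM `LLM` (the real `create_ray_wrapped_inference_engines()` creates Ray-wrapped engines and takes more arguments; the signature below is illustrative only):

```python
# Hypothetical, simplified version of the wrapper: the new max_logprobs
# keyword defaults to 1 (the old hardcoded behavior) and is forwarded into
# the vLLM engine arguments, which cap how many per-token logprobs a
# request may ask for.
from vllm import LLM


def create_ray_wrapped_inference_engines(model: str, max_logprobs: int = 1, **engine_kwargs):
    return LLM(model=model, max_logprobs=max_logprobs, **engine_kwargs)
```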

🤖 Generated with Claude Code



The max_logprobs parameter was hardcoded to 1 (chosen-token only).
Make it configurable via create_ray_wrapped_inference_engines() so
callers like teacher distillation can request top-K logprobs from
vLLM. Default remains 1 for backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
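For context on why the engine-side cap matters, here is a plain vLLM usage sketch (not SkyRL code; the model name is a placeholder): the engine rejects `SamplingParams(logprobs=K)` when K exceeds `max_logprobs`, so a teacher-distillation caller that wants top-K logprobs needs the wrapper to raise the cap.

```python
# Plain vLLM illustration of the cap this PR makes configurable.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", max_logprobs=20)  # engine-side cap raised to 20
params = SamplingParams(max_tokens=64, logprobs=20)           # request top-20 logprobs per token
outputs = llm.generate(["Explain entropy in one sentence."], params)
```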
Contributor

@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 1 additional finding.


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly makes the max_logprobs parameter configurable for inference engines, which is a necessary step for supporting top-K logprobs. The changes in ray_wrapped_inference_engine.py are implemented correctly. For the feature to be fully functional, other parts of the codebase not included in this PR might need adjustments. For example, the _postprocess_outputs method in vllm_engine.py seems to only extract the logprob of the chosen token, and a validation in utils.py restricts the default logprobs configuration. These are outside the scope of this PR's changes but are worth considering for the feature to work end-to-end.
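To make the reviewer's point concrete, here is a sketch (not the repository's `_postprocess_outputs`) of what keeping top-K logprobs from a vLLM `RequestOutput` could look like: each generated token carries a dict mapping candidate token ids to `Logprob` objects, and keeping only the chosen token's entry discards the rest of the top-K.

```python
# Illustrative only: collect the full top-K logprob dict for every
# generated token in a vLLM RequestOutput, instead of just the chosen token.
def extract_topk_logprobs(request_output):
    topk_per_token = []
    for completion in request_output.outputs:
        if completion.logprobs is None:  # logprobs were not requested
            continue
        for token_logprobs in completion.logprobs:
            # token_logprobs: dict[token_id, Logprob]; Logprob has .logprob and .rank
            topk_per_token.append({tok: lp.logprob for tok, lp in token_logprobs.items()})
    return topk_per_token
```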

