Add support for "benchmarking scenarios" #99
Conversation
Branch updated from 81c35db to 91651b0.
@sjmonson I like the direction of this. A few quick thoughts from my side:
Not sure if there's something out there already to automatically handle the last two points, but if so, that would be a great inclusion.
My plan was to rely on the base class JSON and YAML loaders since they are cleaner for nested structures, such as synthetic dataset args. I can definitely try pydantic-settings since that does have the advantage of allowing us to unify all GuideLLM options under one file.
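For illustration, a minimal sketch of what such a base-class loader could look like; the class name, the `overrides` parameter, and the PyYAML dependency are assumptions rather than what the patch necessarily implements (the `from_file` name matches the method listed in the file summary further down, and pydantic v2 is assumed):

```python
# Rough sketch of a JSON/YAML loader on a shared pydantic base class.
# Class name, method signature, and the PyYAML dependency are assumptions.
import json
from pathlib import Path
from typing import Any, Optional, TypeVar

import yaml  # PyYAML
from pydantic import BaseModel

T = TypeVar("T", bound="ScenarioBaseModel")


class ScenarioBaseModel(BaseModel):
    @classmethod
    def from_file(cls: type[T], path: Path, overrides: Optional[dict[str, Any]] = None) -> T:
        """Build a model instance from a JSON or YAML file, applying optional overrides."""
        raw = path.read_text()
        if path.suffix.lower() in {".yaml", ".yml"}:
            data = yaml.safe_load(raw)
        else:
            data = json.loads(raw)
        data.update(overrides or {})
        return cls.model_validate(data)
```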
By default pydantic will attempt type coercion during validation, so it's probably as simple as disabling the type handling in click and raising a
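A minimal sketch of that idea, assuming the intent is to pass raw CLI values through click untouched and surface pydantic's `ValidationError` as a click usage error; the helper name and the exact exception choice are illustrative:

```python
# Illustrative only: let pydantic perform the type coercion and convert its
# validation errors into click errors. The helper name is hypothetical.
import click
from pydantic import ValidationError


def build_scenario(scenario_cls, **cli_kwargs):
    # Drop options the user did not pass so the scenario defaults still apply.
    supplied = {key: value for key, value in cli_kwargs.items() if value is not None}
    try:
        return scenario_cls(**supplied)  # pydantic coerces "128" -> 128, etc.
    except ValidationError as err:
        raise click.UsageError(str(err)) from err
```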
They already do in this patch, as I have set every click default to pull from the scenario model. Unless you mean something else?
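Roughly, that wiring might look like the sketch below; the exact `get_default` signature and the option shown are assumptions based on the file summary further down:

```python
# Illustrative wiring of a click default to the scenario model.
# Assumes a get_default(field_name) classmethod exists, per the file summary.
import click

from guidellm.benchmark.scenario import GenerativeTextScenario


@click.command()
@click.option(
    "--rate-type",
    default=GenerativeTextScenario.get_default("rate_type"),
    help="Rate type for the benchmark (default comes from the scenario model).",
)
def benchmark(rate_type):
    click.echo(f"rate_type={rate_type}")
```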
@sjmonson, for the last one, yes, something else. Specifically, the entrypoint for the benchmarking Python API here: https://github.com/neuralmagic/guidellm/blob/main/src/guidellm/benchmark/entrypoints.py#L22. This is towards @anmarques's previous issues with needing to set all argument values when not using the CLI. It would also help towards supporting both the CLI and the Python API for scenarios.
Ah, I see. The workflow I imagined for @anmarques's use case is to call the higher-level entrypoint:

```python
result = await benchmark_with_scenario(
    GenerativeTextScenario(
        target="http://localhost:8000",
        data={
            "prompt_tokens": 128,
            "output_tokens": 128,
        },
        rate_type="sweep",
    ),
    output_path="output.json",
)
```
Pull Request Overview
This pull request adds support for benchmarking scenarios by introducing a unified way to specify benchmark arguments via Pydantic objects and JSON/YAML scenario files, ensuring that CLI and code-based defaults remain consistent.
- Introduces utility functions for JSON parsing and default option handling in CLI
- Adds Pydantic helper methods to load configuration from files
- Implements scenario-based configuration with new benchmark scenario files and updated entrypoints
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
| File | Description |
| --- | --- |
| src/guidellm/utils/cli.py | Added a JSON parsing helper, a default-setting function, and a custom union parameter type |
| src/guidellm/objects/pydantic.py | Added `get_default` and `from_file` methods to streamline model instantiation |
| src/guidellm/benchmark/scenarios/rag.json | New benchmark scenario configuration in JSON |
| src/guidellm/benchmark/scenarios/chat.json | New benchmark scenario configuration in JSON |
| src/guidellm/benchmark/scenario.py | Introduced `Scenario` and `GenerativeTextScenario` classes along with a float-list parser |
| src/guidellm/benchmark/entrypoints.py | Added `benchmark_with_scenario` to enable scenario-driven execution |
| src/guidellm/main.py | Updated the CLI to support scenario files and default values from scenario models |
Comments suppressed due to low confidence (2)
src/guidellm/utils/cli.py:29
- [nitpick] The class name `Union` may be confused with the built-in union type used in type hints (`typing.Union`). Consider renaming it to a more explicit name (e.g. `UnionParamType`) for clarity.
```python
class Union(click.ParamType):
```
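For reference, a union-style parameter type along the lines suggested (using the proposed `UnionParamType` name) could look roughly like this; it is a sketch, not the PR's actual code:

```python
# Sketch of a union-style click parameter type, using the suggested rename;
# the implementation in the PR may differ.
import click


class UnionParamType(click.ParamType):
    name = "union"

    def __init__(self, *types: click.ParamType):
        self.types = types

    def convert(self, value, param, ctx):
        # Try each underlying type in order and return the first successful conversion.
        for candidate in self.types:
            try:
                return candidate.convert(value, param, ctx)
            except click.UsageError:
                continue
        self.fail(f"{value!r} did not match any of the allowed types", param, ctx)
```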
src/guidellm/benchmark/scenario.py:28
- [nitpick] The docstring for `parse_float_list` could be improved by mentioning that any whitespace around comma-separated numbers will not be trimmed automatically. Consider adding a note or trimming whitespace before conversion.
```python
def parse_float_list(value: Union[str, float, list[float]]) -> list[float]:
```
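A version of the parser that addresses the whitespace nitpick might look like this sketch (the PR's actual implementation may differ):

```python
# Sketch of parse_float_list that trims whitespace around comma-separated
# values, per the suggestion above.
from typing import Union


def parse_float_list(value: Union[str, float, list[float]]) -> list[float]:
    """Parse a comma-separated string, a single number, or a list into a list of floats."""
    if isinstance(value, (int, float)):
        return [float(value)]
    if isinstance(value, str):
        return [float(part.strip()) for part in value.split(",") if part.strip()]
    return [float(item) for item in value]
```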
This PR adds support for "scenarios" that allow specifying benchmark arguments in a file or as a single Pydantic object. CLI argument defaults are loaded from the scenario object defaults so that benchmark-as-code users get the same defaults as the CLI. Argument values in the CLI take the following precedence:
Scenario (class defaults) < Scenario (CLI provided Scenario) < CLI Arguments
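As a rough illustration of that precedence, later layers override earlier ones; the option names and values below are only examples, not the model's real defaults:

```python
# Illustrative only: three layers of argument values and their precedence.
class_defaults = {"rate_type": "synchronous", "max_seconds": None}  # Scenario class defaults
file_scenario = {"rate_type": "sweep"}                              # CLI-provided scenario file
cli_arguments = {"max_seconds": 60}                                 # explicit CLI flags

# Later layers win: class defaults < scenario file < CLI arguments.
resolved = {**class_defaults, **file_scenario, **cli_arguments}
print(resolved)  # {'rate_type': 'sweep', 'max_seconds': 60}
```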
Closes: