Releases: neuralmagic/guidellm
GuideLLM v0.2.1
Summary
- Bug fixes for HF datasets and local data files used in benchmarking, which were crashing due to improper calls into the `datasets` `load_dataset` function
- Refactored CI/CD system based on the latest standards for releases
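The `load_dataset` fix above comes down to routing local files and Hub dataset IDs into different call shapes: local files go through a format builder (`"json"`, `"csv"`, `"text"`) with `data_files`, while Hub IDs are passed directly. A minimal sketch of that routing decision, with a hypothetical helper name (this is illustrative, not GuideLLM's actual code):

```python
from pathlib import Path


def load_dataset_args(source: str) -> tuple[tuple, dict]:
    """Return (args, kwargs) suitable for datasets.load_dataset(*args, **kwargs).

    Local data files must be loaded via a builder name plus data_files;
    a Hugging Face Hub dataset ID is passed directly as the first argument.
    """
    suffix = Path(source).suffix.lower()
    if suffix in {".json", ".jsonl"}:
        return ("json",), {"data_files": source}
    if suffix == ".csv":
        return ("csv",), {"data_files": source}
    if suffix in {".txt", ".text"}:
        return ("text",), {"data_files": source}
    # No recognized file extension: treat as a Hub dataset ID.
    return (source,), {}
```

Calling `load_dataset(*args, **kwargs)` with the returned values then picks the correct pathway for either kind of source.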
What's Changed
- Update version on main to 0.3.0 to begin work on the next release by @markurtz in #127
- Fix python versions for display in README.md by @markurtz in #128
- Fix logging by @hhy3 in #129
- Data and request fixes for real data / chat_completions pathways by @markurtz in #131
- Fix argument error in nightly unit tests by @sjmonson in #132
- Add docs for data/datasets and how to configure them in GuideLLM by @markurtz in #137
- Refactor CI/CD system based on latest standardization for upstreams by @markurtz in #135
Full Changelog: v0.2.0...v0.2.1
GuideLLM v0.2.0
Summary
- Minimal Execution Overheads
- Refactor enabling an async multi-process/threaded design with just 0.16% overhead in synchronous mode and 99.9% rate accuracy for constant-rate requests
- Robust Accuracy + Monitoring
- Built-in timings and diagnostics added to validate performance and catch regressions
- Flexible Benchmarking Profiles
- Prebuilt support for synchronous, concurrent (newly added), throughput, constant rate, Poisson rate, and sweep modes
- Unified Input/Output Formats
- JSON, YAML, CSV, and console output now standardized
- Multi-Use Data Loaders
- Native support for HuggingFace datasets, file-based data, and synthetic samples with fixes for previous flows and expanded support
- Pluggable Backends via OpenAI-Compatible APIs
- Redesigned to work out of the box with OpenAI-style HTTP servers and easily extendable to other interfaces and servers. Fixed issues related to improper token lengths and more
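The constant-rate and Poisson-rate profiles listed above reduce to generating inter-arrival delays between requests: constant spacing for a fixed rate, exponentially distributed gaps for Poisson arrivals at the same average rate. A rough sketch of that idea (an illustration of the technique, not GuideLLM's implementation):

```python
import random
from typing import Iterator


def inter_arrival_times(rate: float, mode: str, n: int, seed: int = 0) -> Iterator[float]:
    """Yield n delays (in seconds) between requests sent at `rate` requests/sec.

    "constant" spaces requests evenly at 1/rate seconds apart;
    "poisson" draws exponential gaps, modeling independent arrivals
    with the same mean rate but bursty spacing.
    """
    rng = random.Random(seed)
    for _ in range(n):
        if mode == "constant":
            yield 1.0 / rate
        elif mode == "poisson":
            yield rng.expovariate(rate)
        else:
            raise ValueError(f"unknown mode: {mode}")
```

A scheduler would sleep for each yielded delay before dispatching the next request; the sweep mode then runs a series of such rates between the synchronous and throughput extremes.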
What's Changed
- Add summary metrics to saved json file by @anmarques in #46
- ADD TGI docs by @philschmid in #43
- Add missing vllm docs link by @eldarkurtic in #50
- Change default "role" from "system" to "user" by @philschmid in #53
- FIX TGI example by @philschmid in #51
- Revert Summary Metrics and Expand Test Coverage to Stabilize Nightly/Main CI by @markurtz in #58
- [Dataset]: Iterate through benchmark dataset once by @parfeniukink in #48
- Replace busy wait in async loop with a Semaphore by @sjmonson in #80
- Add backend_kwargs to generate_benchmark_report by @jackcook in #78
- Drop request count check from throughput sweep profile by @sjmonson in #89
- Rework Backend to Native HTTP Requests and Enhance API Compatibility & Performance by @markurtz in #91
- Multi Process Scheduler Implementation, Benchmarker, and Report Generation Refactor by @markurtz in #96
- Update the README by @sjmonson in #112
- Fix units for Req Latency in output to seconds by @smalleni in #113
- Fix/non integer rates by @thameem-abbas in #116
- Output support expansion, code hygiene, and tests by @markurtz in #117
- Bump min python to 3.9 by @sjmonson in #121
- v0.2.0 Version Update and Docs Expansions by @markurtz in #118
- Fix issue if async task count does not evenly divide across process pool by @sjmonson in #120
- Readme grammar updates and cleanup by @markurtz in #124
- Update CICD flows to enable automated releases and match the feature set laid out in #56 by @markurtz in #125
- CI/CD Build Fixes for Release by @markurtz in #126
New Contributors
- @anmarques made their first contribution in #46
- @philschmid made their first contribution in #43
- @eldarkurtic made their first contribution in #50
- @sjmonson made their first contribution in #80
- @jackcook made their first contribution in #78
- @smalleni made their first contribution in #113
- @thameem-abbas made their first contribution in #116
Full Changelog: v0.1.0...v0.2.0
GuideLLM v0.1.0
What's Changed
Initial release of GuideLLM, version 0.1.0! This core release adds the basic structure, infrastructure, and code for benchmarking LLM deployments across several different use cases via a CLI and terminal output. Further improvements are coming soon!
- Support added for general OpenAI backends and any text-input-based model served through those
- Support added for emulated, transformers, and file-based datasets
- Support added for general file storage of the full benchmark/evaluation that was run
- Full support for different benchmark types including sweep, synchronous, throughput, constant rate, and Poisson rate, enabled through new scheduler and executor interfaces built on top of Python's asyncio
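A scheduler built on asyncio, as described above, typically caps the number of in-flight requests with a semaphore rather than a busy-wait loop (the same idea later adopted in #80). A minimal sketch of that bounded-concurrency pattern, with hypothetical names and not GuideLLM's actual code:

```python
import asyncio


async def run_bounded(requests, send, max_concurrency: int):
    """Issue all requests via `send`, capping concurrent tasks at max_concurrency.

    The Semaphore suspends new tasks until a slot frees up, so the event
    loop stays idle instead of spinning while waiting for capacity.
    """
    sem = asyncio.Semaphore(max_concurrency)

    async def worker(req):
        async with sem:
            return await send(req)

    # gather preserves input order in its results.
    return await asyncio.gather(*(worker(r) for r in requests))
```

With `max_concurrency=1` this degenerates to the synchronous mode; raising it toward the backend's capacity approximates the throughput mode.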
New Contributors
- @DaltheCow made their first contribution in #4
- @markurtz made their first contribution in #3
- @rgreenberg1 made their first contribution in #21
- @jennyyangyi-magic made their first contribution in #35
Full Changelog: https://github.com/neuralmagic/guidellm/commits/v0.1.0