
benchmark serving: random + sharegpt dataset #14026

Open · wants to merge 1 commit into base: main

Conversation

@seungrokj commented Feb 28, 2025

This PR adds support for drawing random inputs from the ShareGPT dataset when using "--dataset-name random" in benchmark_serving.py.

Background: when comparing online serving performance against SGLang, we noticed that the random inputs generated by the vLLM client differ from those generated by the SGLang client, which produces a performance gap for certain models. To narrow this gap, this PR adds the following feature:


Usage:

Random inputs (default):

    python3 vllm/benchmarks/benchmark_serving.py \
        --backend vllm \
        --dataset-name random

Random inputs sampled from the ShareGPT dataset (added feature):

    python3 vllm/benchmarks/benchmark_serving.py \
        --backend vllm \
        --dataset-name random \
        --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json
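A minimal sketch of the idea behind the new mode: instead of synthesizing prompts from uniformly random token ids, draw prompt prefixes from real ShareGPT conversations. The helper name and details below are hypothetical, not the exact code in this PR; the real script truncates by tokenizer token count, while this sketch uses whitespace "tokens" to stay self-contained.

```python
import json
import random


def sample_sharegpt_requests(dataset_path, num_requests, input_len, seed=0):
    """Hypothetical helper: build `num_requests` prompts by sampling the
    first human turn of random ShareGPT conversations, truncated to roughly
    `input_len` whitespace tokens (the real script truncates by tokenizer
    token count instead)."""
    random.seed(seed)
    with open(dataset_path) as f:
        dataset = json.load(f)

    # Keep only conversations that actually contain at least one turn.
    prompts = [
        conv["conversations"][0]["value"]
        for conv in dataset
        if conv.get("conversations")
    ]

    requests = []
    for _ in range(num_requests):
        text = random.choice(prompts)
        words = text.split()[:input_len]  # crude length cap for the sketch
        requests.append(" ".join(words))
    return requests
```

Sampling prefixes from a shared dataset like this makes the request distribution reproducible across different serving clients, which is what lets vLLM and SGLang benchmarks be compared on equal footing.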

Signed-off-by: seungrokj <[email protected]>

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which executes a small and essential subset of CI tests to quickly catch errors. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@seungrokj (Author)

Reopened here again:
@andylolu2 @billishyahao @haic0

@DarkLight1337 DarkLight1337 requested review from WoosukKwon, ywang96 and comaniac and removed request for WoosukKwon February 28, 2025 16:43
@comaniac (Collaborator)

Thanks for the PR. While the feature makes sense, I feel the updated CLI is a bit confusing. Instead of adding the feature of "sampling from ShareGPT" to "random", why don't we enhance the existing sharegpt mode? Intuitively in this case we are still benchmarking ShareGPT, after all.

@ywang96 (Member) commented Mar 1, 2025

> We witnessed the random inputs are different from vllm client vs sglang client and this gives some perf gap in certain models.

Hello @seungrokj! Could you clarify a bit more on this? You can use the same client code to benchmark against both vLLM and SGLang, so at least on the input side there shouldn't be any difference.

3 participants