[Feature] Add encoder time #64

LJH-LBJ · 2025-12-22T12:10:06Z

Description

Resolve: #28

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Performance improvement
Code refactoring
Test improvements
CI/CD improvements

Related Issues

Changes Made

Read the enable_metrics parameter from the request
Time the encoder execution and store the result in metrics
Return metrics in the response
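
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `proxy.py`: `encode_metrics_requested`, `timed_ms`, `fake_encode`, and `handle_request` are hypothetical names, and the only details taken from this PR are the `enable_metrics` request field with its `encode` key and the `encode_time_ms` key under `metrics` in the response.

```python
import time
from typing import Any, Callable


def encode_metrics_requested(request: dict[str, Any]) -> bool:
    """Return True when the request asks for encoder timing."""
    enable_metrics = request.get("enable_metrics") or {}
    return bool(enable_metrics.get("encode", False))


def timed_ms(func: Callable[[], Any]) -> tuple[Any, int]:
    """Run func and return (result, elapsed wall-clock time in whole ms)."""
    start = time.perf_counter()
    result = func()
    elapsed_ms = int((time.perf_counter() - start) * 1000)
    return result, elapsed_ms


def fake_encode() -> str:
    """Stand-in for the real encoder call (hypothetical)."""
    time.sleep(0.01)
    return "encoded"


def handle_request(request: dict[str, Any]) -> dict[str, Any]:
    """Build a response, attaching metrics only when they were requested."""
    response: dict[str, Any] = {"choices": []}
    if encode_metrics_requested(request):
        _, encode_ms = timed_ms(fake_encode)
        response["metrics"] = {"encode_time_ms": encode_ms}
    else:
        fake_encode()
    return response
```

In this sketch a request without `enable_metrics` gets no `metrics` key at all; whether the actual service omits the key or returns an empty object in that case is not shown in this PR.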

Testing

Non-streaming

curl -X POST  http://127.0.0.1:5580/v1/chat/completions     -H "Content-Type: application/json"     -d '{
    "model": "/workspace/models/Qwen2.5-VL-7B-Instruct",
    "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "file:///workspace/l00807937/EPD_Timecount_v0.11.0/image/work.jpg"}},
        {"type": "text", "text": "What is the text in the illustrate?"}
    ]}
    ],
    "enable_metrics": {
      "encode": true
    }
    }'

Response

{"id":"chatcmpl-bb32e92d9eb045ea99a1870bba9665cd","object":"chat.completion","created":1766392413,"model":"/workspace/models/Qwen2.5-VL-7B-Instruct","choices":[{"index":0,"message":{"role":"assistant","content":"The text in the image is in Chinese and reads: \"我的工作永远都做不完的\" which translates to \"My work will never be finished.\"","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":"None","token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":216,"total_tokens":248,"completion_tokens":32,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null,"metrics":{"encode_time_ms":56}}
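
For a client that wants to consume this value, a minimal sketch of reading `metrics.encode_time_ms` from a non-streaming response body might look like the following. The helper name `encode_time_ms` is hypothetical, and the sample body is an abbreviated version of the response above.

```python
import json
from typing import Optional


def encode_time_ms(response_body: str) -> Optional[int]:
    """Return metrics.encode_time_ms from a chat completion body, if present."""
    payload = json.loads(response_body)
    metrics = payload.get("metrics") or {}
    return metrics.get("encode_time_ms")


# Abbreviated version of the response shown above.
body = '{"object": "chat.completion", "choices": [], "metrics": {"encode_time_ms": 56}}'
print(encode_time_ms(body))
```

The helper returns `None` when the body carries no `metrics` object, so callers can tell "not reported" apart from a reported value of 0.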

Streaming

curl -X POST  http://127.0.0.1:5580/v1/chat/completions     -H "Content-Type: application/json"     -d '{
    "model": "/workspace/models/Qwen2.5-VL-7B-Instruct",
    "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "file:///workspace/l00807937/EPD_Timecount_v0.11.0/image/work.jpg"}},
        {"type": "text", "text": "What is the text in the illustrate?"}
    ]}
    ],
    "enable_metrics": {
      "encode": true
    }, "stream": true
    }'

Response

data: {"id":"chatcmpl-dd73b1841fc145abb388e2f0ab9ec3d1","object":"chat.completion.chunk","created":1766392398,"model":"/workspace/models/Qwen2.5-VL-7B-Instruct","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"prompt_token_ids":null,"metrics":{"encode_time_ms":70}}

data: {"id":"chatcmpl-dd73b1841fc145abb388e2f0ab9ec3d1","object":"chat.completion.chunk","created":1766392398,"model":"/workspace/models/Qwen2.5-VL-7B-Instruct","choices":[{"index":0,"delta":{"content":"The"},"logprobs":null,"finish_reason":null,"token_ids":null}]}

....

data: {"id":"chatcmpl-dd73b1841fc145abb388e2f0ab9ec3d1","object":"chat.completion.chunk","created":1766392398,"model":"/workspace/models/Qwen2.5-VL-7B-Instruct","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":"stop","stop_reason":"None","token_ids":null}]}

data: [DONE]
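
In the streaming output above, `metrics` appears on the first chunk only. A minimal sketch of picking it out of the `data:` lines on the client side is shown below. The function name `encode_time_from_stream` is hypothetical, and the sample lines are abbreviated from the output above.

```python
import json
from typing import Iterable, Optional


def encode_time_from_stream(lines: Iterable[str]) -> Optional[int]:
    """Scan SSE lines and return the first metrics.encode_time_ms found."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        metrics = chunk.get("metrics")
        if metrics and "encode_time_ms" in metrics:
            return metrics["encode_time_ms"]
    return None


sample = [
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""}}],"metrics":{"encode_time_ms":70}}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"The"}}]}',
    "",
    "data: [DONE]",
]
print(encode_time_from_stream(sample))
```

Scanning every chunk rather than only the first keeps the sketch working if the service ever attaches `metrics` to a later chunk.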

Text only

curl -X POST http://127.0.0.1:5580/v1/chat/completions   -H "Content-Type: application/json"   -d '{
    "model": "/workspace/models/Qwen2.5-VL-7B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the text in the illustrate?"}
    ],
    "enable_metrics": {
      "encode": true
    }
  }'

Response

{"id":"chatcmpl-13c0beac94a1494abaaafd304bd07fbf","object":"chat.completion","created":1766480757,"model":"/workspace/models/Qwen2.5-VL-7B-Instruct","choices":[{"index":0,"message":{"role":"assistant","content":"I'm sorry, but you haven't provided an image or any text for me to describe. Could you please upload an image or provide the text directly so I can assist you better?","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":"None","token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":27,"total_tokens":65,"completion_tokens":38,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null,"metrics":{"encode_time_ms":0}}

Offline testing shows no issues.

Existing tests pass
New tests added (if applicable)
Manual testing performed

Test Coverage

Documentation

Documentation updated (if needed)
Code comments added/updated
API documentation updated (if applicable)

Checklist

Screenshots/Output

Additional Notes

Reviewer Checklist


Copilot

Pull request overview

This PR adds encoder execution timing functionality to track and report the time taken for encoding multimodal data in requests. When enable_metrics["encode"] is set to true in the request, the system now captures and returns the encoding time in milliseconds.

Key Changes:

Added enable_metrics parameter support to capture encoder timing metrics
Implemented encoder execution time tracking and reporting in both streaming and non-streaming responses
Extended protocol classes to support metrics collection and propagation

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| lm_service/protocol/protocol.py | Added `capture_metrics_result` field to `GenerationResponse` for storing metrics data and `enable_metrics` field to `GenerationRequest` to control which metrics to capture |
| lm_service/apis/vllm/proxy.py | Implemented encoder timing logic by extracting `enable_metrics` from prompt, calculating encode time, and adding helper functions `metrics_enabled()` and `cal_exec_time()` to support metrics functionality |
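
To make the protocol change concrete, here is a rough sketch of how the two fields could sit on the request and response types. Only the class names and the `enable_metrics` and `capture_metrics_result` field names come from this PR. The use of `dataclasses`, the field types, and the `request_id` and `text` fields are assumptions for illustration and may not match `protocol.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationRequest:
    request_id: str  # hypothetical field, for illustration only
    # Which metrics to capture, e.g. {"encode": True}; None means none requested.
    enable_metrics: Optional[dict[str, bool]] = None


@dataclass
class GenerationResponse:
    request_id: str  # hypothetical field, for illustration only
    text: str = ""  # hypothetical field, for illustration only
    # Captured metrics; the value type and key used below are assumed.
    capture_metrics_result: Optional[dict[str, int]] = None


req = GenerationRequest(request_id="r1", enable_metrics={"encode": True})
resp = GenerationResponse(request_id=req.request_id)
if req.enable_metrics and req.enable_metrics.get("encode"):
    # 56 is the value from the non-streaming test output in this PR.
    resp.capture_metrics_result = {"encode_time_ms": 56}
```

Defaulting both fields to `None` keeps requests and responses that do not use metrics unchanged, which fits the non-breaking intent of the change.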



LJH-LBJ added 8 commits December 19, 2025 16:46

add_enable_metrics

71441c3

Signed-off-by: Junhong <[email protected]>

fix

9287c5d

Signed-off-by: Junhong <[email protected]>

fix

fd4774e

Signed-off-by: Junhong <[email protected]>

fix bug

72c7fad

Signed-off-by: Junhong <[email protected]>

Update proxy.py

bb11237

Signed-off-by: Junhong <[email protected]>

Update proxy.py

bcaee8e

Signed-off-by: Junhong <[email protected]>

Update proxy.py

a863b8b

Signed-off-by: Junhong <[email protected]>

Merge branch 'JiusiServe:main' into add_encoder_time

52907d5

github-actions bot added core api vllm protocol labels Dec 22, 2025

LJH-LBJ added 5 commits December 22, 2025 20:17

fix pre-commit

4cd9cfe

Signed-off-by: Junhong <[email protected]>

Merge branch 'add_encoder_time' of https://github.com/LJH-LBJ/llm-ser…

724ac0d

…vice into add_encoder_time

fix

954a549

Signed-off-by: Junhong <[email protected]>

fix pre-commit

9dc1329

Signed-off-by: Junhong <[email protected]>

fix pre-commit

665d3ff

Signed-off-by: Junhong <[email protected]>

wuhang2014 requested a review from Copilot December 23, 2025 06:34

Copilot started reviewing on behalf of wuhang2014 December 23, 2025 06:35

Copilot AI reviewed Dec 23, 2025


LJH-LBJ changed the title ~~Add encoder time~~ [Feature] Add encoder time Dec 23, 2025

fix pure text

5c808fa

Signed-off-by: Junhong <[email protected]>

github-actions bot added the workers label Dec 23, 2025

LJH-LBJ and others added 4 commits December 23, 2025 17:33

Update lm_service/apis/vllm/proxy.py

8bb0847

Co-authored-by: Copilot <[email protected]> Signed-off-by: Junhong Liu <[email protected]>

Update proxy.py

bb637db

Signed-off-by: Junhong <[email protected]>

Update proxy.py

ca3e557

Signed-off-by: Junhong <[email protected]>

Update disagg_worker.py

abbf2ff

Signed-off-by: Junhong <[email protected]>

wuhang2014 merged commit a295f06 into JiusiServe:main Dec 25, 2025
10 checks passed
