add kvcache aware plugin e2e test by FAUST-BENCHOU · Pull Request #1222 · volcano-sh/kthena

FAUST-BENCHOU · 2026-06-16T05:00:45Z

What type of PR is this?

What this PR does / why we need it:

After #1199 and #1178, only the kvcache-aware plugin is left.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Signed-off-by: zhoujinyu <2319109590@qq.com>

volcano-sh-bot · 2026-06-16T05:00:51Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

Needs approval from an approver in each of these files:

test/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds an end-to-end KV-cache-aware scheduler plugin test chain by introducing a ZMQ bridge and wiring the mock deployment to publish/consume KV events through runtime + Redis.

Changes:

Added a Python ZMQ bridge helper and created it as a ConfigMap during E2E setup.
Updated the mock deployment to include zmq-bridge and runtime containers and enabled the sim’s native ZMQ publishing.
Added a new KV-cache-aware E2E test plus Redis/rollout helper functions.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 11 comments.

File	Description
test/e2e/router/router-plugins/testdata/zmq-bridge.py	New helper script to forward sim ZMQ events into the runtime topic layout
test/e2e/router/router-plugins/testdata/LLM-Mock-plugins.yaml	Adds bridge + runtime sidecars and config/ports to enable full KV-cache-aware chain
test/e2e/router/router-plugins/context/context.go	Creates ConfigMap containing the bridge script before deploying mocks
test/e2e/router/plugins_test.go	Adds `TestSchedulerPluginKVCacheAware` full-chain routing preference test
test/e2e/router/plugins_helpers.go	Adds Redis port-forward client setup and Redis-wait helpers for KV mappings

gemini-code-assist

Code Review

This pull request introduces end-to-end testing for the kvcache-aware scheduling plugin, adding a ZMQ bridge helper script, configuring a runtime container in the mock LLM deployment, and implementing the TestSchedulerPluginKVCacheAware test. The review feedback focuses on improving test robustness and preventing flakiness. Key recommendations include refactoring setupRedisClient to return errors instead of asserting inside Eventually, resolving the zmq-bridge.py path dynamically using runtime.Caller to support running tests from package subdirectories, avoiding redundant rolling updates in patchMockRedisHost when the host is already set, and adding a buffer to the log scraping start time to account for potential clock skew.

gemini-code-assist · 2026-06-16T05:02:51Z

+func setupRedisClient(t *testing.T, kube kubernetes.Interface, namespace string) (*redis.Client, func()) {
+	t.Helper()
+	pods := utils.ListReadyPodsByLabel(t, kube, namespace, redisServerAppLabel)
+	require.NotEmpty(t, pods, "no ready redis pods in namespace %s", namespace)
+
+	localPort := utils.AllocateLocalPort(t)
+	pf, err := utils.SetupPortForwardToPod(namespace, pods[0].Name, localPort, "6379")
+	require.NoError(t, err, "port-forward to redis pod %s", pods[0].Name)
+
+	addr := fmt.Sprintf("127.0.0.1:%s", localPort)
+	client := redis.NewClient(&redis.Options{Addr: addr})
+	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+	defer cancel()
+	require.NoError(t, client.Ping(ctx).Err(), "redis ping via port-forward")
+
+	return client, func() {
+		_ = client.Close()
+		pf.Close()
+	}
+}


Using require assertions (which call t.FailNow()) inside a helper function that is executed within require.Eventually defeats the purpose of retrying. If any transient error occurs (such as a temporary port-forwarding failure or a Redis ping timeout while the container is starting up), the test will fail immediately instead of waiting and retrying. Refactor setupRedisClient to return an error so that the caller can handle it gracefully inside Eventually.

func setupRedisClient(t *testing.T, kube kubernetes.Interface, namespace string) (*redis.Client, func(), error) {
	t.Helper()
	pods := utils.ListReadyPodsByLabel(t, kube, namespace, redisServerAppLabel)
	if len(pods) == 0 {
		return nil, nil, fmt.Errorf("no ready redis pods in namespace %s", namespace)
	}

	localPort := utils.AllocateLocalPort(t)
	pf, err := utils.SetupPortForwardToPod(namespace, pods[0].Name, localPort, "6379")
	if err != nil {
		return nil, nil, fmt.Errorf("port-forward to redis pod %s: %w", pods[0].Name, err)
	}

	addr := fmt.Sprintf("127.0.0.1:%s", localPort)
	client := redis.NewClient(&redis.Options{Addr: addr})
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := client.Ping(ctx).Err(); err != nil {
		_ = client.Close()
		pf.Close()
		return nil, nil, fmt.Errorf("redis ping via port-forward: %w", err)
	}

	return client, func() {
		_ = client.Close()
		pf.Close()
	}, nil
}

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

FAUST-BENCHOU · 2026-06-16T12:42:56Z

This is just a compatibility shim. The ZMQ topic published by the sim does not match the topic the runtime listens on: the sim uses a per-pod topic like kv@<podIP>@<model>, while the runtime only subscribes to kv-events. Without conversion, the runtime receives no kv-cache events and nothing is written to Redis.

Example: without the bridge, the sim publishes a message with the topic kv@10.0.1.23@llm-mock, and the runtime, subscribed to kv-events, never consumes it. With the bridge, zmq_bridge rewrites the topic of the same message to kv-events and forwards it to the runtime (payload unchanged), so the runtime can write to Redis.
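The topic rewrite described above can be illustrated without any ZMQ dependency. This is a sketch in Go, not the PR's Python bridge script: `rewriteTopic` and `runtimeTopic` are hypothetical names, and it assumes the topic travels as the first frame of a multipart message, which is the usual pub/sub convention but is not confirmed by this thread.

```go
package main

import (
	"bytes"
	"fmt"
)

// runtimeTopic is the topic the runtime subscribes to, per the comment above.
var runtimeTopic = []byte("kv-events")

// rewriteTopic returns a copy of a multipart message whose first frame (assumed
// to be the topic) is replaced by runtimeTopic when it carries the per-pod
// "kv@" prefix. Other messages are returned unchanged. Payload frames are
// never modified.
func rewriteTopic(frames [][]byte) [][]byte {
	if len(frames) == 0 || !bytes.HasPrefix(frames[0], []byte("kv@")) {
		return frames
	}
	out := make([][]byte, len(frames))
	copy(out, frames)
	out[0] = runtimeTopic
	return out
}

func main() {
	msg := [][]byte{[]byte("kv@10.0.1.23@llm-mock"), []byte("payload")}
	fwd := rewriteTopic(msg)
	fmt.Printf("%s -> %s (payload %q)\n", msg[0], fwd[0], fwd[1])
}
```

Copying the frame slice before replacing the topic leaves the caller's original message intact, which keeps the function safe to use if the same message is also logged or inspected.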

FAUST-BENCHOU · 2026-06-16T12:53:02Z


 // SetupPluginComponents deploys fast/slow plugin mocks and ModelServers shared by plugin e2e tests.
-func SetupPluginComponents(kubeClient *kubernetes.Clientset, kthenaClient *clientset.Clientset, namespace string) error {
+func SetupPluginComponents(kubeClient *kubernetes.Clientset, kthenaClient *clientset.Clientset, namespace, kthenaNamespace string) error {


REDIS_HOST is now written into the ConfigMap before SetupPluginComponents deploys the mocks. I tried patching mid-deployment, but the patch triggered Pod restarts, which interrupted warmup and kv cache writes.

FAUST-BENCHOU · 2026-06-16T12:56:08Z


 // DirectChatToPod sends count streaming chat requests directly to a pod via port-forward.
-func DirectChatToPod(t *testing.T, pod corev1.Pod, model, prompt string, count int) {
+func DirectChatToPod(t *testing.T, pod corev1.Pod, model, prompt string, count, maxTokens int) {


In the kvcache-aware chain, the sim's kv-cache capacity (block count) is very small. During testing, I found that if chat requests generate too many tokens, they can easily fill the kv cache and trigger a 500 error, so I simply added a token limit.

FAUST-BENCHOU · 2026-06-16T12:57:14Z

@hzxuzhonghu ptal

YaoZengzeng · 2026-06-18T06:35:32Z

+
+	route := utils.CreateModelRouteFromFile(t, ctx, testCtx.KthenaClient, plugincontext.TestDataDir, testNamespace, "ModelRoute-plugins.yaml")
+	model := route.Spec.ModelName
+	// Must fit sim kv-cache-size=8 blocks (block-size=8): prompt blocks + max_tokens blocks <= 8.


Isn't it >= 8?

No, <= 8 refers to the maximum kv-cache size of the sim (kv-cache-size=8), not a requirement of at least 8 blocks. We keep the prompt and max_tokens very small so that prompt blocks + output blocks do not exceed 8, which would result in a 500 error.

Maybe the comment needs to be updated.
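The block budget discussed in this thread can be checked with a little arithmetic. The sketch below assumes the sim rounds prompt tokens and generated tokens up to whole blocks independently. That rounding rule, the helper names `blocksFor` and `fitsCache`, and the token counts in `main` are illustrative assumptions; only block-size=8 and kv-cache-size=8 come from the test comment.

```go
package main

import "fmt"

// blocksFor returns how many cache blocks are needed to hold the given number
// of tokens, using ceiling division.
func blocksFor(tokens, blockSize int) int {
	return (tokens + blockSize - 1) / blockSize
}

// fitsCache reports whether the prompt blocks plus the blocks for up to
// maxTokens generated tokens stay within cacheBlocks.
func fitsCache(promptTokens, maxTokens, blockSize, cacheBlocks int) bool {
	return blocksFor(promptTokens, blockSize)+blocksFor(maxTokens, blockSize) <= cacheBlocks
}

func main() {
	const blockSize, cacheBlocks = 8, 8 // values quoted in the test comment

	// Hypothetical request sizes, chosen only to show both outcomes.
	fmt.Println(fitsCache(24, 16, blockSize, cacheBlocks)) // 3 + 2 = 5 blocks
	fmt.Println(fitsCache(48, 32, blockSize, cacheBlocks)) // 6 + 4 = 10 blocks
}
```

Under these assumptions the first request fits and the second does not, which matches the reasoning that the constraint is an upper bound (<= 8) rather than a minimum.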

FAUST-BENCHOU added 2 commits June 16, 2026 12:59

add kvcache-aware plugin e2e test

a99fd49

Signed-off-by: zhoujinyu <2319109590@qq.com>

lint

b77a5eb

Signed-off-by: zhoujinyu <2319109590@qq.com>

Copilot AI review requested due to automatic review settings June 16, 2026 05:00

volcano-sh-bot added the do-not-merge/work-in-progress label Jun 16, 2026

volcano-sh-bot requested review from YaoZengzeng and hzxuzhonghu June 16, 2026 05:00

volcano-sh-bot added the size/L label Jun 16, 2026

Copilot AI reviewed Jun 16, 2026

gemini-code-assist Bot reviewed Jun 16, 2026

move

9e639b7

Signed-off-by: zhoujinyu <2319109590@qq.com>

upd

b111454

Signed-off-by: zhoujinyu <2319109590@qq.com>

FAUST-BENCHOU force-pushed the e2e/kvcache-aware branch from d50e36a to b111454 Compare June 16, 2026 10:33

FAUST-BENCHOU changed the title from "E2e/kvcache aware" to "add kvcache aware plugin e2e test" Jun 16, 2026

FAUST-BENCHOU marked this pull request as ready for review June 16, 2026 12:29

volcano-sh-bot removed the do-not-merge/work-in-progress label Jun 16, 2026

FAUST-BENCHOU commented Jun 16, 2026

YaoZengzeng reviewed Jun 18, 2026

Comment thread test/e2e/router/plugins_test.go Outdated

log

2bdf0db

Signed-off-by: zhoujinyu <2319109590@qq.com>
