Skip to content

merge autoscalingpolicybinding to autoscalingpolicy#1203

Open
LiZhenCheng9527 wants to merge 1 commit into
volcano-sh:mainfrom
LiZhenCheng9527:merge-crd
Open

merge autoscalingpolicybinding to autoscalingpolicy#1203
LiZhenCheng9527 wants to merge 1 commit into
volcano-sh:mainfrom
LiZhenCheng9527:merge-crd

Conversation

@LiZhenCheng9527

@LiZhenCheng9527 LiZhenCheng9527 commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

What type of PR is this?
/kind feature

What this PR does / why we need it:

#1172 describes that autoscalingPolicybinding needs to be merged into autoscalingPolicy to support the autoScaler with PD separation.

This PR merges two CRDs and modifies the code to ensure the functionality of the original features.

NOTE: the old SubTarget / per-role JSON-patch path is gone, so the autoscaler now only scales the whole ServingGroup. We will implement the scaling of roles in DisaggregatedTarget in a subsequent PR.

Which issue(s) this PR fixes:
A part of #905
Follow-up of #1172

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Copilot AI review requested due to automatic review settings June 15, 2026 08:09
@volcano-sh-bot

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign git-malu, yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the autoscaling architecture by removing the AutoscalingPolicyBinding custom resource definition and consolidating target configurations (homogeneousTarget, heterogeneousTarget, and disaggregatedTarget) directly into the AutoscalingPolicy specification. This change simplifies the user experience by managing autoscaling policies and targets within a single resource. The controller, CLI, webhooks, client-go code, documentation, and tests have been updated accordingly. Feedback on the changes highlights a critical bug where the revision hash is calculated using the original configuration instead of the final spec, which can prevent policy updates. Additionally, an improvement is suggested to default to one replica when instance.Spec.Replicas is nil, avoiding misleading 'unsupported' errors during reconciliation.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread pkg/model-booster-controller/convert/autoscaling.go Outdated
Comment thread pkg/autoscaler/controller/autoscale_controller.go

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR consolidates autoscaling configuration by removing the AutoscalingPolicyBinding CRD and moving target/binding configuration directly into AutoscalingPolicy, updating controllers, generated clients, tests, and documentation to align with the new single-resource API.

Changes:

  • Remove AutoscalingPolicyBinding API surface (CRD, types, webhook, CLI, client-go informers/listers) and migrate examples/docs/tests to AutoscalingPolicy-only configuration.
  • Update autoscaler and model-booster-controller flows to reconcile autoscaling using AutoscalingPolicy.spec.{homogeneousTarget,heterogeneousTarget,disaggregatedTarget} directly.
  • Extend AutoscalingPolicy API types with target fields and richer status structs (including disaggregated/ratio-related status types).

Reviewed changes

Copilot reviewed 60 out of 76 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
test/e2e/controller-manager/webhook_test.go Removes AutoscalingPolicyBinding webhook e2e coverage and creation path.
test/e2e/controller-manager/testdata/model-serving-sglang-mocker-binding-lifecycle.yaml Deletes binding-lifecycle e2e test ModelServing manifest.
test/e2e/controller-manager/testdata/autoscaling-policy-e2e-sglang-policy-lifecycle.yaml Adds homogeneousTarget directly onto AutoscalingPolicy testdata.
test/e2e/controller-manager/testdata/autoscaling-policy-e2e-sglang-binding-lifecycle.yaml Deletes binding-lifecycle AutoscalingPolicy testdata.
test/e2e/controller-manager/testdata/autoscaling-policy-binding-e2e-policy-lifecycle.yaml Deletes AutoscalingPolicyBinding testdata for policy lifecycle.
test/e2e/controller-manager/testdata/autoscaling-policy-binding-e2e-binding-lifecycle-b.yaml Deletes AutoscalingPolicyBinding testdata (binding lifecycle B).
test/e2e/controller-manager/testdata/autoscaling-policy-binding-e2e-binding-lifecycle-a.yaml Deletes AutoscalingPolicyBinding testdata (binding lifecycle A).
test/e2e/controller-manager/autoscaling_test.go Removes binding lifecycle e2e test; updates policy lifecycle to single resource; updates webhook test helpers.
pkg/model-booster-controller/webhook/README.md Removes AutoscalingPolicyBinding webhook endpoints from documentation.
pkg/model-booster-controller/webhook/autoscalingpolicy_validator.go Adds target validation to webhook (exactly one target + targetRef validation).
pkg/model-booster-controller/webhook/autoscalingpolicy_validator_test.go Updates tests to include required target fields on AutoscalingPolicy.
pkg/model-booster-controller/webhook/autoscaling_binding_validator.go Deletes AutoscalingPolicyBinding validator implementation.
pkg/model-booster-controller/webhook/autoscaling_binding_validator_test.go Deletes AutoscalingPolicyBinding validator tests.
pkg/model-booster-controller/convert/testdata/expected/scaling-asp.yaml Updates expected conversion output to single AutoscalingPolicy with target + metricSources.
pkg/model-booster-controller/convert/testdata/expected/scaling-asp-binding.yaml Deletes expected AutoscalingPolicyBinding conversion output.
pkg/model-booster-controller/convert/testdata/expected/optimize-asp.yaml Deletes expected optimize AutoscalingPolicy output (previous binding-based flow).
pkg/model-booster-controller/convert/testdata/expected/optimize-asp-binding.yaml Deletes expected optimize AutoscalingPolicyBinding output.
pkg/model-booster-controller/convert/autoscaling.go Embeds scaling target directly into generated AutoscalingPolicy objects.
pkg/model-booster-controller/convert/autoscaling_test.go Removes binding conversion unit test.
pkg/model-booster-controller/controller/model_booster_controller.go Removes informer/lister wiring and delete handler for AutoscalingPolicyBinding.
pkg/model-booster-controller/controller/model_booster_controller_test.go Updates reconcile expectations to only create AutoscalingPolicy (no binding).
pkg/model-booster-controller/controller/autoscaling_policy_handler.go Removes creation/update of AutoscalingPolicyBinding; keeps AutoscalingPolicy only.
pkg/autoscaler/util/client.go Removes role sub-target label selection behavior.
pkg/autoscaler/util/client_test.go Removes tests for role sub-target label selection.
pkg/autoscaler/controller/autoscale_controller.go Switches autoscaler reconcile loop from bindings to policies; skips disaggregatedTarget for now.
pkg/autoscaler/controller/autoscale_controller_test.go Updates tests to pass targets via AutoscalingPolicy rather than binding.
pkg/autoscaler/autoscaler/scaler.go Updates Autoscaler constructor/update logic to use policy-only configuration.
pkg/autoscaler/autoscaler/optimizer.go Updates Optimizer constructor/update logic to use policy-only configuration.
pkg/autoscaler/autoscaler/optimizer_test.go Updates optimizer meta tests to use AutoscalingPolicy instead of binding.
pkg/autoscaler/autoscaler/metric_collector.go Adjusts collector scope/ownership from binding UID to policy UID.
pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go Removes AutoscalingPolicyBinding deep-copies; adds deep-copies for new target/status types.
pkg/apis/workload/v1alpha1/register.go Removes AutoscalingPolicyBinding types from scheme registration.
pkg/apis/workload/v1alpha1/autoscalingpolicybinding_types.go Deletes AutoscalingPolicyBinding API types.
pkg/apis/workload/v1alpha1/autoscalingpolicy_types.go Adds target fields (homogeneous/heterogeneous/disaggregated) + richer status types to AutoscalingPolicy.
examples/prometheus-autoscaler/serviceMonitor.yaml Adjusts Prometheus scrape configuration notes/labels after binding removal.
examples/prometheus-autoscaler/autoscalingbinding.yaml Deletes AutoscalingPolicyBinding example manifest.
examples/prometheus-autoscaler/autoscalerPolicy.yaml Adds target configuration directly into AutoscalingPolicy example.
docs/kthena/sidebars.ts Removes CLI doc entry for autoscaling-policy-bindings.
docs/kthena/docs/user-guide/autoscaler.md Updates user guide from 2-resource model to single AutoscalingPolicy target config.
docs/kthena/docs/reference/kthena-cli/kthena_get.md Removes CLI reference to kthena get autoscaling-policy-bindings.
docs/kthena/docs/reference/kthena-cli/kthena_get_autoscaling-policy-bindings.md Deletes CLI reference page for bindings.
docs/kthena/docs/reference/crd/workload.serving.volcano.sh.md Removes bindings from CRD reference; adds disaggregated target/status documentation.
docs/kthena/docs/architecture/model-booster-controller.md Updates architecture doc to remove AutoscalingPolicyBinding from controller outputs.
cmd/kthena-controller-manager/main.go Removes AutoscalingPolicyBinding webhook wiring and client initialization used by it.
client-go/listers/workload/v1alpha1/expansion_generated.go Removes lister expansion interfaces for bindings.
client-go/listers/workload/v1alpha1/autoscalingpolicybinding.go Deletes generated lister for bindings.
client-go/informers/externalversions/workload/v1alpha1/interface.go Removes binding informer from workload informer interface.
client-go/informers/externalversions/workload/v1alpha1/autoscalingpolicybinding.go Deletes generated informer for bindings.
client-go/informers/externalversions/generic.go Removes generic informer mapping for bindings.
client-go/clientset/versioned/typed/workload/v1alpha1/workload_client.go Removes typed client getter for bindings.
client-go/clientset/versioned/typed/workload/v1alpha1/generated_expansion.go Removes generated expansion for binding client.
client-go/clientset/versioned/typed/workload/v1alpha1/fake/fake_workload_client.go Removes fake typed client getter for bindings.
client-go/clientset/versioned/typed/workload/v1alpha1/fake/fake_autoscalingpolicybinding.go Deletes fake typed client implementation for bindings.
client-go/clientset/versioned/typed/workload/v1alpha1/autoscalingpolicybinding.go Deletes typed client implementation for bindings.
client-go/applyconfiguration/workload/v1alpha1/targetscalingstatus.go Adds applyconfiguration for new TargetScalingStatus.
client-go/applyconfiguration/workload/v1alpha1/target.go Removes SubTarget from applyconfiguration Target type.
client-go/applyconfiguration/workload/v1alpha1/subtarget.go Deletes applyconfiguration SubTarget type.
client-go/applyconfiguration/workload/v1alpha1/rolescalingparam.go Adds applyconfiguration for new RoleScalingParam.
client-go/applyconfiguration/workload/v1alpha1/roleratiostatus.go Adds applyconfiguration for new RoleRatioStatus.
client-go/applyconfiguration/workload/v1alpha1/roleratioconstraint.go Adds applyconfiguration for new RoleRatioConstraint.
client-go/applyconfiguration/workload/v1alpha1/disaggregatedtarget.go Adds applyconfiguration for new DisaggregatedTarget.
client-go/applyconfiguration/workload/v1alpha1/disaggregatedscalingstatus.go Adds applyconfiguration for new DisaggregatedScalingStatus.
client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicystatus.go Adds applyconfiguration for new AutoscalingPolicyStatus.
client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicyspec.go Extends applyconfiguration spec with target fields.
client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicybindingstatus.go Deletes applyconfiguration for binding status.
client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicybindingspec.go Deletes applyconfiguration for binding spec.
client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicybinding.go Deletes applyconfiguration for binding CR.
client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicy.go Updates AutoscalingPolicy applyconfiguration to use new status apply type.
client-go/applyconfiguration/utils.go Removes binding kinds and adds new target/status kinds to kind->apply mapping.
cli/kthena/cmd/get.go Removes kthena get autoscaling-policy-bindings command.
cli/kthena/cmd/create.go Removes CLI create support for AutoscalingPolicyBinding objects.
charts/kthena/charts/workload/templates/kthena-controller-manager/rbac/cluster-role.yaml Removes RBAC permissions for autoscalingpolicybindings resources.
charts/kthena/charts/workload/templates/kthena-controller-manager/component/validating-webhook.yaml Removes validating webhook registration for AutoscalingPolicyBinding.
charts/kthena/charts/workload/crds/workload.serving.volcano.sh_autoscalingpolicybindings.yaml Deletes CRD manifest for AutoscalingPolicyBinding from chart.
Files not reviewed (16)
  • client-go/applyconfiguration/utils.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicy.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicyspec.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicystatus.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/disaggregatedscalingstatus.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/disaggregatedtarget.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/roleratioconstraint.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/roleratiostatus.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/rolescalingparam.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/target.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/targetscalingstatus.go: Generated file
  • client-go/clientset/versioned/typed/workload/v1alpha1/fake/fake_workload_client.go: Generated file
  • client-go/clientset/versioned/typed/workload/v1alpha1/generated_expansion.go: Generated file
  • client-go/clientset/versioned/typed/workload/v1alpha1/workload_client.go: Generated file
  • client-go/informers/externalversions/generic.go: Generated file
  • client-go/informers/externalversions/workload/v1alpha1/interface.go: Generated file
Comments suppressed due to low confidence (1)

pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go:1101

  • Target.DeepCopyInto has the same map iteration issue: val.DeepCopy() is called on a map iteration value (non-addressable) for map[string]MetricSource. MetricSource.DeepCopy has a pointer receiver, so this will not compile. Use a local variable before calling DeepCopy (or deep-copy the fields manually).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go
Comment thread pkg/model-booster-controller/convert/autoscaling.go
Comment thread pkg/model-booster-controller/webhook/autoscalingpolicy_validator.go
Comment thread pkg/autoscaler/controller/autoscale_controller.go
Comment thread pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go

@hzxuzhonghu hzxuzhonghu left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on — collapsing the two CRDs into one really does make the API easier to reason about, and the doc comments + examples on the new types are excellent. I left a few inline notes. Most are small; the two I'd like to settle before merge are (1) the silent skip of DisaggregatedTarget and (2) confirming the removal of sub-target/role scaling is intended. Nothing blocking on the structure itself.

Comment thread pkg/model-booster-controller/convert/autoscaling.go Outdated
Comment thread pkg/model-booster-controller/convert/autoscaling.go
_, err = ac.client.WorkloadV1alpha1().ModelServings(namespaceScope).Patch(
ctx, targetRef.Name, types.JSONPatchType, patchBytes, metav1.PatchOptions{})
return err
if instance.Spec.Replicas != nil && *instance.Spec.Replicas == replicas {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to confirm this is intentional: the old SubTarget / per-role JSON-patch path is gone, so the autoscaler now only scales the whole ModelServing. I'm assuming role-level scaling is being handed over to DisaggregatedTarget (not wired up yet). If anyone is already relying on sub-target scaling with the current API, this would quietly stop scaling their roles — if that's possible, it probably deserves a release note.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A relevant description has already been added to the PR Description

scalerSet.Insert(formatAutoscalerMapKey(policy.Namespace, policy.Name, &policy.Spec.HomogeneousTarget.Target.TargetRef))
} else if policy.Spec.HeterogeneousTarget != nil {
optimizerSet.Insert(formatAutoscalerMapKey(policy.Namespace, policy.Name, nil))
} else if policy.Spec.DisaggregatedTarget != nil {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here (and again in schedule) a DisaggregatedTarget policy is accepted but only logged at V(4) and skipped — and the webhook admits it too. So a user can apply a 'valid' disaggregated policy and get nothing back, with no visible signal. Could we either reject it in the webhook for now, or surface a status condition / event like DisaggregatedTarget not yet supported? A silent no-op is hard to debug from the user side.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have adjusted the log level. Users will be notified.

allErrs = append(allErrs, v.validateTarget(policy)...)

// Require metrics for homogeneous/heterogeneous targets. DisaggregatedTarget may use per-role metrics.
if policy.Spec.HomogeneousTarget != nil || policy.Spec.HeterogeneousTarget != nil {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the type docs promise that spec.metrics and per-role metrics are mutually exclusive, and that a disaggregated policy must set one or the other. The webhook doesn't check either yet (a disaggregated policy with no metrics anywhere passes). Fine to defer until disaggregated is implemented — maybe just leave a // TODO here so it doesn't get lost.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO comment.

Copilot AI review requested due to automatic review settings June 16, 2026 06:52

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 60 out of 76 changed files in this pull request and generated 5 comments.

Files not reviewed (16)
  • client-go/applyconfiguration/utils.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicy.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicyspec.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicystatus.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/disaggregatedscalingstatus.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/disaggregatedtarget.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/roleratioconstraint.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/roleratiostatus.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/rolescalingparam.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/target.go: Generated file
  • client-go/applyconfiguration/workload/v1alpha1/targetscalingstatus.go: Generated file
  • client-go/clientset/versioned/typed/workload/v1alpha1/fake/fake_workload_client.go: Generated file
  • client-go/clientset/versioned/typed/workload/v1alpha1/generated_expansion.go: Generated file
  • client-go/clientset/versioned/typed/workload/v1alpha1/workload_client.go: Generated file
  • client-go/informers/externalversions/generic.go: Generated file
  • client-go/informers/externalversions/workload/v1alpha1/interface.go: Generated file

Comment thread pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go
Comment thread pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go
Comment thread pkg/model-booster-controller/convert/autoscaling.go
Comment thread docs/kthena/docs/user-guide/autoscaler.md Outdated
Comment thread pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go
Signed-off-by: LiZhenCheng9527 <lizhencheng6@huawei.com>
// exclusive: set spec.metrics to scale every role on the same signals, or
// set metrics on every role and leave spec.metrics empty.
// +optional
Metrics []AutoscalingPolicyMetric `json:"metrics,omitempty"`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because AutoscalingPolicySpec is embedded in ModelBooster, making metrics optional here also lets users create a ModelBooster with spec.autoscalingPolicy but no metrics. The controller then generates a homogeneous AutoscalingPolicy, and the AutoscalingPolicy webhook rejects it because homogeneous targets still require at least one metric. Could we add ModelBooster validation for this case, or keep the embedded schema from accepting an empty metrics list for ModelBooster?

@hzxuzhonghu

Copy link
Copy Markdown
Member

Please fill out the Does this PR introduce a user-facing change?, which should be also highlighted in release note

`

num_requests_waiting:
type: Pod
pod:
name: num_requests_waiting

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THis is not working, please note this is the userguide, not a PLACEHOLDER

type TargetScalingStatus struct {
// Name identifies the unit. For HomogeneousTarget it is the ModelServing
// name; for a role it is the role name.
Name string `json:"name"`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for HomogeneousTarget, the name can be empty


// RatioStatus reports the observed value of the configured ratio constraint.
// +optional
RatioStatus *RoleRatioStatus `json:"ratioStatus,omitempty"`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are three roles, the single ratio cannot express them

type MetricSource struct {
// Type selects the metric source backend.
// +kubebuilder:default="Pod"
Type MetricSourceType `json:"type"`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As both Pod and Prometheus are of pointer type, so this Type seems redudant

//
// NOTE: Not yet implemented; reserved for future use. The runtime currently
// uses default TLS verification regardless of these fields.
type PrometheusTLSConfig struct {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove if this is not added into PrometheusAuth

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants