merge autoscalingpolicybinding to autoscalingpolicy#1203
merge autoscalingpolicybinding to autoscalingpolicy#1203LiZhenCheng9527 wants to merge 1 commit into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Code Review
This pull request refactors the autoscaling architecture by removing the AutoscalingPolicyBinding custom resource definition and consolidating target configurations (homogeneousTarget, heterogeneousTarget, and disaggregatedTarget) directly into the AutoscalingPolicy specification. This change simplifies the user experience by managing autoscaling policies and targets within a single resource. The controller, CLI, webhooks, client-go code, documentation, and tests have been updated accordingly. Feedback on the changes highlights a critical bug where the revision hash is calculated using the original configuration instead of the final spec, which can prevent policy updates. Additionally, an improvement is suggested to default to one replica when instance.Spec.Replicas is nil, avoiding misleading 'unsupported' errors during reconciliation.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
There was a problem hiding this comment.
Pull request overview
This PR consolidates autoscaling configuration by removing the AutoscalingPolicyBinding CRD and moving target/binding configuration directly into AutoscalingPolicy, updating controllers, generated clients, tests, and documentation to align with the new single-resource API.
Changes:
- Remove
AutoscalingPolicyBindingAPI surface (CRD, types, webhook, CLI, client-go informers/listers) and migrate examples/docs/tests toAutoscalingPolicy-only configuration. - Update autoscaler and model-booster-controller flows to reconcile autoscaling using
AutoscalingPolicy.spec.{homogeneousTarget,heterogeneousTarget,disaggregatedTarget}directly. - Extend
AutoscalingPolicyAPI types with target fields and richer status structs (including disaggregated/ratio-related status types).
Reviewed changes
Copilot reviewed 60 out of 76 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| test/e2e/controller-manager/webhook_test.go | Removes AutoscalingPolicyBinding webhook e2e coverage and creation path. |
| test/e2e/controller-manager/testdata/model-serving-sglang-mocker-binding-lifecycle.yaml | Deletes binding-lifecycle e2e test ModelServing manifest. |
| test/e2e/controller-manager/testdata/autoscaling-policy-e2e-sglang-policy-lifecycle.yaml | Adds homogeneousTarget directly onto AutoscalingPolicy testdata. |
| test/e2e/controller-manager/testdata/autoscaling-policy-e2e-sglang-binding-lifecycle.yaml | Deletes binding-lifecycle AutoscalingPolicy testdata. |
| test/e2e/controller-manager/testdata/autoscaling-policy-binding-e2e-policy-lifecycle.yaml | Deletes AutoscalingPolicyBinding testdata for policy lifecycle. |
| test/e2e/controller-manager/testdata/autoscaling-policy-binding-e2e-binding-lifecycle-b.yaml | Deletes AutoscalingPolicyBinding testdata (binding lifecycle B). |
| test/e2e/controller-manager/testdata/autoscaling-policy-binding-e2e-binding-lifecycle-a.yaml | Deletes AutoscalingPolicyBinding testdata (binding lifecycle A). |
| test/e2e/controller-manager/autoscaling_test.go | Removes binding lifecycle e2e test; updates policy lifecycle to single resource; updates webhook test helpers. |
| pkg/model-booster-controller/webhook/README.md | Removes AutoscalingPolicyBinding webhook endpoints from documentation. |
| pkg/model-booster-controller/webhook/autoscalingpolicy_validator.go | Adds target validation to webhook (exactly one target + targetRef validation). |
| pkg/model-booster-controller/webhook/autoscalingpolicy_validator_test.go | Updates tests to include required target fields on AutoscalingPolicy. |
| pkg/model-booster-controller/webhook/autoscaling_binding_validator.go | Deletes AutoscalingPolicyBinding validator implementation. |
| pkg/model-booster-controller/webhook/autoscaling_binding_validator_test.go | Deletes AutoscalingPolicyBinding validator tests. |
| pkg/model-booster-controller/convert/testdata/expected/scaling-asp.yaml | Updates expected conversion output to single AutoscalingPolicy with target + metricSources. |
| pkg/model-booster-controller/convert/testdata/expected/scaling-asp-binding.yaml | Deletes expected AutoscalingPolicyBinding conversion output. |
| pkg/model-booster-controller/convert/testdata/expected/optimize-asp.yaml | Deletes expected optimize AutoscalingPolicy output (previous binding-based flow). |
| pkg/model-booster-controller/convert/testdata/expected/optimize-asp-binding.yaml | Deletes expected optimize AutoscalingPolicyBinding output. |
| pkg/model-booster-controller/convert/autoscaling.go | Embeds scaling target directly into generated AutoscalingPolicy objects. |
| pkg/model-booster-controller/convert/autoscaling_test.go | Removes binding conversion unit test. |
| pkg/model-booster-controller/controller/model_booster_controller.go | Removes informer/lister wiring and delete handler for AutoscalingPolicyBinding. |
| pkg/model-booster-controller/controller/model_booster_controller_test.go | Updates reconcile expectations to only create AutoscalingPolicy (no binding). |
| pkg/model-booster-controller/controller/autoscaling_policy_handler.go | Removes creation/update of AutoscalingPolicyBinding; keeps AutoscalingPolicy only. |
| pkg/autoscaler/util/client.go | Removes role sub-target label selection behavior. |
| pkg/autoscaler/util/client_test.go | Removes tests for role sub-target label selection. |
| pkg/autoscaler/controller/autoscale_controller.go | Switches autoscaler reconcile loop from bindings to policies; skips disaggregatedTarget for now. |
| pkg/autoscaler/controller/autoscale_controller_test.go | Updates tests to pass targets via AutoscalingPolicy rather than binding. |
| pkg/autoscaler/autoscaler/scaler.go | Updates Autoscaler constructor/update logic to use policy-only configuration. |
| pkg/autoscaler/autoscaler/optimizer.go | Updates Optimizer constructor/update logic to use policy-only configuration. |
| pkg/autoscaler/autoscaler/optimizer_test.go | Updates optimizer meta tests to use AutoscalingPolicy instead of binding. |
| pkg/autoscaler/autoscaler/metric_collector.go | Adjusts collector scope/ownership from binding UID to policy UID. |
| pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go | Removes AutoscalingPolicyBinding deep-copies; adds deep-copies for new target/status types. |
| pkg/apis/workload/v1alpha1/register.go | Removes AutoscalingPolicyBinding types from scheme registration. |
| pkg/apis/workload/v1alpha1/autoscalingpolicybinding_types.go | Deletes AutoscalingPolicyBinding API types. |
| pkg/apis/workload/v1alpha1/autoscalingpolicy_types.go | Adds target fields (homogeneous/heterogeneous/disaggregated) + richer status types to AutoscalingPolicy. |
| examples/prometheus-autoscaler/serviceMonitor.yaml | Adjusts Prometheus scrape configuration notes/labels after binding removal. |
| examples/prometheus-autoscaler/autoscalingbinding.yaml | Deletes AutoscalingPolicyBinding example manifest. |
| examples/prometheus-autoscaler/autoscalerPolicy.yaml | Adds target configuration directly into AutoscalingPolicy example. |
| docs/kthena/sidebars.ts | Removes CLI doc entry for autoscaling-policy-bindings. |
| docs/kthena/docs/user-guide/autoscaler.md | Updates user guide from 2-resource model to single AutoscalingPolicy target config. |
| docs/kthena/docs/reference/kthena-cli/kthena_get.md | Removes CLI reference to kthena get autoscaling-policy-bindings. |
| docs/kthena/docs/reference/kthena-cli/kthena_get_autoscaling-policy-bindings.md | Deletes CLI reference page for bindings. |
| docs/kthena/docs/reference/crd/workload.serving.volcano.sh.md | Removes bindings from CRD reference; adds disaggregated target/status documentation. |
| docs/kthena/docs/architecture/model-booster-controller.md | Updates architecture doc to remove AutoscalingPolicyBinding from controller outputs. |
| cmd/kthena-controller-manager/main.go | Removes AutoscalingPolicyBinding webhook wiring and client initialization used by it. |
| client-go/listers/workload/v1alpha1/expansion_generated.go | Removes lister expansion interfaces for bindings. |
| client-go/listers/workload/v1alpha1/autoscalingpolicybinding.go | Deletes generated lister for bindings. |
| client-go/informers/externalversions/workload/v1alpha1/interface.go | Removes binding informer from workload informer interface. |
| client-go/informers/externalversions/workload/v1alpha1/autoscalingpolicybinding.go | Deletes generated informer for bindings. |
| client-go/informers/externalversions/generic.go | Removes generic informer mapping for bindings. |
| client-go/clientset/versioned/typed/workload/v1alpha1/workload_client.go | Removes typed client getter for bindings. |
| client-go/clientset/versioned/typed/workload/v1alpha1/generated_expansion.go | Removes generated expansion for binding client. |
| client-go/clientset/versioned/typed/workload/v1alpha1/fake/fake_workload_client.go | Removes fake typed client getter for bindings. |
| client-go/clientset/versioned/typed/workload/v1alpha1/fake/fake_autoscalingpolicybinding.go | Deletes fake typed client implementation for bindings. |
| client-go/clientset/versioned/typed/workload/v1alpha1/autoscalingpolicybinding.go | Deletes typed client implementation for bindings. |
| client-go/applyconfiguration/workload/v1alpha1/targetscalingstatus.go | Adds applyconfiguration for new TargetScalingStatus. |
| client-go/applyconfiguration/workload/v1alpha1/target.go | Removes SubTarget from applyconfiguration Target type. |
| client-go/applyconfiguration/workload/v1alpha1/subtarget.go | Deletes applyconfiguration SubTarget type. |
| client-go/applyconfiguration/workload/v1alpha1/rolescalingparam.go | Adds applyconfiguration for new RoleScalingParam. |
| client-go/applyconfiguration/workload/v1alpha1/roleratiostatus.go | Adds applyconfiguration for new RoleRatioStatus. |
| client-go/applyconfiguration/workload/v1alpha1/roleratioconstraint.go | Adds applyconfiguration for new RoleRatioConstraint. |
| client-go/applyconfiguration/workload/v1alpha1/disaggregatedtarget.go | Adds applyconfiguration for new DisaggregatedTarget. |
| client-go/applyconfiguration/workload/v1alpha1/disaggregatedscalingstatus.go | Adds applyconfiguration for new DisaggregatedScalingStatus. |
| client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicystatus.go | Adds applyconfiguration for new AutoscalingPolicyStatus. |
| client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicyspec.go | Extends applyconfiguration spec with target fields. |
| client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicybindingstatus.go | Deletes applyconfiguration for binding status. |
| client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicybindingspec.go | Deletes applyconfiguration for binding spec. |
| client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicybinding.go | Deletes applyconfiguration for binding CR. |
| client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicy.go | Updates AutoscalingPolicy applyconfiguration to use new status apply type. |
| client-go/applyconfiguration/utils.go | Removes binding kinds and adds new target/status kinds to kind->apply mapping. |
| cli/kthena/cmd/get.go | Removes kthena get autoscaling-policy-bindings command. |
| cli/kthena/cmd/create.go | Removes CLI create support for AutoscalingPolicyBinding objects. |
| charts/kthena/charts/workload/templates/kthena-controller-manager/rbac/cluster-role.yaml | Removes RBAC permissions for autoscalingpolicybindings resources. |
| charts/kthena/charts/workload/templates/kthena-controller-manager/component/validating-webhook.yaml | Removes validating webhook registration for AutoscalingPolicyBinding. |
| charts/kthena/charts/workload/crds/workload.serving.volcano.sh_autoscalingpolicybindings.yaml | Deletes CRD manifest for AutoscalingPolicyBinding from chart. |
Files not reviewed (16)
- client-go/applyconfiguration/utils.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicy.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicyspec.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicystatus.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/disaggregatedscalingstatus.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/disaggregatedtarget.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/roleratioconstraint.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/roleratiostatus.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/rolescalingparam.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/target.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/targetscalingstatus.go: Generated file
- client-go/clientset/versioned/typed/workload/v1alpha1/fake/fake_workload_client.go: Generated file
- client-go/clientset/versioned/typed/workload/v1alpha1/generated_expansion.go: Generated file
- client-go/clientset/versioned/typed/workload/v1alpha1/workload_client.go: Generated file
- client-go/informers/externalversions/generic.go: Generated file
- client-go/informers/externalversions/workload/v1alpha1/interface.go: Generated file
Comments suppressed due to low confidence (1)
pkg/apis/workload/v1alpha1/zz_generated.deepcopy.go:1101
- Target.DeepCopyInto has the same map iteration issue:
val.DeepCopy()is called on a map iteration value (non-addressable) for map[string]MetricSource. MetricSource.DeepCopy has a pointer receiver, so this will not compile. Use a local variable before calling DeepCopy (or deep-copy the fields manually).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1bbd59f to
10e2fdd
Compare
hzxuzhonghu
left a comment
There was a problem hiding this comment.
Thanks for taking this on — collapsing the two CRDs into one really does make the API easier to reason about, and the doc comments + examples on the new types are excellent. I left a few inline notes. Most are small; the two I'd like to settle before merge are (1) the silent skip of DisaggregatedTarget and (2) confirming the removal of sub-target/role scaling is intended. Nothing blocking on the structure itself.
| _, err = ac.client.WorkloadV1alpha1().ModelServings(namespaceScope).Patch( | ||
| ctx, targetRef.Name, types.JSONPatchType, patchBytes, metav1.PatchOptions{}) | ||
| return err | ||
| if instance.Spec.Replicas != nil && *instance.Spec.Replicas == replicas { |
There was a problem hiding this comment.
Want to confirm this is intentional: the old SubTarget / per-role JSON-patch path is gone, so the autoscaler now only scales the whole ModelServing. I'm assuming role-level scaling is being handed over to DisaggregatedTarget (not wired up yet). If anyone is already relying on sub-target scaling with the current API, this would quietly stop scaling their roles — if that's possible, it probably deserves a release note.
There was a problem hiding this comment.
A relevant description has already been added to the PR Description
| scalerSet.Insert(formatAutoscalerMapKey(policy.Namespace, policy.Name, &policy.Spec.HomogeneousTarget.Target.TargetRef)) | ||
| } else if policy.Spec.HeterogeneousTarget != nil { | ||
| optimizerSet.Insert(formatAutoscalerMapKey(policy.Namespace, policy.Name, nil)) | ||
| } else if policy.Spec.DisaggregatedTarget != nil { |
There was a problem hiding this comment.
Here (and again in schedule) a DisaggregatedTarget policy is accepted but only logged at V(4) and skipped — and the webhook admits it too. So a user can apply a 'valid' disaggregated policy and get nothing back, with no visible signal. Could we either reject it in the webhook for now, or surface a status condition / event like DisaggregatedTarget not yet supported? A silent no-op is hard to debug from the user side.
There was a problem hiding this comment.
I have adjusted the log level. Users will be notified.
| allErrs = append(allErrs, v.validateTarget(policy)...) | ||
|
|
||
| // Require metrics for homogeneous/heterogeneous targets. DisaggregatedTarget may use per-role metrics. | ||
| if policy.Spec.HomogeneousTarget != nil || policy.Spec.HeterogeneousTarget != nil { |
There was a problem hiding this comment.
nit: the type docs promise that spec.metrics and per-role metrics are mutually exclusive, and that a disaggregated policy must set one or the other. The webhook doesn't check either yet (a disaggregated policy with no metrics anywhere passes). Fine to defer until disaggregated is implemented — maybe just leave a // TODO here so it doesn't get lost.
There was a problem hiding this comment.
Add a TODO comment.
10e2fdd to
34609c0
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 60 out of 76 changed files in this pull request and generated 5 comments.
Files not reviewed (16)
- client-go/applyconfiguration/utils.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicy.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicyspec.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/autoscalingpolicystatus.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/disaggregatedscalingstatus.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/disaggregatedtarget.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/roleratioconstraint.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/roleratiostatus.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/rolescalingparam.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/target.go: Generated file
- client-go/applyconfiguration/workload/v1alpha1/targetscalingstatus.go: Generated file
- client-go/clientset/versioned/typed/workload/v1alpha1/fake/fake_workload_client.go: Generated file
- client-go/clientset/versioned/typed/workload/v1alpha1/generated_expansion.go: Generated file
- client-go/clientset/versioned/typed/workload/v1alpha1/workload_client.go: Generated file
- client-go/informers/externalversions/generic.go: Generated file
- client-go/informers/externalversions/workload/v1alpha1/interface.go: Generated file
Signed-off-by: LiZhenCheng9527 <lizhencheng6@huawei.com>
34609c0 to
d0b2550
Compare
| // exclusive: set spec.metrics to scale every role on the same signals, or | ||
| // set metrics on every role and leave spec.metrics empty. | ||
| // +optional | ||
| Metrics []AutoscalingPolicyMetric `json:"metrics,omitempty"` |
There was a problem hiding this comment.
Because AutoscalingPolicySpec is embedded in ModelBooster, making metrics optional here also lets users create a ModelBooster with spec.autoscalingPolicy but no metrics. The controller then generates a homogeneous AutoscalingPolicy, and the AutoscalingPolicy webhook rejects it because homogeneous targets still require at least one metric. Could we add ModelBooster validation for this case, or keep the embedded schema from accepting an empty metrics list for ModelBooster?
|
Please fill out the Does this PR introduce a user-facing change?, which should be also highlighted in release note ` |
| num_requests_waiting: | ||
| type: Pod | ||
| pod: | ||
| name: num_requests_waiting |
There was a problem hiding this comment.
THis is not working, please note this is the userguide, not a PLACEHOLDER
| type TargetScalingStatus struct { | ||
| // Name identifies the unit. For HomogeneousTarget it is the ModelServing | ||
| // name; for a role it is the role name. | ||
| Name string `json:"name"` |
There was a problem hiding this comment.
for HomogeneousTarget, the name can be empty
|
|
||
| // RatioStatus reports the observed value of the configured ratio constraint. | ||
| // +optional | ||
| RatioStatus *RoleRatioStatus `json:"ratioStatus,omitempty"` |
There was a problem hiding this comment.
If there are three roles, the single ratio cannot express them
| type MetricSource struct { | ||
| // Type selects the metric source backend. | ||
| // +kubebuilder:default="Pod" | ||
| Type MetricSourceType `json:"type"` |
There was a problem hiding this comment.
As both Pod and Prometheus are of pointer type, so this Type seems redudant
| // | ||
| // NOTE: Not yet implemented; reserved for future use. The runtime currently | ||
| // uses default TLS verification regardless of these fields. | ||
| type PrometheusTLSConfig struct { |
There was a problem hiding this comment.
Please remove if this is not added into PrometheusAuth
What type of PR is this?
/kind feature
What this PR does / why we need it:
#1172 describes that autoscalingPolicybinding needs to be merged into autoscalingPolicy to support the autoScaler with PD separation.
This PR merges two CRDs and modifies the code to ensure the functionality of the original features.
NOTE: the old SubTarget / per-role JSON-patch path is gone, so the autoscaler now only scales the whole ServingGroup. We will implement the scaling of roles in DisaggregatedTarget in a subsequent PR.
Which issue(s) this PR fixes:
A part of #905
Follow-up of #1172
Special notes for your reviewer:
Does this PR introduce a user-facing change?: