Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support bluegreen release: rollout controller #232

Merged
merged 1 commit into from
Nov 13, 2024

Conversation

myname4423
Copy link
Contributor

@myname4423 myname4423 commented Sep 12, 2024

Ⅰ. Describe what this PR does

the first patch is about rollout controller

  1. add a bluegreen release manager for bluegreen
  2. move some common functions of both bluegreen and canary to releaseManager.go

Ⅱ. Does this pull request fix one issue?

Ⅲ. Special notes for reviews

@myname4423 myname4423 changed the title support bluegreen release: webhook update support bluegreen release: rollout controller Sep 13, 2024
tr.RecheckDuration = time.Duration(trafficrouting.GetGraceSeconds(c.Rollout.Spec.Strategy.GetTrafficRouting(), defaultGracePeriodSeconds)) * time.Second
}
expectedTime := time.Now().Add(tr.RecheckDuration)
expectedTime := time.Now().Add(time.Duration(defaultGracePeriodSeconds) * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change the code here ? if one set TrafficRoutingRef.GracePeriodSeconds, we should respect the user setting

Copy link
Contributor Author

@myname4423 myname4423 Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How the tr.RecheckDuration works:
if we do some traffic-related operations (eg. some functions in manager.go), the operations will update the tr.RecheckDuration automatically, if operation A need to recheck 300ms later, operations B need to recheck 500ms later, then tr.RecheckDuration will be the max of 300ms and 500ms. However, the initial value of tr.RecheckDuration is 0, if we doesn't do any traffic-related operations, the tr.RecheckDuration will keep the initial value. When RecheckTime is 0, requeue won't happen. For function DoTrafficRouting, it won't always update tr.RecheckDuration, thus the tr.RecheckDuration may be 0. That's why i added a check (the removed lines) before. However, I realize that

  1. the check won't save significant time due to the complex logic in the DoTrafficRouting function
  2. this check is a little hard to understand for others, that's why i removed these lines and use the original defaultGracePeriodSeconds again (ie. the removed are lines added by me in a recent commit, and the line to add is the original logic)

@@ -506,7 +505,10 @@ func (r *RolloutReconciler) recalculateCanaryStep(c *RolloutContext) (int32, err
if c.NewStatus != nil {
currentIndex = c.NewStatus.GetSubStatus().CurrentStepIndex - 1
}
steps := append([]int{}, int(currentIndex))
steps := make([]int, 0)
if ci := int(currentIndex); ci >= 0 && ci < len(c.Rollout.Spec.Strategy.GetSteps()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz comment when ci will be less than zero ?

Copy link
Contributor Author

@myname4423 myname4423 Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it's unlikely to be less than 0, just to make sure the slice doesn't go out of index

pkg/util/controller_finder.go Outdated Show resolved Hide resolved
pkg/util/controller_finder.go Outdated Show resolved Hide resolved
pkg/controller/rollout/rollout_bluegreen.go Outdated Show resolved Hide resolved
pkg/controller/rollout/rollout_bluegreen.go Outdated Show resolved Hide resolved
pkg/controller/rollout/rollout_bluegreen.go Outdated Show resolved Hide resolved
pkg/controller/rollout/rollout_bluegreen.go Outdated Show resolved Hide resolved
case v1beta1.FinalisingStepTypeRemoveCanaryService:
retry, err = m.trafficRoutingManager.RemoveCanaryService(tr)
// route all traffic to new version
case v1beta1.FinalisingStepTypeRouteAllTraffic:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider rename these finalizing steps so that the name can be more differentiating. now it is very hard to tell the difference between FinalisingStepTypeRouteAllTraffic and FinalisingStepTypeGateway, and it is hard to guess the meaning of step FinalisingStepTypeBatchRelease and FinalisingStepTypeGateway, consider rename them to
FinalisingStepTypeRestoreWorkload and FinalisingStepTypeRestoreGateway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok!

// if current step is empty, set it with the first step
// if current step is end, we just return
if len(blueGreenStatus.FinalisingStep) == 0 {
blueGreenStatus.FinalisingStep = nextStep
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to move finalizing status to rollotu.status.conditions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the finished finalizing steps in saved in rollout.status.conditions, it is still possible to determine the current step in nextBlueGreenTask by iterating over taskSequence and find the last recorded step

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i'll implement this logic in the coming patch

// Route all traffic to new version (for bluegreen)
FinalisingStepRouteTrafficToNew FinalisingStepType = "RouteAllTraffic"
// Restore the GatewayAPI/Ingress/Istio
FinalisingStepRouteTrafficToStable FinalisingStepType = "RestoreGateway"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after change the variable name, can we also update the variable value ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return true, br, nil
}
// update batchRelease to the latest version
if err = retry.RetryOnConflict(retry.DefaultBackoff, func() error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz avoid retry directly in the reconcile loop, one can always retry by workerqueue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

api/v1beta1/rollout_types.go Outdated Show resolved Hide resolved
pkg/controller/rollout/rollout_releaseManager.go Outdated Show resolved Hide resolved
Copy link
Member

@furykerry furykerry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@furykerry furykerry merged commit 3f66aae into openkruise:master Nov 13, 2024
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants