-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support bluegreen release: rollout controller #232
Conversation
bb51872
to
dcc116c
Compare
tr.RecheckDuration = time.Duration(trafficrouting.GetGraceSeconds(c.Rollout.Spec.Strategy.GetTrafficRouting(), defaultGracePeriodSeconds)) * time.Second | ||
} | ||
expectedTime := time.Now().Add(tr.RecheckDuration) | ||
expectedTime := time.Now().Add(time.Duration(defaultGracePeriodSeconds) * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change the code here ? if one set TrafficRoutingRef.GracePeriodSeconds, we should respect the user setting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How the tr.RecheckDuration works:
if we do some traffic-related operations (eg. some functions in manager.go), the operations will update the tr.RecheckDuration automatically, if operation A need to recheck 300ms later, operations B need to recheck 500ms later, then tr.RecheckDuration will be the max of 300ms and 500ms. However, the initial value of tr.RecheckDuration is 0, if we doesn't do any traffic-related operations, the tr.RecheckDuration will keep the initial value. When RecheckTime is 0, requeue won't happen. For function DoTrafficRouting, it won't always update tr.RecheckDuration, thus the tr.RecheckDuration may be 0. That's why i added a check (the removed lines) before. However, I realize that
- the check won't save significant time due to the complex logic in the DoTrafficRouting function
- this check is a little hard to understand for others, that's why i removed these lines and use the original defaultGracePeriodSeconds again (ie. the removed are lines added by me in a recent commit, and the line to add is the original logic)
@@ -506,7 +505,10 @@ func (r *RolloutReconciler) recalculateCanaryStep(c *RolloutContext) (int32, err | |||
if c.NewStatus != nil { | |||
currentIndex = c.NewStatus.GetSubStatus().CurrentStepIndex - 1 | |||
} | |||
steps := append([]int{}, int(currentIndex)) | |||
steps := make([]int, 0) | |||
if ci := int(currentIndex); ci >= 0 && ci < len(c.Rollout.Spec.Strategy.GetSteps()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz comment when ci will be less than zero ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it's unlikely to be less than 0, just to make sure the slice doesn't go out of index
case v1beta1.FinalisingStepTypeRemoveCanaryService: | ||
retry, err = m.trafficRoutingManager.RemoveCanaryService(tr) | ||
// route all traffic to new version | ||
case v1beta1.FinalisingStepTypeRouteAllTraffic: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider rename these finalizing steps so that the name can be more differentiating. now it is very hard to tell the difference between FinalisingStepTypeRouteAllTraffic and FinalisingStepTypeGateway, and it is hard to guess the meaning of step FinalisingStepTypeBatchRelease and FinalisingStepTypeGateway, consider rename them to
FinalisingStepTypeRestoreWorkload and FinalisingStepTypeRestoreGateway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok!
// if current step is empty, set it with the first step | ||
// if current step is end, we just return | ||
if len(blueGreenStatus.FinalisingStep) == 0 { | ||
blueGreenStatus.FinalisingStep = nextStep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to move finalizing status to rollotu.status.conditions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the finished finalizing steps in saved in rollout.status.conditions, it is still possible to determine the current step in nextBlueGreenTask by iterating over taskSequence and find the last recorded step
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, i'll implement this logic in the coming patch
api/v1beta1/rollout_types.go
Outdated
// Route all traffic to new version (for bluegreen) | ||
FinalisingStepRouteTrafficToNew FinalisingStepType = "RouteAllTraffic" | ||
// Restore the GatewayAPI/Ingress/Istio | ||
FinalisingStepRouteTrafficToStable FinalisingStepType = "RestoreGateway" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after change the variable name, can we also update the variable value ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
return true, br, nil | ||
} | ||
// update batchRelease to the latest version | ||
if err = retry.RetryOnConflict(retry.DefaultBackoff, func() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz avoid retry directly in the reconcile loop, one can always retry by workerqueue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
Signed-off-by: yunbo <[email protected]>
6425b77
to
22cc791
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
Ⅰ. Describe what this PR does
the first patch is about rollout controller
Ⅱ. Does this pull request fix one issue?
Ⅲ. Special notes for reviews