Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CC-30640] allow manual version control for standard clusters #252

Conversation

fantapop
Copy link
Collaborator

@fantapop fantapop commented Dec 3, 2024

STANDARD clusters now accept the cockroach_version attribute on cluster update for manual upgrades control. The serverless.upgrade_type attribute must be set to MANUAL.

As part of this an API call to validate cockraoch_version for ADVANCED clusters prior to the update was removed. This logic provided some nicer error messages for advanced clusters but the logic was complex, it required an additional api call to fetch the valid versions and standard tier version checking is not supported. We now rely on the serverside validation for checking versions.

Commit checklist

  • Changelog
  • Doc gen (make generate)
  • Integration test(s)
  • Acceptance test(s)
  • Example(s)

@@ -975,6 +982,17 @@ func TestIntegrationServerlessClusterResource(t *testing.T) {
},
finalCluster: singleRegionClusterWithUnlimited("BASIC"),
},
{
name: "upgrade cockroach_version for STANDARD cluster",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only real automated test for a successful standard tier upgrade. Our api doesn't allow setting up these tests since we need to be able to build a host cluster and then upgrade it. For that reason, we're currently limited to integration tests here. Given the complication of upgrades it would be great to figure out a way to build an acceptance test for this. @andy-kimball Do you have any ideas for that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alyshanjahani-crl, do we support creating a Serverless cluster at the previous version? If so, we could do that, and then upgrade it to the next version, all without touching the host cluster version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're currently limited creating serverless clusters at the latest version. This is why cockroach_version is not accepted for serverless clusters during the Create phase. But some kind of ability in the future would be welcome. We could hide it behind a feature flag if need be.

@@ -439,6 +438,23 @@ func (r *clusterResource) Create(
}
}

if IsKnown(plan.CockroachVersion) {
var planType client.PlanType
if clusterSpec.Plan != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This plan type derivation logic seems like something that should be a utility function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've moved it out to its own function and added a test for it.

break
if state.ServerlessConfig != nil {
stateUpgradeTypeString := state.ServerlessConfig.UpgradeType.ValueString()
planUpgradeTypeString := plan.ServerlessConfig.UpgradeType.ValueString()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it guaranteed that plan.ServerlessConfig will be non-nil here?

Copy link
Collaborator Author

@fantapop fantapop Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know this so I wrote a test to try it out. Apparently we're protected against this here. ModifyPlan is run before the update should it should never be the case that state.ServerlessConfig is set but plan.ServerlessConfig is not. I went ahead and left the test in.

if stateUpgradeTypeString != planUpgradeTypeString {
resp.Diagnostics.AddError("Error updating cluster", "upgrade_type and version must not be changed during the same plan.")
return
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: I'd just get rid of the else since you're already returning from the if.

@@ -975,6 +982,17 @@ func TestIntegrationServerlessClusterResource(t *testing.T) {
},
finalCluster: singleRegionClusterWithUnlimited("BASIC"),
},
{
name: "upgrade cockroach_version for STANDARD cluster",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alyshanjahani-crl, do we support creating a Serverless cluster at the previous version? If so, we could do that, and then upgrade it to the next version, all without touching the host cluster version.

internal/provider/cluster_resource_test.go Outdated Show resolved Hide resolved
Copy link

@kathancox kathancox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! One tiny nit, and a small comment/question.

internal/provider/cluster_resource.go Show resolved Hide resolved
} else {
stateUpgradeTypePtr, err := client.NewUpgradeTypeTypeFromValue(stateUpgradeTypeString)
if err != nil {
resp.Diagnostics.AddError("Invalid value for upgrade_type found in state", err.Error())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
resp.Diagnostics.AddError("Invalid value for upgrade_type found in state", err.Error())
resp.Diagnostics.AddError("Invalid value for upgrade_type found in state.", err.Error())

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking closer at this, I noticed this is the "error summary" field. We generally don't add periods for the summary. But it does seem like this summary is probably too specific. Here is what would be returned by err.Error():

"invalid value 'BAD_VALUE' for UpgradeTypeType: valid values are ['AUTOMATIC', 'MANUAL']

So the error would should up as:

"Invalid value for upgrade_type found in state"
"invalid value 'BAD_VALUE' for UpgradeTypeType: valid values are ['AUTOMATIC', 'MANUAL']

This is repetitive. I'll update the summary to be "Error updating cluster" like the rest of the errors in this file. It would be nice to eventually do an audit of the full provider and make a strategy for what to use for the summaries.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thanks!


if client.PlanType(state.Plan.ValueString()) == client.PLANTYPE_BASIC {
if client.PlanType(plan.Plan.ValueString()) != client.PLANTYPE_BASIC {
resp.Diagnostics.AddError("Error updating cluster", "plan and version must not be changed during the same plan.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two of the error messages say "... during the same plan." Does this mean ... must not be changed when plan` remains the same."? This was the only thing that stood out to me, I may not have the context here, and "during the same plan" is fine!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan I'm referring to is basically a batch of updates. Not to be confused with the cluster plan (i.e. basic, standard, advanced).

With terraform you make changes to your configuration, then you can run a "plan" in which terraform will spit out what the changes will be. Or you can run an "apply", which includes a "plan" but then asks if the plan looks good before proceeding with the actually update.

If this is a point of confusion I'm happy to change it. We could say something like "must not be changed at the same time" or "... at once" or, "please apply changes to the upgrade_type before upgrading", or many other things...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot about the plan command and the concept generally within Terraform. I think must not be changed at the same time, or maybe the original is fine. I kind of like plan and version must not be changed during the same terraform plan. But maybe that is too heavy handed. I'll leave it to you.

@fantapop fantapop force-pushed the CC-30640-allow-control-of-cluster-version-for-standard-clusters-in-terraform branch 2 times, most recently from 0b15bb6 to 4914afc Compare December 5, 2024 01:34
@fantapop fantapop requested a review from andy-kimball December 5, 2024 01:42
STANDARD clusters now accept the cockroach_version attribute on cluster
update for manual upgrades control.  The serverless.upgrade_type
attribute must be set to MANUAL.

As part of this an API call to validate cockraoch_version for ADVANCED
clusters prior to the update was removed.  This logic provided some
nicer error messages for advanced clusters but the logic was complex, it
required an additional api call to fetch the valid versions and standard
tier version checking is not supported.  We now rely on the serverside
validation for checking versions.
To clean up some of the testing, we add a method for building the
serverless test configuration which accepts a struct for providing the
values.
@fantapop fantapop force-pushed the CC-30640-allow-control-of-cluster-version-for-standard-clusters-in-terraform branch from d54a3d0 to 55d2b94 Compare December 5, 2024 16:53
Copy link
Contributor

@rgcase rgcase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

internal/provider/cluster_resource.go Show resolved Hide resolved
@fantapop fantapop merged commit ba8bd3c into main Dec 5, 2024
4 checks passed
@fantapop fantapop deleted the CC-30640-allow-control-of-cluster-version-for-standard-clusters-in-terraform branch December 5, 2024 23:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants