ADR: CRD Versioning #450


Open
fhennig opened this issue Sep 13, 2023 · 2 comments

Comments

@fhennig
Contributor

fhennig commented Sep 13, 2023

Below are some preliminary thoughts on the topic of CRD versioning, mostly taken from the on-site meeting on this topic. For an ADR, my suggestion is to gather requirements for how it should work from this Kubernetes docs page: Versions in CustomResourceDefinitions - Kubernetes Documentation. It looks like a comprehensive guide to how the versioning works.

Conversion webhooks & up- and downgrading

mutating webhooks are the core mechanism required for versioning CRDs. If an old resource is applied by the user, the webhook will convert it into the current version. Likewise, if the user requests an older version (arbitrary versions can be requested), the webhook is also used for conversion. (Note from felix: Does that mean that two-way conversion is absolutely a thing that needs to be implemented?)

When you read an object, you specify the version as part of the path. You can request an object at any version that is currently served. If you specify a version that is different from the object's stored version, Kubernetes returns the object to you at the version you requested, but the stored object is not changed on disk.

We cannot remove (mandatory) fields, because their content will be required when downgrading. This means we have to rename the fields (e.g. deprecated_oldField).

Only support upgrades for now (no downgrading)
Do not skip releases - always upgrade only one version up (v1 -> v2 -> v3 not v1 -> v3)

CRD size

CRD size is a problem: etcd and the kube API both have limits on how large objects are allowed to be.

copy & paste - To have multiple versions in our CRD, we need to keep the old Rust structs around. This means copy-pasting the Rust struct for each version. Not ideal ...

stabilize CRDs first to reduce conversion efforts? Would be nice because it saves a lot of work. But CRD versioning is important now.

ADR thoughts

  • What are the decision drivers?
  • What could a minimally viable implementation look like?
@nightkr
Member

nightkr commented Sep 14, 2023

mutating webhooks are the core mechanism required for versioning CRDs.

Conversion webhooks are technically a distinct thing, but yes the idea still applies.

If an old resource is applied by the user, the webhook will convert it into the current version.

It's also used if the user requests a different version than what is currently stored in etcd.

We cannot remove (mandatory) fields, because their content will be required when downgrading. This means we have to rename the fields (i.e. deprecated_oldField).

Not just mandatory fields, all fields (that are still used) must survive arbitrary roundtripping (for example: object is currently v2, user GETs v1 (conversion v2->v1), changes something, then REPLACEs it (conversion v1->v2), no data must be lost here).

Depending on the use case, we could also inject bogus placeholders when downgrading.

CRD size - it's a problem ... why exactly?

etcd and the kube API both have limits on how large objects are allowed to be.

Only support upgrades for now (no downgrading)
Do not skip releases - always upgrade only one version up (v1 -> v2 -> v3 not v1 -> v3)

AFAIK this decision was made about the operators themselves, not the API versioning?

@fhennig
Contributor Author

fhennig commented Sep 14, 2023

Thanks for the comments, I updated the ticket.
