v1-to-v3 CRD migration controller to enable API server removal#12012

Merged
coutinhop merged 101 commits into projectcalico:master from
caseydavenport:caseydavenport/datastore-migration-controller-prototype
Mar 19, 2026

Conversation

@caseydavenport caseydavenport commented Mar 6, 2026

Controller that migrates Calico resources from crd.projectcalico.org/v1 CRDs to projectcalico.org/v3 CRDs, enabling clusters running the aggregated API server to switch to native v3 CRDs without downtime.

Description

The migration controller lives in kube-controllers and is driven by a DatastoreMigration custom resource (migration.projectcalico.org/v1beta1). It implements a state machine:

Pending → Migrating → (WaitingForConflictResolution →) Converged → Complete (or Failed)

During migration it:

  • Validates prerequisites and waits for migration RBAC (operator creates this dynamically)
  • Saves and deletes the aggregated APIService so projectcalico.org/v3 routes to CRDs
  • Locks the v1 datastore (DatastoreReady=false) so components hold cached state
  • Copies all 21 OSS resource types from v1 → v3 using a typed registry pattern
  • Detects conflicts (v3 resources with different specs) and pauses for manual resolution
  • Remaps OwnerReference UIDs for cross-resource references
  • Unlocks v3 datastore at Converged; v1 stays locked to prevent IPAM leaks during rollout
  • Waits for calico-node to roll out with CALICO_API_GROUP=projectcalico.org/v3 before Complete
  • Shows progress via status message column (kubectl get datastoremigration)
  • On CR deletion post-completion: cleans up all v1 CRDs via finalizer
  • On CR deletion pre-completion (abort): restores APIService from saved annotation
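The phase sequence above can be sketched as a small transition table. This is a hypothetical illustration of the state machine described in this PR (including the pre-validation detour through WaitingForConflictResolution and resolution returning to Pending for re-validation); the controller's actual type and constant names may differ.

```go
package main

import "fmt"

// Phase mirrors the DatastoreMigration status phases named above.
type Phase string

const (
	PhasePending   Phase = "Pending"
	PhaseMigrating Phase = "Migrating"
	PhaseWaiting   Phase = "WaitingForConflictResolution"
	PhaseConverged Phase = "Converged"
	PhaseComplete  Phase = "Complete"
	PhaseFailed    Phase = "Failed"
)

// validTransitions encodes the happy path plus the conflict detour.
var validTransitions = map[Phase][]Phase{
	PhasePending:   {PhaseMigrating, PhaseWaiting, PhaseFailed},
	PhaseMigrating: {PhaseWaiting, PhaseConverged, PhaseFailed},
	PhaseWaiting:   {PhasePending, PhaseFailed}, // resolution re-validates via Pending
	PhaseConverged: {PhaseComplete, PhaseFailed},
}

func canTransition(from, to Phase) bool {
	for _, p := range validTransitions[from] {
		if p == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(PhaseMigrating, PhaseConverged)) // true
	fmt.Println(canTransition(PhaseComplete, PhasePending))    // false: Complete is terminal
}
```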

Also includes:

  • Bounded worker pool for concurrent API writes
  • Retry with exponential backoff for transient API errors
  • Prometheus metrics (resource counts, type durations, retries, phase tracking)
  • Per-type progress reporting with CRD print columns
  • Operator helm chart: RBAC for DatastoreMigration CRs, regenerated manifests
  • dev-image build fixes (duplicate targets, REPO passthrough, PullAlways)
  • Comprehensive IPAM block/handle migration tests
  • End-to-end test script for full migration lifecycle

Companion PR:

New projectcalico.org/v3 CRD type for tracking v1-to-v3 CRD migration.
Includes spec (migration type), status (phase, progress, conditions),
and printer columns for kubectl output.

Migration controller in kube-controllers that drives the v1-to-v3 CRD
migration state machine. Watches for DatastoreMigration CRs and copies
resources from crd.projectcalico.org/v1 to projectcalico.org/v3 CRDs.

Uses an extensible registry pattern so enterprise types can be added via
separate init() functions. Reads v1 resources through the libcalico-go
backend client (gets metadata unpacking for free) and writes v3 resources
via the typed clientset.

Handles policy name migration (default. prefix removal), conflict
detection, and the ClusterInformation lock/unlock sequence.
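The registry-plus-init() pattern and the policy name migration described above can be sketched roughly as follows. This is an illustrative stand-in, not the controller's real API (a later commit in this PR also replaces the global registry with migrators passed via ControllerConfig):

```go
package main

import (
	"fmt"
	"strings"
)

// Migrator is a stand-in for the per-resource-type migrator interface.
type Migrator interface{ Kind() string }

var registry = map[string]Migrator{}

// Register is called from per-type init() functions, so enterprise
// builds can add types just by linking in extra source files.
func Register(m Migrator) { registry[m.Kind()] = m }

// migratedPolicyName strips the "default." tier prefix that the v1
// representation prepends to policies in the default tier.
func migratedPolicyName(v1Name string) string {
	return strings.TrimPrefix(v1Name, "default.")
}

type policyMigrator struct{}

func (policyMigrator) Kind() string { return "NetworkPolicy" }

func init() { Register(policyMigrator{}) }

func main() {
	fmt.Println(migratedPolicyName("default.allow-dns")) // allow-dns
	fmt.Println(len(registry))                           // 1
}
```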
@caseydavenport added the docs-not-required (Docs not required for this change) and release-note-not-required (Change has no user-facing impact) labels on Mar 6, 2026
@marvin-tigera marvin-tigera added this to the Calico v3.32.0 milestone Mar 6, 2026
Move kind cluster deploy scripts, image loading, and infra files from
node/tests/k8st/ into hack/test/kind/ so they can be reused from any
component without depending on the node directory.

Also speeds up image loading by replacing 14 individual docker save +
kind load image-archive calls with a single combined tar - one docker
save of all test-build images, one kind load. Build rules now use
lightweight stamp files instead of producing individual .tar artifacts.
Compare local Docker image IDs against what's already on the cluster
via crictl and only save/load images that have actually changed. On
incremental rebuilds where only one or two components changed, this
cuts image loading from ~54s to ~2s.

Also plumb KIND_NAME through to the load and deploy scripts so kind
commands target the correct cluster by name.
- Rename kind-k8st-setup to kind-setup since it's not k8st-specific
- Move image stamp rules and kind-test-images from node/Makefile into
  lib.Makefile so any component can build test images without depending
  on node/
- Fix circular .stamp.operator dependency (depended on itself via
  K8ST_IMAGE_STAMPS) using $(filter-out)
- Remove duplicate load-container-images target from node/Makefile
  (now only in lib.Makefile)
- Add .PHONY for kind-test-cluster in root Makefile
- Delete old node/tests/k8st/ scripts and infra/ that are now fully
  superseded by hack/test/kind/
The test-webserver deployment, services, and client pod were deployed
during cluster setup but not referenced by any test code. Remove them
along with the connectivity check that depended on them.
Add back explanatory comments that were accidentally dropped when
moving the deploy script. Also fix the kind-setup target comment to
accurately describe that it creates a cluster if one doesn't exist.
Rename targets to form a clear, consistent kind- namespace:

  kind-up            - build images + create cluster + deploy Calico
  kind-down          - tear down the cluster (alias for kind-cluster-destroy)
  kind-build-images  - build and tag all component images
  kind-deploy        - create cluster + deploy Calico (assumes images exist)
  kind-reload        - load changed images + restart pods (for iterating)
  kind-cluster-create  - create bare kind cluster + CRDs
  kind-cluster-destroy - tear down the cluster

Remove the kind-k8st-cleanup alias that just forwarded to
kind-cluster-destroy.
Shorten script names to deploy_resources.sh and load_images.sh. Make
VALUES_FILE overridable via env var so enterprise can point to its own
helm values file without modifying the script.
The chart rules in the root Makefile use relative paths (./bin/...) but
kind-deploy in lib.Makefile referenced them via $(REPO_ROOT)/bin/...
which Make treats as a different target string. When kind-deploy was
invoked from a subdirectory (e.g., node/Makefile), Make couldn't find
a rule to build the charts.

Fix by building charts as a recipe step via $(MAKE) -C $(REPO_ROOT) chart
instead of listing them as prerequisites.
Define KIND_IMAGES in lib.Makefile as the single source of truth for
which images to load onto the kind cluster. Pass the list to
load_images.sh as positional arguments instead of maintaining a
duplicate list inside the script.
The kind cluster was previously set up with v3 CRDs via node/Makefile's
override. Now that the kind targets live in lib.Makefile, update the
lib.Makefile default to match.
The kind cluster was previously set up with v3 CRDs via node/Makefile's
override. Now that the kind targets are invoked from the root Makefile,
set the default there too (before including lib.Makefile).
The refactor-kind-cluster branch moved kind-up to the root Makefile,
but node/Makefile's k8s-test target still called it locally.
Implements a two-pass migration approach: the first pass creates all v3
resources and builds a v1→v3 UID mapping, the second pass remaps
OwnerReference UIDs on migrated resources that point to other Calico
resources. Non-Calico OwnerReferences (e.g., to Namespaces) are left
unchanged.

Also moves the DatastoreMigration API type out of the public api/ module
into kube-controllers/pkg/controllers/migration/, since this is an
internal prototype type managed via the dynamic client. Fixes the wrapper
binary Makefile target to rebuild on source changes.
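The second-pass remapping described above can be sketched with a small helper. This is a hypothetical illustration using a simplified owner-reference struct; the real code operates on metav1.OwnerReference on migrated v3 objects.

```go
package main

import "fmt"

// OwnerRef is a simplified stand-in for metav1.OwnerReference.
type OwnerRef struct {
	APIGroup string // e.g. "projectcalico.org", or "" for core types like Namespace
	UID      string
}

func isCalicoAPIGroup(g string) bool {
	return g == "projectcalico.org" || g == "crd.projectcalico.org"
}

// remapOwnerRefs rewrites UIDs of Calico owners using the v1->v3 UID
// map built during the first (create) pass; non-Calico owners are
// left untouched.
func remapOwnerRefs(refs []OwnerRef, uidMap map[string]string) (changed bool) {
	for i, ref := range refs {
		if !isCalicoAPIGroup(ref.APIGroup) {
			continue
		}
		if v3UID, ok := uidMap[ref.UID]; ok {
			refs[i].UID = v3UID
			changed = true
		}
	}
	return changed
}

func main() {
	refs := []OwnerRef{
		{APIGroup: "projectcalico.org", UID: "v1-tier-uid"},
		{APIGroup: "", UID: "ns-uid"}, // Namespace owner: preserved
	}
	uidMap := map[string]string{"v1-tier-uid": "v3-tier-uid"}
	fmt.Println(remapOwnerRefs(refs, uidMap), refs[0].UID, refs[1].UID)
}
```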
Adds a kind-based migration test that:
- Deploys connectivity workloads (client/server HTTP probes) before
  migration and verifies zero packet loss throughout
- Seeds Calico resources with OwnerReferences to both Calico resources
  (Tier) and native K8s resources (Namespace)
- Verifies OwnerReference UIDs are correctly remapped for Calico owners
  and preserved for non-Calico owners after migration
- Checks all seeded resources are correctly migrated from v1 to v3

Run via `make kind-migration-test` or manually with run_test.sh on an
existing v1-mode kind cluster.
The GIT_DIR and GIT_WORK_TREE env vars set by lib.Makefile for worktree
support leak into the libbpf git commands, causing "working tree already
exists" errors. Unset them for the clone/fetch/checkout operations.
IPAMBlock and IPAMHandle use BlockListOptions and IPAMHandleListOptions
respectively in the libcalico-go backend, rather than the generic
ResourceListOptions that every other resource type uses. This means they
can't use listV1Resources directly.

Add listV1IPAMBlocks and listV1IPAMHandles helper functions that list via
the correct list options, then convert the returned model types
(model.AllocationBlock, model.IPAMHandle) back to v3 API types wrapped
in KVPairs with ResourceKey keys. This lets the standard
MigrateResourceType function handle them without modification.
Replace the sequential create loop in MigrateResourceType with a
two-phase approach: list and convert v1 resources sequentially, then fan
out create/check/conflict operations to a bounded worker pool of 10
goroutines. Results are collected via a channel and aggregated after all
workers finish.

This reduces the locked migration window (DatastoreReady=false) by
allowing multiple v3 resources to be created in parallel against the API
server, which is particularly important for high-count types like
IPAMBlock.
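The bounded fan-out described above can be sketched as a fixed pool of workers draining a work channel and reporting on a results channel. Names here are illustrative, not the controller's real API:

```go
package main

import (
	"fmt"
	"sync"
)

type result struct {
	name string
	err  error
}

// migrateAll lists work sequentially onto a channel, fans creates out
// to a bounded pool of workers, and aggregates results when all
// workers finish.
func migrateAll(names []string, workers int, createV3 func(string) error) []result {
	work := make(chan string)
	results := make(chan result)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range work {
				results <- result{name: n, err: createV3(n)}
			}
		}()
	}
	go func() {
		for _, n := range names {
			work <- n
		}
		close(work)
		wg.Wait()
		close(results)
	}()
	var out []result
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	out := migrateAll([]string{"a", "b", "c"}, 10, func(string) error { return nil })
	fmt.Println(len(out)) // 3
}
```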
Implement finalizer-based abort/rollback for the migration controller:
- Save APIService to annotation before deletion, restore on abort
- Add finalizer to DatastoreMigration CR for cleanup on deletion
- Post-completion deletion triggers v1 CRD cleanup
- Pre-completion deletion triggers abort (restore APIService, unlock datastore)

Update the test script with disruption and cleanup tests:
- Step 6: Force-kill kube-controllers mid-migration, verify recovery
- Step 10: Delete DatastoreMigration CR, verify v1 CRD cleanup
Report progress after each resource type completes, including:
- totalTypes/completedTypes/currentType for high-level tracking
- typeDetails array with per-kind migrated/skipped/conflicts counts
- CRD printcolumns: Phase, Types Done, Types Total, Current Type,
  Migrated, Skipped (priority=1), Conflicts, Age

Fix updateStatus to refresh the in-memory object from the server
response, avoiding stale resourceVersion on successive updates.
Add DeleteV3 to ResourceMigrator and implement it for all 21 OSS
resource types. The cleanupPartialV3Resources method now actually
deletes v3 resources created during migration when aborting, rather
than just logging. Failures are best-effort and don't block the abort.
Wrap GetV3 and CreateV3 calls in migrateOneResource with
wait.ExponentialBackoffWithContext to handle transient errors like
server timeouts, throttling (429), and service unavailable (503).
Retries up to 5 times with 200ms initial delay, 2x factor, 10s cap.

Fix data race in test mock's CreateV3 callback (concurrent slice append).
New metrics:
- calico_migration_resources_total{kind,outcome}: counter of resources
  processed, labeled migrated/skipped/conflict per resource kind
- calico_migration_resource_errors_total{kind}: counter of fatal errors
- calico_migration_retries_total{kind,operation}: counter of retried
  API calls (get/create) per resource kind
- calico_migration_phase{phase}: gauge (1/0) for current phase
- calico_migration_duration_seconds: histogram of total migration time
- calico_migration_type_duration_seconds{kind}: histogram of per-type
  migration time
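The calico_migration_phase gauge semantics (exactly one phase labeled 1 at a time) can be modeled without the Prometheus client like this; the real controller would use a prometheus.GaugeVec, and this map-based stand-in is purely illustrative:

```go
package main

import "fmt"

var phases = []string{
	"Pending", "Migrating", "WaitingForConflictResolution",
	"Converged", "Complete", "Failed",
}

// setPhaseMetric sets the current phase's gauge to 1 and every other
// phase to 0, so queries always see exactly one active phase.
func setPhaseMetric(gauge map[string]float64, current string) {
	for _, p := range phases {
		if p == current {
			gauge[p] = 1
		} else {
			gauge[p] = 0
		}
	}
}

func main() {
	gauge := map[string]float64{}
	setPhaseMetric(gauge, "Migrating")
	fmt.Println(gauge["Migrating"], gauge["Pending"]) // 1 0
}
```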
Add kubebuilder markers to the DatastoreMigration types so the CRD
YAML is generated by controller-gen rather than hand-maintained.

- Add +kubebuilder:object:root, +kubebuilder:resource, +kubebuilder:subresource,
  +kubebuilder:printcolumn, and +kubebuilder:validation markers to api.go
- Add DatastoreMigrationList type (required by controller-gen)
- Add groupversion_info.go with +groupName marker
- Add gen.go with go:generate directives for controller-gen + version fixup
- Regenerate the CRD YAML from the markers
Replace the inline pre-check loop in handlePending and the per-migrator
CheckConflicts calls in handleWaiting with a single DetectConflicts
function that checks all migrators for v1/v3 spec mismatches.

Conflicts detected during pre-validation now transition to
WaitingForConflictResolution immediately, before locking the datastore
or touching any resources. Resolution transitions back to Pending for
re-validation rather than directly to Migrating.
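The v1/v3 spec-mismatch check can be sketched as below. This is a simplified illustration: the real DetectConflicts walks every registered migrator over typed API objects, not this toy struct.

```go
package main

import (
	"fmt"
	"reflect"
)

// spec is an illustrative stand-in for a resource spec (pointer field
// so DeepEqual compares pointee values, as with real API types).
type spec struct {
	Order *float64
}

type conflict struct {
	Kind, Name string
}

// detectConflicts reports names that already exist in v3 with a spec
// differing from v1. Such resources must not be overwritten; the
// controller pauses for manual resolution instead.
func detectConflicts(kind string, v1, v3 map[string]spec) []conflict {
	var out []conflict
	for name, s1 := range v1 {
		if s3, ok := v3[name]; ok && !reflect.DeepEqual(s1, s3) {
			out = append(out, conflict{Kind: kind, Name: name})
		}
	}
	return out
}

func main() {
	ten, twenty := 10.0, 20.0
	v1 := map[string]spec{"default": {Order: &ten}}
	v3 := map[string]spec{"default": {Order: &twenty}}
	fmt.Println(len(detectConflicts("Tier", v1, v3))) // 1
}
```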
Add a public BackendAccessor interface to libcalico-go/lib/backend/api
so callers don't need to define local "accessor" interfaces to get the
backend client. Use it in main.go.

Block deletion of the DatastoreMigration CR when in Converged phase —
the operator may have already started rolling out pods with v3 mode, so
aborting is unsafe. The finalizer holds the CR in terminating state
until migration completes naturally.

Add a comment clarifying that the apiserver does not recreate CRDs
after deletion.
Replace the ResourceMigrator struct (with function fields) with a
ResourceMigrator interface backed by a generic implementation in a new
migrators sub-package. The interface embeds both the backend and
controller-runtime clients so callers don't pass them at every call site.

A defaultConvert function handles the common case (deep copy + clean
metadata), and WithConvert/WithListOptions options handle the special
cases (policy name migration, IPAM types). The 21 verbose Register()
calls in resources.go collapse to one-liners for common types.

MigrateResourceType, DetectConflicts, and RemapOwnerReferences no longer
take client parameters — they call interface methods directly.
@caseydavenport force-pushed the caseydavenport/datastore-migration-controller-prototype branch from 168f747 to eaafbeb on March 18, 2026 03:05
Replace the old unit tests (fake clients, manual reconcile calls) with a
single envtest-based integration test that drives the full migration
lifecycle against a real kube-apiserver. The test uses a phaseGate
interceptor to pause the controller at each phase transition, giving the
test deterministic control over the async reconcile loop.

Key changes alongside the test:

- Eliminate global migrator registry: migrators are now passed via
  ControllerConfig.Migrators instead of a global Register/GetRegistry
  pattern. NewMigrators() returns the full OSS migrator set.

- Fix missing DaemonSet watch: add a DaemonSet informer to the
  controller so changes to calico-node trigger a reconcile during the
  Converged phase, instead of waiting for the 60s informer resync.

- Fix migration scheme registration: add metav1.AddToGroupVersion so
  the DatastoreMigration types work with controller-runtime's typed
  client (required for envtest).

- Use fake apiregistration client for APIService operations in tests,
  avoiding envtest's auto-recreation of automanaged APIServices.
@caseydavenport force-pushed the caseydavenport/datastore-migration-controller-prototype branch from eaafbeb to bf42633 on March 18, 2026 03:08
caseydavenport and others added 14 commits March 17, 2026 20:16
The RemapOwnerReferences pass was listing all 22 v3 resource types from
the API server to find objects with Calico OwnerRefs. This took ~19s
in tests (and would be worse in production with many resources).

Two changes:

- Collect objects with Calico OwnerRefs during the migration pass
  instead of re-listing afterward. RemapOwnerReferences now takes the
  pre-collected objects directly and an update function, eliminating
  all ListV3 API calls from the remapping path.

- Remove the explicit read-back after CreateV3. controller-runtime's
  Create deserializes the server response into the object in-place
  (including UID), so the separate GetV3 round-trip was redundant.

Test time drops from ~21s to ~2s.
The v1 backend already restores the original v3 policy name from the
projectcalico.org/metadata annotation when reading back resources. The
migration converters were double-converting by calling
migratedPolicyName to strip the default. prefix again.

Remove the custom convert functions for GlobalNetworkPolicy,
NetworkPolicy, StagedGlobalNetworkPolicy, and StagedNetworkPolicy —
they all use the default deep-copy-and-clean converter now.
Add TestLifecycle_ConflictResolution: pre-creates a v3 Tier with a
different spec than v1, verifies the controller enters
WaitingForConflictResolution with the conflict reported in conditions,
then fixes the conflict and verifies the controller proceeds through to
Complete.

Controller changes:
- Add requeueAfter mechanism so reconcile handlers can request a delayed
  requeue without logging an error. handleWaiting uses this to poll for
  conflict resolution.
- Make WaitingPollInterval configurable via ControllerConfig (defaults
  to 10s, tests use 500ms).

Test infrastructure:
- Add startController and createMigrationCR helpers to reduce
  boilerplate across tests.
- Add cleanupMigrationResources to properly tear down between tests
  (strips finalizer, deletes v3 resources).
- fvHelper now carries *testing.T for cleanup registration.
Add TestLifecycle_Rollback: runs migration to completion, then triggers
abort by setting the phase back to Migrating and deleting the CR.
Verifies the abort path restores the APIService, unlocks v1
ClusterInformation, cleans up migrated v3 resources, and removes the
finalizer so the CR is garbage collected.

Also refactors test infrastructure:
- startController helper returns the fake apiregistration client for
  APIService assertions.
- createMigrationCR helper with cleanup that strips the finalizer.
- dmKey package-level var for the well-known CR name.
Add TestLifecycle_DeletionBlockedThenCompleted: deletes the CR while in
Converged (verifies rollback is blocked), then creates the calico-node
DaemonSet so the migration completes, and verifies the finalizer runs
the completed cleanup path and the CR is garbage collected.

Fix a bug where deleting the CR from Converged phase caused the
controller to loop on the "cannot abort" message without ever
progressing to Complete. handleDeletion now falls through to
handleConverged when the phase is Converged, allowing the migration to
finish and the completed cleanup to run.
Four tests using a fake apiextensions client and a mock
ContextController:

- StartsWhenCRDEstablished: inner controller doesn't start until the
  watched CRD has the Established condition.
- StopsWhenCRDDeleted: deleting the CRD cancels the inner controller's
  context.
- RestartsWhenCRDRecreated: inner controller restarts after a
  delete/recreate cycle.
- IgnoresOtherCRDs: CRDs with different names don't trigger the inner
  controller.
convert_test.go (migrators package):
- Clears server-side metadata (ResourceVersion, Generation, etc.)
- Preserves UID and OwnerReferences for migration mapping
- Filters projectcalico.org/metadata annotation, keeps custom ones
- Nils annotations map when only internal annotation present
- Deep copies spec (mutation doesn't affect original)
- Returns error for wrong input type

remap_test.go (migration package):
- Remaps Calico OwnerRef UIDs using the v1→v3 map
- Preserves non-Calico OwnerRefs (e.g., Namespace) untouched
- Skips objects when no mapping exists for their OwnerRef UIDs
- Propagates update errors
- No-ops on empty UID map or empty object list
- Table test for isCalicoAPIGroup edge cases
- handleConverged: return requeueAfter instead of nil when DaemonSet
  Get fails, so the controller retries on a bounded schedule instead of
  relying solely on informer events.

- DeferredCRDController: fix data race on innerCancel by replacing
  direct mutation from informer/goroutine callbacks with channel-based
  signaling (deletedCh, stoppedCh). All innerCancel reads/writes now
  happen in the single select loop. Use chanutil.WriteNonBlocking for
  non-blocking channel sends.

- setPhaseMetric: add WaitingForConflictResolution to the phase gauge
  so the metric doesn't show all zeros in that phase.

- handleWaiting: update status message and conditions with current
  conflict list on each poll, so kubectl output stays fresh.

- Remove ClusterInformation from NewMigrators since the controller
  manages it directly via lockDatastore/unlockV3CRDDatastore.

- SpecsEqual: log a warning when the Spec field is not found on an
  object and the comparison falls back to full DeepEqual.

- ConflictInfo: add Namespace field so namespaced resource conflicts
  (e.g., NetworkPolicy) include the namespace in the message.

- defaultConvert: clear SelfLink along with other server-side fields.

- IsOperatorManaged: return an error instead of swallowing it, so
  transient failures (RBAC, network) cause a retry instead of silently
  assuming manifest mode.
Use NewClientset instead of deprecated NewSimpleClientset for the
apiextensions fake client. Suppress the deprecation warning for the
kube-aggregator fake client where NewClientset isn't available yet.
Fix ST1023 redundant type declaration.
The build was pointing at my fork's operator branch which doesn't
exist on the upstream remote, breaking e2e test setup.
…tests

NewClientset() in k8s 1.35 ships with an empty SMD schema for CRD types,
causing Create() to fail with "no type found matching". Use the deprecated
NewSimpleClientset() until the upstream apiextensions-apiserver module is
fixed.
The migration controller FV tests use envtest.Environment which requires
etcd and kube-apiserver binaries. Add setup-envtest as a prerequisite
and pass KUBEBUILDER_ASSETS to run-uts so the test binary can find them.
The migration FV tests used a hardcoded relative path (const repoRoot =
"../../../..") to locate CRD directories for envtest. This only works
when the test binary CWD matches the source package directory, which
isn't the case when run-uts executes pre-compiled binaries from the
kube-controllers/ directory.

Add a FindRepoRoot() function in libcalico-go/lib/testutils that walks
up from CWD to find the monorepo root via go.mod, and use it in both
the migration FV tests and the validation tests.
Add unit tests for convertIPAMBlock and convertIPAMHandle covering the
mainline conversion path, nil UID handling, and wrong-type error cases.

Add TestLifecycle_BackendListError to verify the controller retries and
recovers when the v1 backend returns a transient error during migration.
Add a mutex to mockBackendClient so the test can safely read listCounts
and clear listErrors from a separate goroutine.
…ests

Set up logrus with the calico Formatter and DebugLevel in the migration
FV tests for consistent log output. Bump envtest REST config QPS/Burst
to avoid throttling under concurrent test load.

Suppress spurious "client rate limiter" errors that were logged when the
controller context is canceled during test cleanup — check ctx.Err()
before logging reconcile errors.
@coutinhop coutinhop merged commit df54aa5 into projectcalico:master Mar 19, 2026
3 checks passed