-
Notifications
You must be signed in to change notification settings - Fork 66
🌱 Add test-operator with custom controller #2070
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2070 +/- ##
==========================================
+ Coverage 73.33% 73.43% +0.09%
==========================================
Files 77 77
Lines 7076 7076
==========================================
+ Hits 5189 5196 +7
+ Misses 1545 1540 -5
+ Partials 342 340 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
3fb8ef7
to
693a355
Compare
Signed-off-by: Per Goncalves da Silva <[email protected]>
Signed-off-by: Per Goncalves da Silva <[email protected]>
Signed-off-by: Per Goncalves da Silva <[email protected]>
@@ -246,11 +246,18 @@ test-unit: $(SETUP_ENVTEST) envtest-k8s-bins #HELP Run the unit tests | |||
$(UNIT_TEST_DIRS) \ | |||
-test.gocoverdir=$(COVERAGE_UNIT_DIR) | |||
|
|||
TEST_OPERATOR_CONTROLLERS_HOME=./testdata/images/controllers | |||
TEST_OPERATOR_CONTROLLERS=v1.0.0 v2.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this missing v1.3.0
? Or is it not needed because it's not actually built?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, this is just for building the controller binaries. The 1.3.0 version uses the 1.0.0 controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I did not understood it as well because
- We need to have the "operator"
- Then, we need run the commands from this project to generate the manager (it will be image)
- Then, we need generate the bundle for each case scenario (using SDK or OPM for example)
- Then, we need add those bundles to a catalogue
- Then, create a Catalog Source for those
The 1.3.0 version uses the 1.0.0 controller
we have a verion that we use the bin generated by another?
I did not understood it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related to the currently existing e2e tests. The v1.3.0 bundle in the PR substitutes the v2.0.0 bundle in main [this] commit.
There's only a change in the included manifests (namely the configmap) between v1.0.0 and v1.3.0. I.e. v1.3.0 still uses the same controller as v1.0.0. This make target builds only the controllers. The bundle image itself is built in testdata/push - which is how it's currently done anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these tests even run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the Makefile, I don't think so. I wasn't too concerned though, they are fairly simplistic and the code will anyway be exercised by the e2es. Let me see if I can add them in the Makefile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added them here: https://github.com/operator-framework/operator-controller/pull/2070/files#diff-76ed074a9305c04054cdebb9e9aad2d818052b07091de1f20cad0bbac34ffb52R230
I don't want to pollute the o-c project's coverage numbers with test-operator unit tests. So, I've added a separate make target and CI unit test run step.
Though, I still don't know if its worth the CO2 emissions for something a) will rarely change b) will be exercised by the e2e tests anyway.
...data/images/bundles/test-operator/v2.0.0/internal/controller/testoperator_controller_test.go
Outdated
Show resolved
Hide resolved
testdata/images/bundles/test-operator/v2.0.0/internal/webhook/testoperator_webhook_test.go
Outdated
Show resolved
Hide resolved
if _, err := w.Write(fileBytes); err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we concerned about filesize here? In that we only read/write once per file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I get it. I wouldn't be too worried about the size of the files we're packing. They tend to be relatively small text files with the exception of the controller binaries, which are currently around 50MB. We can't pipe the bytes directly from reader to writer because we need the file size for the header. I think all of this code will go away soon anyway (given our convo with Joe).
.PHONE: test-test-operator | ||
test-test-operator: | ||
KUBEBUILDER_ASSETS="$(shell $(SETUP_ENVTEST) use -p path $(ENVTEST_VERSION) $(SETUP_ENVTEST_BIN_DIR_OVERRIDE))" \ | ||
CGO_ENABLED=1 go test -tags '$(GO_BUILD_TAGS)' -count=1 -race -short ./testdata/images/bundles/test-operator/... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey! 👋
I think we should mock the scenarios just like our users would experience them. This means:
- The flow to generate a project and a bundle should mirror what a real user does. ( create project with SDK or Kubebuilder, inject the code, run the commands and generate a BUNDLE )
- We should have a script that automates those steps, so it's repeatable.
Therefore, we can keep our samples well-maintained.
Now, doing all this at once might be a bit much, so I’m totally fine starting with fixed samples, and then we can evolve toward full automation in follow-ups and discuss all properly.
However: We need to change the file structure:
We can’t organize it as bundles/test-operator | versions
because:
- We don’t have bundles — we have the actual projects.
- The current path is strange — why are we putting projects under
images/
? ⚠️ The directory name matters — tools like Kubebuilder or the SDK treat that as the project name. So if we usev1.0.0
as the folder name, automation breaks.
Proposal
Could we change it here to be like:
testdata/operators/
├── v1.0.0/
│ └── test-operator/
├── v1.3.0/
│ └── test-operator/
└── v2.0.0/
└── test-operator/
This way:
- Each version gets its own isolated test setup.
- The project directory stays named
test-operator
, so tooling works as expected. - It’s clean, easy to automate, and mirrors user behavior more closely.
Let me know what you think!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By following my suggestion here for we have a real Operator that can be maintained via automation #2070 (comment), could we please use a name like memcached-operator instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda get what you're trying to do, but I don't think that's scalable or maintainable given the necessary controller, webhook, and crd code that will be required by the test-operators (also between versions of it), which will either require patching or copying over generated code, which can be brittle, error prone, and create problems that are annoying to debug and fix - the only benefit being that we'll know when the tools no longer work for building the fixtures.
The point of the test-operator isn't to exercise the steps involved in creating an operator and to make sure that works with OLMv1. We already have a test that exercises this developer flow. The point is to have a small, simple, test-operator which we own and can be used to exercise different features through out our e2e tests.
I'd prefer not to call it memcached-operator since it isn't. What's wrong with test-operator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The raised problem here is with the path that does not allow we have those samples well maintained, well documented and well generated in the long term, and the names does not reflect to the reality, could we
testdata/operators/
├── v1.0.0/
│ └── test-operator/
├── v1.3.0/
│ └── test-operator/
└── v2.0.0/
└── test-operator/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why they can't be well maintained. It seems more natural to group the versions under the operator than the operator under the versions...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @perdasilva
Great work 🥇
Following my main concerns/suggestions:
-
- Could we please ensure that the paths are better organised and allow automation: https://github.com/operator-framework/operator-controller/pull/2070/files#r2194412567 before we merge this one?
-
- Could we please use the tool to generate a mock sample so that we have a functional Operator with a real-world case that we can expand upon? By utilising the tools, we can automate many aspects and help ensure that the solution is well-maintained in the long term. See; #2070 (comment)
.github/workflows/unit-test.yaml
Outdated
@@ -23,6 +23,10 @@ jobs: | |||
run: | | |||
make test-unit | |||
|
|||
- name: Run test-operator unit tests | |||
run: | | |||
make test-test-operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should introduce a new GitHub Action, something like:
test-testdata / test-samples
This job would be responsible for running tests that validate our test samples.
Why this helps:
- Keeps test ownership clear — each job has a focused responsibility.
- Keeps tests running in parallel, so we don’t slow down CI.
- When something fails, it's easier to identify what broke and why — especially if the failure is specific to testdata samples.
Also, we would have a matrix to run all tests for all samples, calling the make test scaffold for the sample itself see examples:
- To run the lint: https://github.com/kubernetes-sigs/kubebuilder/blob/master/.github/workflows/lint-sample.yml
- To run the e2e tests: https://github.com/kubernetes-sigs/kubebuilder/blob/master/.github/workflows/test-e2e-book.yml
If we do as our Operator Authors, and use the tool to properly generate the samples we will have have a Makefile in the sample that has the targets to run the tests. See, example: https://github.com/kubernetes-sigs/kubebuilder/blob/master/testdata/project-v4-with-plugins/Makefile#L60-L63
So, the github action to verify if samples are OK would call those directly and we should not created those in the Makefile of the project operator-controller itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the test over to another job. Don't know that it will be faster since it needs to be scheduled, cloned, go modded, etc. again...
Makefile
Outdated
@@ -246,11 +246,24 @@ test-unit: $(SETUP_ENVTEST) envtest-k8s-bins #HELP Run the unit tests | |||
$(UNIT_TEST_DIRS) \ | |||
-test.gocoverdir=$(COVERAGE_UNIT_DIR) | |||
|
|||
|
|||
.PHONE: test-test-operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO:
We should not have it
Instead we should have a real mock generated with the tools
and this mock will have a Makefile
And we call the targets available in the mock directly.
See, example: https://github.com/kubernetes-sigs/kubebuilder/blob/master/testdata/project-v4-with-plugins/Makefile#L60-L63
@@ -0,0 +1,137 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's build the samples with the tooling and have a more real case scenario that can be automatically well-maintained instead. Otherwise it will become a burn to keep those maintained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kinds of changes do you foresee? I think the only thing I don't really like is that we don't generate the manifest automatically. I think this is something we can overcome in a follow-up. I'd prefer not to bootstrap entire operator projects as part of fixture building though.
.github/workflows/unit-test.yaml
Outdated
- name: Run test-operator unit tests | ||
run: | | ||
make test-test-operator | ||
|
||
- uses: codecov/[email protected] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please use the tools to generate the mocks?
Note that we have a plugin that generates a project with controller implementation, controller tests, which are a sample/mock, much better than this test operator, which has zero implementation.
Following the steps:
- operator-sdk | kubebuilder init
- operator-sdk | kubebuilder create api --group example.com --version v1alpha1 --kind Memcached --image=memcached:1.6.26-alpine3.19 --image-container-command="memcached,--memory-limit=64,-o,modern,-v" --image-container-port="11211" --run-as-user="1001" --plugins="deploy-image/v1-alpha"
- make all
That will generate a FULL example.
- with a controller implemented with: https://github.com/kubernetes-sigs/kubebuilder/blob/master/testdata/project-v4-with-plugins/internal/controller/memcached_controller.go
- with tests implemented for this controller: https://github.com/kubernetes-sigs/kubebuilder/blob/master/testdata/project-v4-with-plugins/internal/controller/memcached_controller_test.go
- Also, the mock will have e2e tests that can be called in the github actions: https://github.com/kubernetes-sigs/kubebuilder/tree/master/testdata/project-v4-with-plugins/test/e2e to validate the sample if we wish so
However, if we use kubebuilder to do it, Kubuilder already validate the mocks, so we would not need to have an action to test them out; we could skip that.
Furthermore, in the future, we can indeed mock webhooks with the tooling, and a conversion, for example, is almost 100% scaffolded with the tool. Example
# Create API to check webhook --conversion from v1 to v2
$kubebuilder create api --group example.com --version v1 --kind Wordpress --controller=true --resource=true
$kubebuilder create api --group example.com --version v2 --kind Wordpress --controller=false --resource=true
$kubebuilder create webhook --group example.com --version v1 --kind Wordpress --conversion --spoke v2
The above will result in all scaffolds doing the conversion webhook with the spoke approach.
(I would suggest we use kubebuilder and SDK to do the bundle only as the users have been doing after SDK deprecation) In the future, we replace SDK to use KPM ( new tooling to do bundle only ) and we have lower maintainance burn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I answered this in the previous comment, but I'll re-iterate. The point of the test-operator isn't to exercise the tools required to generate an operator. I think this approach works for simple cases, but as soon as you have custom code: reconciler implementations, webhook implementations, changes to CRDs it will become brittle and horrible to maintain as it inevitably will require either code overwriting or patching. Also, given that we already have a test that exercises the dev workflow, I don't know what this buys us...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also rather not use the memcached implementation since part of the exercise it to use something we own and have control over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal here is not to just exercise tooling at all.
The idea is to create a sample in the same way users would — with real-case scenarios —
and ensure they are maintainable over time via automation.
This approach allows us to:
- Keep samples well maintained through automation
- Clearly show what changed from one version to another
- Easily understand the generation process by reading the code
- Update samples quickly and at scale (without manual work)
We know these samples will need updates. Doing that manually doesn't scale.
With automation, we can:
- Make changes safely and consistently
- Always retain control over the output
- Keep every change explicit — both what changed and why
Example By looking at the automation, we can see:
- What code do we inject ( with a description of why )
- What commands were called ( we create webhooks and etc with the tooling, save time, save the burn to manually maintain, ensure that we are doing as the huge majority of our users )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to go down this path. So, I'll close this and focus on something else.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Thanks ^^ I'd rather not make the developer workflow something that's in the critical path of creating our fixtures. Part of the motivation was to have control over the operator (so, I'd rather not use memcached). The point of the fixtures is the test olm and not the developer workflow + commingling custom and generated code is brittle. So, I'd rather not pursue this path if we can avoid it. |
Signed-off-by: Per Goncalves da Silva <[email protected]>
@@ -246,11 +246,24 @@ test-unit: $(SETUP_ENVTEST) envtest-k8s-bins #HELP Run the unit tests | |||
$(UNIT_TEST_DIRS) \ | |||
-test.gocoverdir=$(COVERAGE_UNIT_DIR) | |||
|
|||
|
|||
.PHONY: test-test-operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a function of the Operator/Mock sample itself.
It should not be part of the operator-controller
Makefile at all — it belongs inside the sample.
Placing it in the main Makefile goes against important separation-of-concerns principles.
It introduces coupling that will make these components harder to maintain over time.
Keeping this logic self-contained in the sample is key to preserving clarity and long-term maintainability.
@@ -1,10 +1,6 @@ | |||
annotations: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the bundle generated?
What happens when we change the mock sample?
Shouldn't the bundle be re-generated to reflect those changes?
This is exactly why we need automation in place — to ensure that:
- When we push a PR, the samples are properly updated
- The bundle reflects those changes accurately
- We avoid manual steps that are error-prone and hard to maintain
Without automation, it's too easy for samples and bundles to get out of sync.
Depends on #2080
Description
Adds the scaffolding for the test-operator in versions v1 and v2 using operator-sdk. The test-operator is a basic echo service.
The test-operator supports AllNamespaces, SingleNamespace, and OwnNamespace install modes.
v1.0.0: is just the crd and the existing configmap resource
v1.3.0: has the same crd as v1.0.0 and the updated configmap previously part of v2.0.0
v2.0.0: has validating, mutating and conversion webhooks
I've had to modify an e2e that expected v2.0.0 of the test operator to be installed since v2 now includes webhooks and support is not enabled by default (yet). I've added a v1.3.0 as a stand-in for the previous v2 and updated the catalogs accordingly.
Because the test-operator now has its own controller, I've also had to update testdata/push to handle the controller image building/pushing
As it is, the test-operator should make a good fixture for testing the Single/OwnNamespace and Webhook support features.
Thing we should feel confident in doing in the future
Risks
v1
A reconciled test-operator v1 resource would look like this:
The v1 operator also includes a validating and mutating webhook. The validating webhook validates
.spec.message
against a keyword and rejects admission if its found. The mutating webhook is used as a "defaulting" webhook that sets.spec.message
to the default message ofEcho
.v2
A reconciled test-operator v2 resource would look like this:
The same webhooks are available for v2 apis + a conversion webhook to support the CRD upgrade.
Reviewer Checklist