Available CRDs check feature (with script) #2741

Closed
wants to merge 1 commit into from

Conversation

raaizik

@raaizik raaizik commented Aug 8, 2024

Changes

Reasons for this enhancement:

  • A controller cannot set up a watch for a CRD that is not installed on the cluster; attempting to do so makes the operator panic.
  • We are not aware of any way to add a watch later without running into client cache issues.

How does the enhancement work around the issue:

  • On startup, detect which CRDs are available (out of a fixed list) and skip watches for those that are not.
  • At the start of each reconcile iteration, revalidate which CRDs are now available. If a CRD of interest has become available, exit the operator with a known exit code (42).
  • Have the pod command detect the exit code and, if it is the known exit code (42), restart the process.

This process guarantees that the pod restarts when a new CRD of interest becomes available, which in turn avoids the following issues:

  • The pod will not get into the CrashLoopBackOff state.
  • There is no chance that the pod becomes unschedulable after the restart because of missing resources.

Note

#2712 is considered as a potential enhancement by @iamniting

Signed-off-by: raaizik <[email protected]>
Co-Authored-By: Rewant Soni <[email protected]>

openshift-ci bot commented Aug 8, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: raaizik
Once this PR has been reviewed and has the lgtm label, please assign blaineexe for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@raaizik

raaizik commented Aug 8, 2024

/cc @nb-ohad @umangachapagain


openshift-ci bot commented Aug 8, 2024

@raaizik: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/ocs-operator-bundle-e2e-aws | ba19b8b | link | true | /test ocs-operator-bundle-e2e-aws

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@raaizik raaizik closed this Aug 8, 2024