[WIP] [TestLeaderElection] e2e: build a descheduler image and run the descheduler as a pod #1497

Open
wants to merge 4 commits into
base: master
from

Conversation

hsunwenfang

[TODO] policy function like func tooManyRestartsPolicy()
[TODO] func initPluginRegistry()

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 15, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign damemi for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 15, 2024
@k8s-ci-robot
Contributor

Welcome @hsunwenfang!

It looks like this is your first PR to kubernetes-sigs/descheduler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/descheduler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @hsunwenfang. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 15, 2024
@hsunwenfang
Author

[ASK]
Hi @ingvagabund
I'm working on TestLeaderElection and am referring to your commit for TestTooManyRestarts:
ab467a5
At L57 of e2e_toomanyrestarts_test.go, "sigs.k8s.io/descheduler/pkg/framework/plugins/removepodshavingtoomanyrestarts" is imported,
but there is no corresponding plugin folder for e2e_leaderelection_test.go.
Should "sigs.k8s.io/descheduler/pkg/framework/plugins/defaultevictor" be used instead,
or is there another way to do it?
Thanks!!

"sigs.k8s.io/descheduler/pkg/descheduler"
)

// Should use something like "sigs.k8s.io/descheduler/pkg/framework/plugins/removepodshavingtoomanyrestarts"
Contributor

yes, in this case it would be "sigs.k8s.io/descheduler/pkg/framework/plugins/podlifetime".

@ingvagabund
Contributor

ingvagabund commented Aug 15, 2024

While there is no corresponding folder for leaderelection

leaderelection does not have a corresponding plugin. It's a configuration of the whole descheduler. So first you need to create a podlifetime policy (similar to tooManyRestartsPolicy). And then create a descheduler configuration with the leader election enabled.

@ingvagabund
Contributor

func initPluginRegistry()

No need to update initPluginRegistry as it already registers all the necessary plugins.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 15, 2024
},
{
Name: defaultevictor.PluginName,
Args: &defaultevictor.DefaultEvictorArgs{
Author

The args in &defaultevictor.DefaultEvictorArgs follow #1472,
which may not be the proper args here.

@@ -124,10 +157,18 @@ func TestLeaderElection(t *testing.T) {
}
t.Logf("Removed kube-system/descheduler lease")

tc := struct {
Author

I use a single tc here instead of the tests []struct in #1472.
I'm not sure whether it would be better to use a []struct that covers ns1 and ns2 as separate cases,
but I suppose ns1 and ns2 should belong to the same test case.

Contributor

@ingvagabund ingvagabund Aug 15, 2024

There's currently only a single test to run, with two descheduler pods running in the same namespace (kube-system).

@hsunwenfang
Author

hsunwenfang commented Aug 15, 2024

Hi @ingvagabund
Thank you for all the guidance.
I added a commit and some comments; please take a look.
I also ran 'make test-e2e' locally.
It seems to fail at the first test, e2e_duplicatepods.
Should there be a separate PR to address that as well?
+ go test sigs.k8s.io/descheduler/test/e2e/ -v
=== RUN TestRemoveDuplicates
e2e_duplicatepods_test.go:47: Error during client creation with unable to build in cluster config: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined
--- FAIL: TestRemoveDuplicates (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xc8 pc=0x1d0d319]

Big thanks again!
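
A note on reading the log above: the message suggests the client fell back to in-cluster configuration, which is what client-go tries when no kubeconfig path is given. Below is a minimal pre-flight sketch under the assumption that the e2e tests take the kubeconfig path from the KUBECONFIG environment variable (not verified here). The function name clusterConfigSource is hypothetical, and getenv is injected so the check can be exercised without touching the real environment.

```go
package main

import "errors"

// clusterConfigSource is a hypothetical pre-flight check. It reports which
// configuration source a client would likely use, given an environment lookup.
// Assumption: the tests take the kubeconfig path from KUBECONFIG.
func clusterConfigSource(getenv func(string) string) (string, error) {
	if path := getenv("KUBECONFIG"); path != "" {
		return "kubeconfig file " + path, nil
	}
	// In-cluster configuration needs both service variables to be set.
	if getenv("KUBERNETES_SERVICE_HOST") != "" && getenv("KUBERNETES_SERVICE_PORT") != "" {
		return "in-cluster", nil
	}
	return "", errors.New("neither KUBECONFIG nor the in-cluster service variables are set")
}
```

In a test this could be called as clusterConfigSource(os.Getenv) before creating the client, failing early instead of continuing with a nil client.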

@ingvagabund
Contributor

The goal here is to create two descheduler Deployments: https://github.com/kubernetes-sigs/descheduler/blob/master/test/e2e/e2e_toomanyrestarts_test.go#L213. Each deployment with its policy in its own configmap: https://github.com/kubernetes-sigs/descheduler/blob/master/test/e2e/e2e_toomanyrestarts_test.go#L195. Both deployments running in the same namespace.
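
To make the layout concrete, here is a small sketch of the plan in plain Go, without any Kubernetes types. Everything named here is illustrative: deschedulerInstance, planInstances, the deployment and ConfigMap names, and the policy path are assumptions, and the lease name and namespace are taken from the "Removed kube-system/descheduler lease" log line in the test. The --leader-elect flags are the standard leader election flags the descheduler binary accepts.

```go
package main

import "fmt"

// Assumed values for the sketch; the policy path in particular is a guess.
const (
	leaseNamespace = "kube-system"
	leaseName      = "descheduler"
	policyPath     = "/policy-dir/policy.yaml"
)

// deschedulerInstance is a hypothetical description of one descheduler Deployment.
type deschedulerInstance struct {
	DeploymentName  string   // name of the Deployment
	ConfigMapName   string   // ConfigMap holding this instance's policy
	Namespace       string   // namespace the Deployment runs in
	TargetNamespace string   // namespace the instance's policy applies to
	Args            []string // container arguments
}

// planInstances creates one instance per target namespace. All instances run in
// kube-system and compete for the same lease, so only one of them should be
// active at any time.
func planInstances(targetNamespaces []string) []deschedulerInstance {
	instances := make([]deschedulerInstance, 0, len(targetNamespaces))
	for i, ns := range targetNamespaces {
		name := fmt.Sprintf("descheduler-%d", i+1)
		instances = append(instances, deschedulerInstance{
			DeploymentName:  name,
			ConfigMapName:   name + "-policy",
			Namespace:       leaseNamespace,
			TargetNamespace: ns,
			Args: []string{
				"--policy-config-file", policyPath,
				"--leader-elect=true",
				"--leader-elect-resource-namespace=" + leaseNamespace,
				"--leader-elect-resource-name=" + leaseName,
			},
		})
	}
	return instances
}
```

With planInstances([]string{"ns1", "ns2"}), both instances share the lease but carry different policies, so the test can check that pods are evicted in only one of the two namespaces.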

@hsunwenfang
Author

hsunwenfang commented Aug 16, 2024

Hi @ingvagabund
Thank you for the explanation; allow me to clarify further.
So the goal here is to replace the direct call to RunDeschedulerStrategies() with a deployment.
Q1
Should the namespace for the descheduler deployments be 'kube-system' or "e2e-" + strings.ToLower(t.Name())?
Q2
How do the descheduler servers in ns1 and ns2 interact with the deployments then?
Q3
https://github.com/kubernetes-sigs/descheduler/blob/master/test/e2e/e2e_toomanyrestarts_test.go#L186
It seems rs is a returned struct that is not used by the code afterwards. Should L186-191 be removed?

Thx!

@ingvagabund
Contributor

ingvagabund commented Aug 18, 2024

Should the namespace for the descheduler deployments be 'kube-system' or "e2e-" + strings.ToLower(t.Name())?

kube-system

How do the descheduler servers in ns1 and ns2 interact with the deployments then?

If you mean a descheduler deployment, then:

  • descheduler server for ns1 will get replaced with one descheduler deployment
  • descheduler server for ns2 will get replaced with another descheduler deployment

If you mean leaderelection deployments created through createDeployment:

https://github.com/kubernetes-sigs/descheduler/blob/master/test/e2e/e2e_toomanyrestarts_test.go#L186
It seems rs is a returned struct that is not used by the code afterwards. Should L186-191 be removed?

Correct. This is a leftover I forgot to remove.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 19, 2024
@hsunwenfang
Author

hsunwenfang commented Aug 22, 2024

Hi @ingvagabund
Thanks as always for the help.
During testing, my minikube node failed to pull docker.io/library/descheduler,
and my Docker account does not seem to be authorized either:

Trying to pull docker.io/library/descheduler:latest...
Error: initializing source docker://descheduler:latest: reading manifest latest in docker.io/library/descheduler: errors:
denied: requested access to the resource is denied
unauthorized: authentication required

Plus, in
https://github.com/kubernetes-sigs/descheduler/blob/master/kubernetes/deployment/deployment.yaml#L22
the image is different:
registry.k8s.io/descheduler/descheduler

I expected run-e2e-tests.sh to use the locally built image instead of docker.io/library/descheduler:${IMAGE_TAG},
which I think lives on Docker Hub.

How should I authenticate for this image, or how can I use the locally built image with run-e2e-tests.sh? Thx!

@ingvagabund
Contributor

ingvagabund commented Aug 29, 2024

You can update DESCHEDULER_IMAGE to point to localhost/descheduler:latest. This could probably be mentioned in the comment in run-e2e-tests.sh. In addition, DESCHEDULER_IMAGE could be changed to something like export DESCHEDULER_IMAGE="${DESCHEDULER_IMAGE_REPOSITORY:-docker.io/library}/descheduler:${IMAGE_TAG}".
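
For reference, the `${VAR:-default}` expansion uses the default when the variable is unset or empty. The same resolution is sketched below in Go, in case the lookup ever moves into the test code. The function name deschedulerImage is hypothetical, and getenv is injected so the logic can be checked without changing the real environment.

```go
package main

// deschedulerImage mirrors the shell expression
// "${DESCHEDULER_IMAGE_REPOSITORY:-docker.io/library}/descheduler:${IMAGE_TAG}".
// It is a hypothetical helper, not part of the descheduler repo.
func deschedulerImage(getenv func(string) string) string {
	repo := getenv("DESCHEDULER_IMAGE_REPOSITORY")
	if repo == "" {
		// Same fallback as the ":-" expansion: used when unset or empty.
		repo = "docker.io/library"
	}
	return repo + "/descheduler:" + getenv("IMAGE_TAG")
}
```

Passing os.Getenv as the lookup gives the environment-driven behaviour; with DESCHEDULER_IMAGE_REPOSITORY=localhost and IMAGE_TAG=latest the result is localhost/descheduler:latest.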

@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 20, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 19, 2025
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

4 participants