OPNET-569: Do not run resolv-prepender from NM dispatcher #4654

Merged: 1 commit merged on Oct 29, 2024

@mkowalski commented Oct 22, 2024

This PR changes how NetworkManager communicates changes in the environment and how on-prem-resolv-prepender picks them up.

Previously, the NM dispatcher script contained logic that would trigger a systemd service (either starting or restarting it). This proved to be problematic, prone to race conditions, and in principle a complicated design.

We are now moving to a model where systemd decides on its own what to restart and when, in our case by leveraging systemd path units.

The NM dispatcher is now responsible only for writing a correct environment file (we need that in case DNS search domains change). A systemd path unit observes the file and, if a change is detected, triggers whatever is necessary.
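
As a rough illustration of the dispatcher's reduced role, the shell sketch below rewrites an environment file only when the DNS search domains differ from what is already on disk, so that a path unit watching the file would not be woken up needlessly. The function name `write_env_if_changed`, the variable name `DNS_SEARCH_DOMAINS`, and any file location passed to it are hypothetical and are not taken from this PR.

```shell
#!/bin/sh
# Hypothetical helper (not the code from this PR): write the environment
# file only when its content would change. The dispatcher never starts or
# restarts a service itself; it only maintains this file.
write_env_if_changed() {
    env_file="$1"   # path of the environment file (assumed location)
    domains="$2"    # space-separated DNS search domains
    new_content="DNS_SEARCH_DOMAINS=\"${domains}\""

    # Leave the file untouched when it already holds exactly this content.
    if [ -f "$env_file" ] && [ "$(cat "$env_file")" = "$new_content" ]; then
        echo "unchanged"
        return 0
    fi

    # Create the parent directory if needed, then write the new content.
    mkdir -p "$(dirname "$env_file")"
    printf '%s\n' "$new_content" > "$env_file"
    echo "updated"
}
```

A caller could invoke it as `write_env_if_changed /run/example/env "$domains"` (a made-up path). Under this model, a systemd path unit using a directive such as `PathChanged=` on that file would be the only component deciding whether the service needs to run again.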

The NM dispatcher script contains other logic that we are keeping as is, because it is out of scope for this refactor.

Closes: OPNET-569

@openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) on Oct 22, 2024
@openshift-ci-robot commented Oct 22, 2024

@mkowalski: This pull request references OPNET-569 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@mkowalski

/test e2e-metal-ipi

openshift-ci bot commented Oct 22, 2024

@mkowalski: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test 4.12-upgrade-from-stable-4.11-images
  • /test cluster-bootimages
  • /test e2e-aws-ovn
  • /test e2e-aws-ovn-upgrade
  • /test e2e-gcp-op
  • /test e2e-gcp-op-single-node
  • /test e2e-hypershift
  • /test images
  • /test unit
  • /test verify

The following commands are available to trigger optional jobs:

  • /test 4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade
  • /test bootstrap-unit
  • /test e2e-aws-disruptive
  • /test e2e-aws-ovn-fips
  • /test e2e-aws-ovn-fips-op
  • /test e2e-aws-ovn-upgrade-out-of-change
  • /test e2e-aws-ovn-workers-rhel8
  • /test e2e-aws-proxy
  • /test e2e-aws-serial
  • /test e2e-aws-single-node
  • /test e2e-aws-upgrade-single-node
  • /test e2e-aws-workers-rhel8
  • /test e2e-azure
  • /test e2e-azure-ovn-upgrade
  • /test e2e-azure-ovn-upgrade-out-of-change
  • /test e2e-azure-upgrade
  • /test e2e-gcp-op-techpreview
  • /test e2e-gcp-ovn-rt-upgrade
  • /test e2e-gcp-rt
  • /test e2e-gcp-rt-op
  • /test e2e-gcp-single-node
  • /test e2e-gcp-upgrade
  • /test e2e-metal-assisted
  • /test e2e-metal-ipi-ovn-dualstack
  • /test e2e-metal-ipi-ovn-ipv6
  • /test e2e-openstack
  • /test e2e-openstack-dualstack
  • /test e2e-openstack-externallb
  • /test e2e-openstack-hypershift
  • /test e2e-openstack-parallel
  • /test e2e-openstack-singlestackv6
  • /test e2e-ovirt
  • /test e2e-ovirt-upgrade
  • /test e2e-ovn-step-registry
  • /test e2e-vsphere
  • /test e2e-vsphere-ovn-upi
  • /test e2e-vsphere-ovn-upi-zones
  • /test e2e-vsphere-ovn-zones
  • /test e2e-vsphere-upgrade
  • /test okd-e2e-aws
  • /test okd-e2e-gcp-op
  • /test okd-e2e-upgrade
  • /test okd-e2e-vsphere
  • /test okd-images
  • /test okd-scos-images
  • /test security

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-machine-config-operator-master-bootstrap-unit
  • pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn
  • pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn-upgrade
  • pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn-upgrade-out-of-change
  • pull-ci-openshift-machine-config-operator-master-e2e-azure-ovn-upgrade-out-of-change
  • pull-ci-openshift-machine-config-operator-master-e2e-gcp-op
  • pull-ci-openshift-machine-config-operator-master-e2e-gcp-op-single-node
  • pull-ci-openshift-machine-config-operator-master-e2e-gcp-op-techpreview
  • pull-ci-openshift-machine-config-operator-master-e2e-hypershift
  • pull-ci-openshift-machine-config-operator-master-e2e-openstack
  • pull-ci-openshift-machine-config-operator-master-e2e-vsphere-ovn-upi
  • pull-ci-openshift-machine-config-operator-master-e2e-vsphere-ovn-upi-zones
  • pull-ci-openshift-machine-config-operator-master-e2e-vsphere-ovn-zones
  • pull-ci-openshift-machine-config-operator-master-images
  • pull-ci-openshift-machine-config-operator-master-security
  • pull-ci-openshift-machine-config-operator-master-unit
  • pull-ci-openshift-machine-config-operator-master-verify

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mkowalski

/test e2e-metal-ipi-ovn-dualstack
/test e2e-metal-ipi-ovn-ipv6

@mkowalski

/test e2e-metal-ipi-ovn-dualstack
/test e2e-metal-ipi-ovn-ipv6
/test e2e-openstack
/test e2e-vsphere-ovn-upi

@mkowalski

/test e2e-openstack
/test e2e-metal-ipi-ovn-dualstack

@mkowalski

/retest-required

@cybertron

/lgtm

This removes less code than I expected, but looking through it I don't see anything else that I'm confident we could remove so let's go with this version. I guess it was wishful thinking on my part. :-)

@openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) on Oct 28, 2024
@LorbusChris

/approve

openshift-ci bot commented Oct 29, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cybertron, LorbusChris, mkowalski

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Oct 29, 2024
@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD c499c1b and 2 for PR HEAD 3f9aced in total

openshift-ci bot commented Oct 29, 2024

@mkowalski: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-azure-ovn-upgrade-out-of-change | 3f9aced | link | false | /test e2e-azure-ovn-upgrade-out-of-change |

Full PR test history. Your PR dashboard.

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD c499c1b and 2 for PR HEAD 3f9aced in total

@openshift-merge-bot bot merged commit b3a562b into openshift:master on Oct 29, 2024
18 of 20 checks passed
@openshift-bot

[ART PR BUILD NOTIFIER]

Distgit: ose-machine-config-operator
This PR has been included in build ose-machine-config-operator-container-v4.18.0-202410292238.p0.gb3a562b.assembly.stream.el9.
All builds following this will include this PR.

@mkowalski deleted the OPNET-569 branch on January 10, 2025 at 15:33
@mkowalski

/jira backport release-4.17,release-4.16,release-4.15

@openshift-ci-robot

@mkowalski: Missing required branches for backport chain:

  • openshift-4.18 OR release-4.18,

@mkowalski

/jira backport release-4.18,release-4.17,release-4.16,release-4.15

@openshift-ci-robot

@mkowalski: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.18
/cherrypick release-4.17
/cherrypick release-4.16
/cherrypick release-4.15

@openshift-cherrypick-robot

@openshift-ci-robot: new pull request created: #4783

@openshift-cherrypick-robot

@openshift-ci-robot: new pull request created: #4784

@openshift-cherrypick-robot

@openshift-ci-robot: new pull request created: #4785

@openshift-cherrypick-robot

@openshift-ci-robot: new pull request could not be created: failed to create pull request against openshift/machine-config-operator#release-4.18 from head openshift-cherrypick-robot:cherry-pick-4654-to-release-4.18: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","code":"custom","message":"No commits between openshift:release-4.18 and openshift-cherrypick-robot:cherry-pick-4654-to-release-4.18"}],"documentation_url":"https://docs.github.com/rest/pulls/pulls#create-a-pull-request","status":"422"}
