-
Notifications
You must be signed in to change notification settings - Fork 413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-48469: GCP: Update /etc/hosts file when ClusterHostedDNS is enabled #4800
base: master
Are you sure you want to change the base?
Conversation
@sadasu: This pull request references Jira Issue OCPBUGS-48469, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
1bc96ca
to
3280ff8
Compare
apiServerIntURL={{ .Infra.Status.APIServerInternalURL }} | ||
# Add the/etc/hosts configuration file | ||
mkdir -p /etc/hosts/conf.d | ||
cat <<EOF | tee /etc/hosts/conf.d/etc-hosts.conf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could maybe name the file api.conf
or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping that naming it as etc-hosts.conf
would be it obvious that this file contains some configuration for /etc/hosts
. Happy to call it api.conf
is it contains information about resolving the API/-Int urls.
{{ else }} | ||
exit 0 | ||
{{ end }} | ||
if [ -z "${apiIntLBIPs}" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be:
if [ -z "${apiIntLBIPs}" ]; then | |
if [ -z "{{$apiIntLBIPs}}" ]; then |
?
We appear to be missing a line like:
apiIntLBIPs={{$apiIntLBIPs}}
to get the template variable that we defined on line 19 into a bash variable.
mkdir -p /etc/hosts/conf.d | ||
cat <<EOF | tee /etc/hosts/conf.d/etc-hosts.conf | ||
# Added by OpenShift | ||
${apiLBIPs[0]} ${apiServerURL} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are treating apiLBIPs
as a bash array here. That's fine if it is, but bash arrays always seem like kind of a pain to set up to me. It might be simpler to do this in the template:
${apiLBIPs[0]} ${apiServerURL} | |
{{$apiLBIPs[0]}} ${apiServerURL} |
@@ -0,0 +1,11 @@ | |||
mode: 0755 | |||
path: "/usr/local/bin/update-etc-hosts" | |||
contents: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially a simpler way, avoiding the script and systemd service:
path: "/etc/hosts"
append:
- inline: |
{{ if and (eq .Infra.Status.PlatformStatus.Type "GCP") (.Infra.Status.PlatformStatus.GCP) (.Infra.Status.PlatformStatus.GCP.CloudLoadBalancerConfig) (eq .Infra.Status.PlatformStatus.GCP.CloudLoadBalancerConfig.DNSType "ClusterHosted") }}
{{ $apiIntLBIPs := cloudPlatformAPIIntLoadBalancerIPs . }}
{{ if len $apiIntLBIPs > 0 }}
{{ $apiLBIPs := cloudPlatformAPILoadBalancerIPs . }}
{{ if len $apiLBIPs > 0 }}{{ $apiLBIPs[0] }}{{ else }}{{ $apiIntLBIPs[0] }}{{ end }} {{ .Infra.Status.APIServerURL }}
{{ $apiIntLBIPs[0] }} {{ .Infra.Status.APIServerInternalURL }}
{{ end }}
{{ end }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The systemd service is providing us a way to time the running of this script before kubelet.
To check if feature to run in-cluster DNS on GCP and AWS is enabled by checking if the value of `PlatformStatus.GCP.CloudLoadBalancerConfig.DNSType` is set to `ClusterHosted`.
15c0227
to
3d451a6
Compare
91d44ca
to
1b13a9c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically seems fine, although I am unsure how to test, so let me know if you'd like any QE pre-merge testing on this
@@ -777,6 +778,33 @@ func cloudPlatformIngressLoadBalancerIPs(cfg RenderConfig) (interface{}, error) | |||
} | |||
} | |||
|
|||
// cloudPlatformLBIPAvailable returns true when DNSType is set to `ClusterHosted` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, based on the comment I'd expect some check for clusterhosted in the function. I guess it's implicit since the service enablement is dependent on this field?
(I know we do the same elsewhere in the template rendering, so I'm fine with it as is)
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sadasu, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@sadasu Hi, I just tried to use the latest commit to test the installation of GCP custom DNS. The
|
1b13a9c
to
77c93b5
Compare
Hi @sadasu still something wrong with the syntax of the bash in gcp-update-etc-hosts.service
|
77c93b5
to
171e94a
Compare
@gpei thanks for testing the 2 previous versions. I don't see why the bash script with correct syntax has failures when run within the systemd unit. I have reorganized my code and posted another version. |
@sadasu thanks for the update, with the latest code, we can move one step further now, but the Here are the contents of some key files on the master for your reference:
It looks like two empty lines have been added to file /etc/conf.d/etc-hosts.conf, the value of ${apiLBIP}/${apiServerURL}/${apiIntLBIP}/${apiServerIntURL} were not loaded when running |
fi | ||
mkdir -p /etc/conf.d | ||
etc_hosts_config_filename="/etc/conf.d/etc-hosts.conf" | ||
echo "${apiLBIP} ${apiURL%:*}" >> ${etc_hosts_config_filename} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another issue I'm seeing is, when I manually executed the following command
[root@gpei-0130-gcpdns-hq6q8-master-0 ~]# /usr/local/bin/update-etc-hosts 34.160.207.88 https://api.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com:6443 10.0.0.2 https://api-int.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com:6443
34.160.207.88 https://api.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com
10.0.0.2 https://api-int.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com
Done updating /etc/hosts
It's adding records like
10.0.0.2 https://api-int.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com
to /etc/hosts file, but it can't help on resolve the URL, so looks like we also need to remove the beginning "https://" in addition.
Something like below should work based on my test, just for your reference.
apiHostname=${apiURL#*//}
apiIntHostname=${apiIntURL#*//}
echo "${apiLBIP} ${apiHostname%%:*}" >> ${etc_hosts_config_filename}
echo "${apiIntLBIP} ${apiIntHostname%%:*}" >> ${etc_hosts_config_filename}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I had removed the tailing port number previously. Now updated to also remove the leading https://.
f790dfb
to
30ea55d
Compare
@gpei thanks for you attention to this work. I ended up moving some simple bash code back to templates/common/gcp/units/gcp-update-etc-hosts.service.yaml. My last attempt before I move to the use of "Environment". |
30ea55d
to
b02b304
Compare
/retest-required |
@sadasu Now a new error was raised in the gcp-update-etc-hosts.service
I tried to put the creating /etc/conf.d directory task into ExecStartPre as following, it's working but still nothing was written to "/etc/conf.d/etc-hosts.conf"
Not sure if it's related to the write permission on that directory somehow, but even I specified |
what if we created the file in I am working on another version, that moves all file operations back to templates/common/gcp/files/usr-local-bin-update-etc-hosts.yaml |
Append /etc/hosts files with entries to resolve cluster api and api-int URLS. /etc/hosts will provide resolution for these URLs until kubelet joins the cluster and runs its CoreDNS pod which will then take over resolution of those 2 URLs
Added tests to accomodate GCP in-cluster DNS config
8a70084
to
bd13be9
Compare
@sadasu: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
@sadasu I think it's working as expected this time 🎉 IP address mapping of API/API-INT were both added into /etc/hosts, and masters are joined into the cluster.
|
The new problem I can see now is the workers are trying to fetch worker ignition from the original MCS address "https://api-int.gpei-0201-gcpdns.qe.gcp.devcluster.openshift.com:22623/config/worker" which couldn't be resolved, looks like we'll need to replace the URL to the internal LB IP address? Here's the console Serial of worker:
I believe we can create another bug to track this issue. |
Append /etc/hosts files with entries to resolve cluster api and api-int URLS. /etc/hosts will provide resolution for these URLs until kubelet joins the cluster and runs its CoreDNS pod which will then take over resolution of those 2 URLs
- What I did
- How to verify it
- Description for the changelog