CORS-4072: [Draft] Dual stack support for AWS #9930
base: main
|
@tthvo: This pull request references CORS-4072 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "4.21." or "openshift-4.21.", but it targets "openshift-4.20" instead. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/cc @mtulio Just rough hacks but in case you are interested :D |
|
0e37a46 to
537f4d0
Compare
| infraStack: | ||
| description: |- | ||
| InfraStack indicates the network stack of the cluster infrastructure. | ||
| If left empty, the installer will figure it out from the machineNetwork. |
Did we decide on this behavior, or were we defaulting to IPv4 only?
Oh we haven't decided at all. I just proposed this idea for discussion with the foundation:
- For AWS, we only support IPv4 currently.
- Users might think of specifying the machineNetwork to add IPv6, similar to other platforms. In this case, it will be BYO VPC/subnets for AWS, where the VPC/subnet IPv6 CIDR is already known.
Open for ideas or thoughts on this approach 💭 🙏
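To make the proposal concrete, here is a minimal Go sketch of inferring the stack from the machineNetwork entries. The helper name `inferInfraStack` and the returned strings ("IPv4", "IPv6", "DualStack") are hypothetical and not taken from the PR; the actual field values are still under discussion.

```go
package main

import (
	"fmt"
	"net/netip"
)

// inferInfraStack is a hypothetical helper, not the installer's code.
// It classifies the address families found in the machineNetwork CIDRs.
func inferInfraStack(machineNetwork []string) (string, error) {
	var hasV4, hasV6 bool
	for _, c := range machineNetwork {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return "", fmt.Errorf("invalid machineNetwork CIDR %q: %w", c, err)
		}
		if p.Addr().Is4() {
			hasV4 = true
		} else {
			hasV6 = true
		}
	}
	switch {
	case hasV4 && hasV6:
		return "DualStack", nil
	case hasV6:
		return "IPv6", nil
	case hasV4:
		return "IPv4", nil
	default:
		// Assumption: an empty list is an error here; the installer
		// may instead fall back to a default machine network.
		return "", fmt.Errorf("machineNetwork is empty")
	}
}

func main() {
	for _, nets := range [][]string{
		{"10.0.0.0/16"},
		{"10.0.0.0/16", "2001:db8:1234::/56"},
	} {
		stack, err := inferInfraStack(nets)
		fmt.Println(nets, stack, err)
	}
}
```

With this approach, a full IPI install on AWS would still be inferred as IPv4-only, because the VPC IPv6 CIDR is not known when the install config is written.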
| defaultServiceNetwork = ipnet.MustParseCIDR("172.30.0.0/16") | ||
| defaultIpv6ServiceNetwork = ipnet.MustParseCIDR("fd02::/112") | ||
| defaultClusterNetwork = ipnet.MustParseCIDR("10.128.0.0/14") | ||
| defaultIpv6ClusterNetwork = ipnet.MustParseCIDR("fd01::/48") |
Are the new default service and cluster networks coming from a source, or is this essentially the equivalent of the IPv4 string?
Oh, these are arbitrary ULA IPv6 ranges (RFC 4193). We are still unsure of what default values to use...
For now, I picked these values from official doc: https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/ovn-kubernetes_network_plugin/converting-to-dual-stack
|
/test e2e-aws-default-config e2e-aws-ovn-shared-vpc-custom-security-groups |
|
/retitle CORS-4072: [Draft] Dual stack support for AWS
This PR is for experimenting and collecting info about what changes are needed. I will separate the commits into smaller PRs :D PTAL 🙏 All reviews and nitpicks are appreciated! |
|
@tthvo: This pull request references CORS-4072 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
b1d688a to
52843a3
Compare
|
Hmm 🤔 Some failed jobs are complaining about a missing permission:
level=warning msg=Condition S3BucketCreated has status: "False", reason: "S3BucketCreationFailed",
message: "ensuring bucket lifecycle configuration: creating S3 bucket lifecycle configuration: operation error
S3: PutBucketLifecycleConfiguration, https response error StatusCode: 403, RequestID: REDACTED, HostID: REDACTED,
api error AccessDenied: User: arn:aws:iam::REDACTED:user/ci-op-f2bmbtq3-247a4-minimal-perm-installer
is not authorized to perform: s3:PutLifecycleConfiguration on
resource: \"arn:aws:s3:::openshift-bootstrap-data-ci-op-f2bmbtq3-247a4-9dvdj\" because
no identity-based policy allows the s3:PutLifecycleConfiguration action"
It might be something new added in the latest CAPA that we will need to be aware of 👀 But regular admin-level permissions should be just fine to install dualstack with the PR! |
|
1. Relax validations to allow dualstack IPv6 on AWS
2. Validate subnet IPv6 CIDR if any
3. Configure cloud-config configMap to set NodeFamilies
4. [hack] Set the default ingress controller to use NodePort publish strategy
5. Set cluster Ingress to use NLB when IPv6 is enabled (optional)
6. Add a custom DNS controller manifest to configure IPv6 nameserver
The install config in the cluster-config ConfigMap needs to have the IPv6 CIDR of the VPC in the case of full IPI.
…-network-server The commit ensures all service networks are considered (i.e. that is all IP families) when generating the certificate kube-apiserver-service-network-server.
FIXME: we should use the VPC CIDR as the source CIDRs. But the IPv6 CIDR is not yet known at install time. We should edit the awscluster after infraReady to add the VPC IPv6 CIDR as source instead.
FIXME: CCM needs to handle this
This applies to dualstack installation only.
IPv4-primary: IPv4 Target Group
IPv6-primary: IPv6 Target Group
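The commit note above about considering all service networks for the kube-apiserver-service-network-server certificate can be sketched as follows. This is an illustration under an assumption, not the installer's implementation: it assumes the certificate needs the first host address of each service network (the address the default kubernetes Service normally receives). The helper `serviceIPs` is hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
)

// serviceIPs (hypothetical helper) returns the first host address of
// every service network, so that each IP family contributes one entry.
func serviceIPs(serviceNetworks []string) ([]netip.Addr, error) {
	ips := make([]netip.Addr, 0, len(serviceNetworks))
	for _, c := range serviceNetworks {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("invalid service network %q: %w", c, err)
		}
		// Masked() zeroes the host bits; Next() moves to the first host address.
		ips = append(ips, p.Masked().Addr().Next())
	}
	return ips, nil
}

func main() {
	// Default service networks taken from the diff in this PR.
	ips, err := serviceIPs([]string{"172.30.0.0/16", "fd02::/112"})
	if err != nil {
		panic(err)
	}
	for _, ip := range ips {
		fmt.Println(ip)
	}
}
```

Iterating over the whole list, rather than reading only the first entry, is the point of the change: a dual-stack cluster gets one address per family regardless of which family is primary.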
|
|
The rebase is to stay on top of |
|
@tthvo: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
I rebuilt another release image: quay.io/thvo/origin-release:v4.21.0-preview-1. This includes the changes for openshift/cluster-network-operator#2804 instead of my own hack tthvo/cluster-network-operator@617e05f. If you'd like to use the new custom release image, you need to set the techpreview feature set: featureSet: TechPreviewNoUpgrade |
|
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Important
A rough draft of installer changes required to support dual-stack environment on AWS.
This PR is only for previewing the changes and experimenting with upstream CAPA PR. I will close this and open another PR with finalized sets of changes.
This PR also includes commits (message starting with hack:) to "imitate" CCM and Cluster Ingress Operator to create necessary resources for cluster ingress (i.e. NLB, Route53 records, Security Groups, etc). These commits are to be removed, assuming AWS CCM supports dualstack LB later on.
This depends on upstream CAPA PR: kubernetes-sigs/cluster-api-provider-aws#5603 (not finalized yet).
How to install
Below are the details of how to reproduce the installation.
Custom release image
Custom release image: quay.io/thvo/origin-release:v4.21.0-preview.
This includes the following operator changes:
For the cluster-network-operator, we have the open PR here with feature gate checking: openshift/cluster-network-operator/pull/2804
Install Config
Use the install-config snippet below to configure networking and the AWS platform.
Note: machineNetwork does not contain an IPv6 CIDR as it is unknown at install time (i.e. it will be patched later when the infra is ready). The cluster network and service network contain ULA IPv6 CIDRs.
IPv4 Primary:
IPv6 Primary:
Important notes: [IPv6-primary only] The ingress operator will be stuck because health checks on the targets are failing: the k8s Service for the ingress routers only has an IPv6 cluster IP. The hacks only configure the ingress LB target group as IPv4, thus the connection cannot switch to IPv6 when travelling internally.
You must edit the service openshift-ingress/router-nodeport-default to set its ipFamilyPolicy to PreferDualStack. For example:
$ kubectl -n openshift-ingress patch svc router-nodeport-default \
    -p '{"spec":{"ipFamilyPolicy":"PreferDualStack"}}'
Installer binary
It looks like the installer binary can be built from these commits despite a reference to my local CAPA fork. So, just:
/hold
/label platform/aws