@rodnymolina
Member

Fixed a few failing test cases (mainly related to old K8s releases) and created a new CI workflow for PR testing.

Changes:

  - Pin Flannel to v0.25.1 for compatibility with K8s 1.32 and containerd
  - Add coredns_fix_loop() function to prevent DNS forwarding loops
  - Configure CoreDNS to use external DNS (8.8.8.8) instead of /etc/resolv.conf
  - Automatically apply CoreDNS fix (its configmap) during cluster initialization

The above fixes the CoreDNS loop issue that occurs in Kubernetes-in-Docker setups. Here's the explanation:

  1. CoreDNS Configuration: By default, CoreDNS (the DNS server for Kubernetes) is configured to forward DNS queries it can't resolve to the nameservers listed in /etc/resolv.conf.
  2. Container Nesting: In our setup, we have:
    - Host Linux machine
    - K8s node container (running inside a privileged container with Docker+Sysbox)
    - Pods inside the K8s node (running inside the K8s node container)
  3. The Loop: Inside the K8s node container, /etc/resolv.conf points back to 127.0.0.1 (localhost) or to the container's own IP address. This creates a circular reference:
    - Pod needs to resolve DNS → asks CoreDNS
    - CoreDNS can't resolve → forwards to /etc/resolv.conf
    - /etc/resolv.conf points to 127.0.0.1 → goes back to CoreDNS
    - Infinite loop detected!
  4. CoreDNS Loop Detection: CoreDNS has a "loop" plugin that detects this circular forwarding and crashes the pod with a FATAL error to prevent infinite loops: ```[FATAL] plugin/loop: Loop (127.0.0.1:41329 -> :53) detected```
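
A quick way to check for the condition in step 3 is to look for a loopback nameserver in the node container's resolv.conf. The sketch below is illustrative only: `resolv_has_loopback` is a hypothetical helper, not part of kindbox, and it covers only the loopback case (127.x.y.z or ::1), not the case where the file points at the container's own IP address.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of kindbox): succeed if the given
# resolv.conf lists a loopback nameserver, the condition that makes
# CoreDNS forward queries back to itself.
resolv_has_loopback() {
  local file="${1:-/etc/resolv.conf}"
  # Match "nameserver 127.x.y.z" or "nameserver ::1"; commented-out
  # lines do not match because the pattern is anchored at line start.
  grep -Eq '^[[:space:]]*nameserver[[:space:]]+(127\.[0-9]+\.[0-9]+\.[0-9]+|::1)[[:space:]]*$' "$file"
}

# Example usage (run inside the K8s node container):
#   if resolv_has_loopback /etc/resolv.conf; then
#     echo "loopback nameserver found; CoreDNS may hit plugin/loop"
#   fi
```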

  The fix is to change the CoreDNS forwarding target from /etc/resolv.conf to external DNS servers (8.8.8.8, 8.8.4.4).
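
For illustration, the Corefile change can be sketched as a text rewrite of the `forward` line. This is a minimal sketch and not necessarily how `coredns_fix_loop()` is implemented in this PR: `corefile_use_external_dns` is a hypothetical name, and the sketch assumes the kubeadm-style Corefile where the line reads `forward . /etc/resolv.conf`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (an assumption, not the PR's actual code):
# read a Corefile on stdin and write it to stdout with the forward
# target switched from /etc/resolv.conf to external DNS servers.
corefile_use_external_dns() {
  local servers="${1:-8.8.8.8 8.8.4.4}"
  # Only touch lines that start with the "forward" directive.
  sed -E "/^[[:space:]]*forward[[:space:]]/ s#/etc/resolv\.conf#${servers}#"
}

# Example usage against a live cluster (assumes the ConfigMap is named
# "coredns" in kube-system and stores its config under the "Corefile" key):
#   kubectl -n kube-system get configmap coredns \
#     -o jsonpath='{.data.Corefile}' | corefile_use_external_dns > Corefile
#   kubectl -n kube-system create configmap coredns \
#     --from-file=Corefile=Corefile --dry-run=client -o yaml | kubectl apply -f -
#   kubectl -n kube-system rollout restart deployment coredns
```

Keeping the rewrite as a stdin-to-stdout filter makes it easy to check against a sample Corefile before touching the cluster.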

Signed-off-by: Rodny Molina <[email protected]>
The new CI workflow provides automated validation of code changes before merge, using the same containerized test environment as our local development.

Member

@ctalledo left a comment

LGTM, but have a question on the need for the change in kindbox.

fi
}

function coredns_fix_loop() {

Wondering why this is needed, given that sysbox-runc has code to modify the /etc/resolv.conf file inside the Docker container, such that it points to real nameservers (see here).

IOW, if things are working right, sysbox should have modified the DNS server addr in the K8s node container from Docker's internal DNS -> container's default route.
