Exported services return 502 in a multicluster setup (Linkerd 2.11.1) #7615
-
What is the issue?
A multicluster setup created by following the guide found here has issues when services are exported. None of the exported services can be accessed because requests return a 502 error.
How can it be reproduced?
It can be reproduced on a newly created Linkerd 2.11.1 cluster by following this document.
Logs, error output, etc
The linkerd-proxy used by the gateway logs:
Every time it receives a request from a meshed pod in the target cluster output of
Replies: 7 comments 4 replies
-
The error message
-
Hi @adleong, the request is made from a meshed pod in the
The gateway in the source cluster only logs:
for that given request and rejects it. Regarding the trust root, we are definitely using the same anchor. I checked that again just now to make sure it could be ruled out. Here is the
-
One line that stands out to me in your logs is
-
This is the configuration for my
The service annotation
This is another request, this time to another service, leading to the same result:
-
Hmmmm, I'm having a hard time figuring out how this is possible. Based on the logs, it seems like the destination service is returning an invalid identity (cert-manager-default.kube-system.svc.cluster.local:9402), but it's unclear to me what would cause it to do this. This value should always be populated from the
Furthermore, I can't reproduce this behavior by following the multicluster guide. If you can provide more detailed step-by-step instructions for reproducing this behavior (or, even better, some automated scripts that demonstrate the issue), I will be able to dig in further. Otherwise, I'm not sure how to help.
-
@adleong Thanks a lot for all the help!
@adleong I found what the issue is after much debugging: we had the enableEndpointSlices flag enabled, which led to the error mentioned in this discussion. If you install linkerd2 via Helm with that flag, you should be able to reproduce it without a problem (I used K8s 1.20.X). Disabling it completely fixed the issue.
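
For anyone landing here with the same 502s, the fix above could be expressed as a Helm values override along these lines. This is only a sketch: it assumes the linkerd2 chart exposes the flag as a top-level enableEndpointSlices value, and the file name is hypothetical. Check the values.yaml of the chart version you actually run before relying on it.

```yaml
# values-override.yaml (hypothetical file name)
# Assumption: the linkerd2 Helm chart reads this as a top-level value.
# Setting it to false disables the EndpointSlices-based discovery
# that triggered the 502 responses described in this discussion.
enableEndpointSlices: false
```

A file like this would typically be passed to Helm with the -f option alongside whatever other values the installation already uses.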