Troubleshooting Windows Exporter on EKS: Metrics Missing for Exporter Container and Callback/Connection Errors in Logs #1861
Comments
Hi,
Yes, that's correct. windows_exporter runs as a HostProcess container so that it can access the host's context. HostProcess means the process is simply started as a normal process on the host, the way the service manager would start it. As a consequence, the Host Compute Service (HCS) does not offer metrics for that kind of process. This can be mitigated by using the process collector.
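As a rough illustration of the process collector workaround, the sketch below sums CPU seconds per process from a scrape in the Prometheus text format. It assumes the process collector exposes a counter named windows_process_cpu_time_total with a process label; please verify the metric and label names against your exporter version. The sample values are made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical scrape output; names follow the assumed process collector
# naming and the values are invented for illustration only.
SAMPLE = """\
windows_process_cpu_time_total{mode="user",process="windows_exporter",process_id="4242"} 12.5
windows_process_cpu_time_total{mode="privileged",process="windows_exporter",process_id="4242"} 3.5
windows_process_cpu_time_total{mode="user",process="svchost",process_id="900"} 40.0
"""

# Matches lines of the form: name{label="value",...} number
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def cpu_seconds_by_process(text, metric="windows_process_cpu_time_total"):
    """Sum the given counter over all modes, grouped by the 'process' label."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match or match.group("name") != metric:
            continue  # skip comments, blank lines and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        totals[labels.get("process", "")] += float(match.group("value"))
    return dict(totals)
```

Calling cpu_seconds_by_process(SAMPLE) gives one total per process name, so the exporter's own usage can be read from the "windows_exporter" key even though no windows_container_* series exists for it.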
It seems that the request takes longer than the configured Prometheus scrape timeout. Prometheus aborted the request while windows_exporter was still trying to write data to the connection.
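To make the timeout scenario concrete, here is a minimal, self-contained simulation. It is not windows_exporter or Prometheus code: SlowMetricsHandler is a hypothetical stand-in for a slow /metrics endpoint, and scrape plays the role of a client that gives up after its timeout. The 0.5 second delay is an arbitrary assumption.

```python
import http.server
import threading
import time
import urllib.request


class SlowMetricsHandler(http.server.BaseHTTPRequestHandler):
    """Stand-in for a slow /metrics endpoint (not the real exporter)."""

    delay_seconds = 0.5  # hypothetical time spent collecting metrics

    def do_GET(self):
        time.sleep(self.delay_seconds)  # simulate slow collectors
        try:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"demo_metric 1\n")
        except ConnectionError:
            # The client already gave up, so the write fails on the server side.
            pass

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_slow_endpoint():
    """Start the demo server on a free local port and return it."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), SlowMetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def scrape(url, timeout_seconds):
    """Return True if the scrape finished within the timeout, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            response.read()
        return True
    except OSError:  # covers socket timeouts and urllib.error.URLError
        return False
```

With the endpoint started via start_slow_endpoint(), a scrape with a timeout shorter than the delay returns False while the handler is still working, and a scrape with a generous timeout returns True. Comparing the real scrape duration of the exporter with the configured scrape timeout follows the same idea.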
The service collector was known to be slow. This can be mitigated in 0.26 by using
These messages are unknown to me. They seem to come directly from https://github.com/microsoft/hcsshim/blob/8d81359dc374e39d9edd63639a0402fbbea694f9/internal/hcs/waithelper.go#L37, since the log format is different. This could be a follow-up error: if Prometheus cancels the request, other contexts within the exporter are cancelled as well, which may lead to that situation. Action items:
There are tons of performance fixes in 0.29 and 0.30 which may resolve the issues as well.
Thanks a lot @jkroepke for taking the time to give this thorough response, I really appreciate it. As for the errors found in the logs, I've updated the chart to the latest version
The So I think that we can close this one, and maybe I can try to help with some debugging info if it's useful for the project.
Technically it's a bug, and all bugs should be addressed. I might have to raise an upstream issue for this. I also need the version of the container; in your case, I need the AMI of your Windows nodes. All other information can be found here: https://docs.aws.amazon.com/eks/latest/userguide/eks-ami-versions-windows.html
Do you have metrics like
Hi @jkroepke
Yes, I can confirm that
Problem Statement
I apologize in advance if this request seems confused; I'll do my best to explain the problem I'm facing. There are actually two "problems"; the quotation marks are deliberate, since this could very well be a wrong interpretation of what the exporter is doing, or a misconfiguration on my side. On top of that, I'm by no means an expert on Windows operating systems.
That said, I've got a Windows exporter installation on an EKS cluster. There are two things that I do not understand:
1. I can correctly get metrics from all running containers except the Windows exporter itself, for example by using the windows_container_cpu_usage_seconds_total metric. Is that correct? Is it due to the fact that the exporter is running as a Windows process?
2. I've got a lot of unclear errors in the logs, and I do not know what they mean:
Can anyone help? Unfortunately, I don't clearly understand what is going on.
Updating to a newer version of the exporter is not straightforward, but I could do so if that is the best way forward.
Thanks
Environment