malformed metric name __ndows_system_boot_time_timestamp_seconds #1866

Open
cubakid opened this issue Feb 1, 2025 · 7 comments

cubakid commented Feb 1, 2025

Current Behavior

The metric "windows_system_boot_time_timestamp_seconds" appears as "__ndows_system_boot_time_timestamp_seconds".

Expected Behavior

It should be "windows_system_boot_time_timestamp_seconds"

Steps To Reproduce

1. configure "system" collector
2. collect metrics

Environment

  • windows_exporter Version: {branch="HEAD",goarch="amd64",goos="windows",goversion="go1.23.4",revision="d31ce0507c76b4d57ca44fbba93875f38e59842d",tags="trimpath",version="0.30.1"}
  • Windows Server Version: 2012R2

windows_exporter logs

time=2025-02-01T12:26:47.180Z level=INFO source=config.go:79 msg="loading configuration file: C:\\Program Files\\windows_exporter_3\\config.yaml"
time=2025-02-01T12:26:47.181Z level=DEBUG source=main.go:191 msg="logging has Started"
time=2025-02-01T12:26:47.181Z level=DEBUG source=main.go:332 msg="setting process priority to normal"
time=2025-02-01T12:26:47.289Z level=INFO source=main.go:306 msg="Running as [redacted]"
time=2025-02-01T12:26:47.289Z level=INFO source=main.go:223 msg="Enabled collectors: system"
time=2025-02-01T12:26:47.289Z level=INFO source=main.go:241 msg="starting windows_exporter in 109.0028ms" version=0.30.1 branch=HEAD revision=d31ce0507c76b4d57ca44fbba93875f38e59842d goversion=go1.23.4 builddate=20250119-10:39:20 maxprocs=8
time=2025-02-01T12:26:47.290Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9183
time=2025-02-01T12:26:47.290Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9183
time=2025-02-01T12:26:48.847Z level=DEBUG source=collect.go:220 msg="collector system succeeded after 1.1291ms, resulting in 8 metrics"

Anything else?

No response

Member

jkroepke commented Feb 2, 2025

I can't confirm this on my end:

[two screenshots]

I guess some relabel rules on your side are causing this.

Author

cubakid commented Feb 2, 2025

I didn't have any relabel rules, at least until I used one to change "__ndows..." to fix dashboards and alerts.

It occurs only with 0.30.1 (don't have other 0.30.x) and Server 2012R2 but not on the other Windows Server versions.

Specifically this one: {__name__="windows_os_info", build_number="9600", instance="[redacted]:9182", job="windows", major_version="6", minor_version="3", product="Windows Server 2012 R2 Standard", revision="22267", version="6.3.9600"}

Also in browser...

[screenshot]

Member

jkroepke commented Feb 2, 2025

I have no idea how this is possible.

I'm using the Prometheus client libraries and a constant for all metrics. It's strange to me that only one metric is different.

I'm not sure how much effort I will put into this. Windows Server 2012 R2 is in best-effort support mode only. The reason is that the underlying language (Go) requires at least Windows Server 2016. It is possible that some underlying magic is happening here that is outside my control.

Author

cubakid commented Feb 3, 2025

I understand that 2012 is only best effort.

Upgraded to 0.30.2.

I ran Prometheus with debug and this is in the log...
ts=2025-02-03T22:54:59.171Z caller=scrape.go:1426 level=debug component="scrape manager" scrape_pool=windows target=http://[redacted]:9182/metrics msg="Append failed" err="invalid metric name: \u0000\u0000ndows_system_boot_time_timestamp_seconds"

And on a different server I found the same behavior but with different metric...
ts=2025-02-03T22:54:57.204Z caller=scrape.go:1426 level=debug component="scrape manager" scrape_pool=windows target=http://[redacted]:9182/metrics msg="Append failed" err="invalid metric name: \u0000\u0000ndows_net_packets_received_unknown_total"

I tried metric_relabel_configs in Prometheus configs since the browser shows it starting with __ (double underscore) and the Prometheus docs say that relabel is the last stage prior to ingesting and that anything starting with __ will be dropped. It didn't change anything.

So maybe it is "null" and not _ at the start?
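
One way to tell NUL bytes from underscores is to look at the raw bytes of the scrape body rather than at how a browser renders them. A minimal Go sketch, assuming the /metrics response has already been read into a string; the helper name `suspectLines` and the sample payload are hypothetical, not taken from windows_exporter:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// suspectLines returns every line of a metrics payload that contains a
// control byte (below 0x20, other than tab) or DEL. Lines are returned
// quoted so that the offending bytes become visible as escapes.
// Hypothetical helper for illustration only.
func suspectLines(body string) []string {
	var out []string
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSuffix(line, "\r")
		for i := 0; i < len(line); i++ {
			c := line[i]
			if (c < 0x20 && c != '\t') || c == 0x7f {
				out = append(out, strconv.Quote(line))
				break
			}
		}
	}
	return out
}

func main() {
	// Made-up sample payload: the second line starts with two NUL bytes.
	body := "windows_os_info 1\n\x00\x00ndows_system_boot_time_timestamp_seconds 1.7e+09\n"
	for _, l := range suspectLines(body) {
		fmt.Println(l)
	}
}
```

If the output shows `\x00` escapes at the start of the name, the leading characters are NUL bytes and not underscores.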

Author

cubakid commented Feb 4, 2025

It might have something to do with failed collectors.

I noticed the issue on a 2012R2 server running 0.29.2. So I ran windows_exporter interactively with --log.level=debug and saw the following error. I removed the IIS collector and the error went away, as expected, and so did the metric name issue. Combined with the \u0000 in the prior message, it feels like a pointer issue.

ts=2025-02-03T22:49:20.815-05:00 level=debug caller=unmarshal.go(perflib.UnmarshalObject):73 msg="missing counter "% 503 HTTP Response Sent", have [Current File Cache Memory Usage File Cache Hits URI Cache Misses Metadata Cache Misses Metadata Cache Flushes File Cache Hits / sec Output Cache Current Flushed Items Active Threads Count Output Cache Current Memory Usage URI Cache Flushes % 500 HTTP Response Sent WebSocket Connections Rejected / Sec Output Cache Total Flushes Metadata Cache Hits / sec Uri Cache Misses / sec WebSocket Connection Attempts / Sec Total Flushed URIs Current Metadata Cached Output Cache Current Items File Cache Misses Total Metadata Cached Output Cache Total Hits Uri Cache Hits / sec Active Requests _Base WebSocket Connections Accepted / Sec Current Files Cached Requests / Sec Active Flushed Entries Total URIs Cached Total HTTP Requests Served Maximum Threads Count File Cache Flushes File Cache Misses / sec Metadata Cache Misses / sec % 401 HTTP Response Sent Total Files Cached Maximum File Cache Memory Usage Metadata Cache Hits % 403 HTTP Response Sent % 404 HTTP Response Sent Current URIs Cached URI Cache Hits Total Flushed Metadata Output Cache Hits / sec Total Threads Total Flushed Files Output Cache Total Misses Output Cache Total Flushed Items Output Cache Misses / sec WebSocket Active Requests]" collector=iis collector=iis

Member

jkroepke commented Feb 4, 2025

> I noticed the issue on a 2012R2 server running 0.29.2.

This error is a known issue: the old IIS version does not provide all metrics. The issue was fixed in a later version (0.30 branch).

> So maybe it is "null" and not _ at the start?

Yes, there are null bytes in front.
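
This is consistent with the "invalid metric name" error in the Prometheus log. The classic (pre-UTF-8) naming rule in the Prometheus data model documentation is `[a-zA-Z_:][a-zA-Z0-9_:]*`, so a name that really began with two underscores would pass, while one beginning with NUL bytes would not. A small Go sketch of that check using only the standard library; the function name is hypothetical and this is not Prometheus' own validation code:

```go
package main

import (
	"fmt"
	"regexp"
)

// legacyName is the classic metric-name pattern from the Prometheus
// data model documentation, anchored to the whole string.
var legacyName = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// isLegacyMetricName reports whether name satisfies the classic rule.
// Hypothetical helper for illustration only.
func isLegacyMetricName(name string) bool {
	return legacyName.MatchString(name)
}

func main() {
	names := []string{
		"windows_system_boot_time_timestamp_seconds",
		"__ndows_system_boot_time_timestamp_seconds",       // real underscores
		"\x00\x00ndows_system_boot_time_timestamp_seconds", // NUL bytes
	}
	for _, n := range names {
		fmt.Printf("%q valid=%v\n", n, isLegacyMetricName(n))
	}
}
```

Only the third name fails the check, which matches what the scrape log reports.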

You can try something like this in a relabel rule:

.*ndows -> windows

Maybe this helps.


0.29 and 0.30 use the same Go runtime. Maybe 0.28 will help you here.

0.25 also seems like a good option if only the system collector is needed. I guess old Windows versions just need old exporter versions.

Author

cubakid commented Feb 5, 2025

I tried this and no change:

metric_relabel_configs:
  - source_labels: [__name__]
    regex: .*ndows_system_boot_time_timestamp_seconds
    target_label: __name__
    replacement: windows_system_boot_time_timestamp_seconds
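
As a sanity check on the pattern itself: Prometheus documents relabel regexes as RE2 and anchored on both ends, and Go's `regexp` package implements RE2 syntax. The sketch below assumes that anchoring is equivalent to wrapping the pattern in `^(?:...)$` and only tests whether the pattern matches a NUL-prefixed name. It says nothing about whether the sample reaches the relabel step before being rejected. The helper name `anchored` is hypothetical:

```go
package main

import (
	"fmt"
	"regexp"
)

// anchored compiles a pattern so that it must match the whole input.
// The exact wrapping Prometheus applies is an assumption here.
func anchored(pattern string) *regexp.Regexp {
	return regexp.MustCompile("^(?:" + pattern + ")$")
}

func main() {
	re := anchored(`.*ndows_system_boot_time_timestamp_seconds`)
	name := "\x00\x00ndows_system_boot_time_timestamp_seconds"

	// Does the pattern cover the two leading NUL bytes?
	fmt.Println(re.MatchString(name))

	// What the rewritten name would look like if the rule were applied.
	fmt.Println(re.ReplaceAllString(name, "windows_system_boot_time_timestamp_seconds"))
}
```

If the pattern matches here but the rule still has no effect, the name is probably being rejected somewhere other than the regex match, which would need to be confirmed against the Prometheus scrape code.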
