/metrics and /api_metrics endpoint does not show the generic API metrics for the director's endpoints #2467

Open
Malsourie opened this issue Aug 31, 2023 · 5 comments

Comments

@Malsourie
Contributor

Describe the bug
According to the documentation, the /metrics and /api_metrics endpoints expose the generic API metrics for the director's endpoints, including number of requests and response time. But when we call those endpoints:

  1. /metrics only exposes the metrics for the /metrics and /api_metrics endpoints
  2. /api_metrics only returns OK

To Reproduce

  1. Deploy a bosh director
  2. curl some bosh endpoints, e.g. vms, deployments
  3. curl http://<bosh_ip>:9092/metrics
  4. curl http://<bosh_ip>:9092/api_metrics

Expected behavior
The endpoints should return metrics for the bosh endpoints.
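
To make the check in steps 3 and 4 less manual, a small helper can list which request paths show up in a scraped response. This is a minimal Python sketch and not part of bosh. It assumes the request metrics carry a `path` label, which is how the Rack collector middleware in the Ruby prometheus-client labels requests as far as I know. `paths_seen` is a hypothetical name and the sample input is made up for illustration.

```python
import re

# Matches one labelled sample line of the Prometheus text format, e.g.
#   http_server_requests_total{code="200",method="get",path="/metrics"} 3.0
# Simplification: label values containing "}" are not handled.
SAMPLE = re.compile(r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+\S+')
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def paths_seen(exposition, label="path"):
    """Return the distinct values of `label` found in a /metrics response body."""
    found = set()
    for line in exposition.splitlines():
        if line.startswith("#"):  # skip HELP / TYPE comment lines
            continue
        match = SAMPLE.match(line)
        if not match:  # unlabelled samples and blank lines
            continue
        for key, value in LABEL.findall(match.group("labels")):
            if key == label:
                found.add(value)
    return found


if __name__ == "__main__":
    # Made-up sample resembling the behaviour described in this issue.
    sample = (
        '# TYPE http_server_requests_total counter\n'
        'http_server_requests_total{code="200",method="get",path="/metrics"} 3.0\n'
        'http_server_requests_total{code="200",method="get",path="/api_metrics"} 1.0\n'
    )
    print(sorted(paths_seen(sample)))
```

If the output of `curl http://<bosh_ip>:9092/metrics` is fed into `paths_seen` and only /metrics and /api_metrics come back, the director's own endpoints (deployments, vms, and so on) are not being counted by the process that answered the scrape.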

@mvach
Contributor

mvach commented Sep 6, 2023

Hi,
as a showcase, I implemented a tiny puma and prometheus-client integration that serves the expected web server access metrics: https://github.com/mvach/PumaMetricsExample.

Sadly, I don't yet see how it differs from the current director implementation.

@beyhan
Member

beyhan commented Sep 7, 2023

I'm not sure how the generic API metrics are supposed to work at all because:

  • As defined in the director job, the metrics server is started as a separate process. You can also check this on a director VM by running netstat -tulpn | grep 9092 and then ps -aux | grep <pid-from-previous-command>.
  • bosh-director-metrics-server starts and registers the Prometheus collector for itself
  • the metrics_collector collects only the bosh metrics and no generic API metrics.

You only see metrics for the /metrics endpoint because it is the only endpoint you call on the metrics server. Maybe I'm missing something here, but this is my current understanding.

@mvach
Contributor

mvach commented Sep 7, 2023

:-) @beyhan,
I noticed that just now as well and was about to update the issue.

@jpalermo
Member

I was able to get api metrics from /api_metrics. I did have to enable the metrics and I think I got the OK response when the metrics were NOT enabled, so that might have been part of the problem.

The /api_metrics endpoint does map to the director web process, so in theory it would have access to this data. However, I'm not sure how accurate the data is. The transition from thin to puma might have introduced separate datasets for each of the forked processes puma creates. So it does return data, but I didn't have a chance yet to verify that the data is actually correct.
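
The concern about forked processes can be illustrated outside of puma. The sketch below is a generic Python demonstration (Unix only, since it uses `os.fork`), not the director's Ruby code. A plain dict stands in for an in-memory metrics registry, and `count_in_forked_workers` is a hypothetical name. Whether the director's puma actually runs several forked workers is not confirmed in this thread.

```python
import os


def count_in_forked_workers(requests_per_worker):
    """Fork one child per entry; each child bumps its own copy of a counter.

    Returns (per_worker_totals, parent_total).
    """
    counter = {"requests": 0}  # process-local state, like an in-memory registry
    children = []
    for n in requests_per_worker:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: "handle" n requests, then report what this process counted.
            os.close(read_fd)
            for _ in range(n):
                counter["requests"] += 1
            os.write(write_fd, str(counter["requests"]).encode())
            os._exit(0)  # leave immediately, skipping parent cleanup handlers
        os.close(write_fd)
        children.append((pid, read_fd))

    totals = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as reader:
            totals.append(int(reader.read()))
        os.waitpid(pid, 0)
    # The parent's copy was never touched by the children.
    return totals, counter["requests"]


if __name__ == "__main__":
    print(count_in_forked_workers([3, 5]))  # ([3, 5], 0)
```

Each worker only knows about the requests it served itself, so a scrape answered by one worker would under-report unless the counts are aggregated across processes. The Ruby prometheus-client ships a file-based data store (DirectFileStore) intended for pre-fork servers. Whether the director uses it is not stated in this thread.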

@beyhan
Member

beyhan commented Sep 22, 2023

Good catch @jpalermo. This is the commit which introduced the change. It looks like /api_metrics was initially called /director_metrics, but yes, it redirects to the director process, which I missed.

> The transition from thin to puma might have introduced separate datasets for each of the forked processes puma creates. So it does return data, but I didn't have a chance yet to verify that the data is actually correct.

This is a good question. It looks to me like they should be accurate, because the director's internal metrics are gathered from the DB data, but you "never know" :-)

Status: Waiting for Changes | Open for Contribution