Limit mysqld_exporter to at most 1 set of pending queries running per Prometheus scrape interval (merge fix from PMM) #746

Open
ColinDKelley opened this issue Jun 27, 2023 · 0 comments

Contributor

ColinDKelley commented Jun 27, 2023

Host operating system: output of uname -a

Linux <pod> 4.19.0-18-cloud-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64 GNU/Linux

mysqld_exporter version: output of mysqld_exporter --version

0.12.1

MySQL server version

5.7

What did you do that produced an error?

We've periodically had a problem in production where the mysqld_exporter contributes to database overload. Here are the steps for how that happened:

  1. Our production database got bogged down running a very slow query.
  2. Prometheus hit mysqld_exporter at a scrape interval.
  3. The mysqld_exporter queries took longer than the scrape interval to run. So before they could complete, Prometheus scraped again (back to step 2) and started another set of the same queries. These sets of pending queries continued to stack up without bound.

Note: We submitted a fix for this two years ago on the Percona (PMM) fork, and it has been running successfully ever since. Here is that issue and the Percona PR. More recently, one of our projects used the upstream Prometheus version of mysqld_exporter, and after a few months of running it we were bitten by the same unbounded query meltdown.

Here is a Prometheus PR for this issue that backports the PMM fix above.

(Also note, although this ticket is related, it is not a duplicate, since there are many possible causes of query slowdown.)

What did you expect to see?

We expected the first mysqld_exporter query to return results as soon as possible and later attempts to run the same query to return a "429 Too Many Requests" error.
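For illustration, the expected behavior could be sketched roughly as below. This is not the code from the PR or from mysqld_exporter itself; `limitInFlight` and `simulateOverlap` are hypothetical names, and the slow handler only stands in for a scrape that is blocked on MySQL. The sketch assumes a single-slot semaphore is an acceptable way to say "at most one set of pending queries".

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// limitInFlight (hypothetical helper) wraps next so that at most one
// request is served at a time. A request that arrives while another is
// still in flight is rejected immediately with 429 Too Many Requests.
func limitInFlight(next http.Handler) http.Handler {
	sem := make(chan struct{}, 1) // one slot = one scrape in flight
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case sem <- struct{}{}:
			defer func() { <-sem }() // free the slot when the scrape ends
			next.ServeHTTP(w, r)
		default:
			http.Error(w, "scrape already in progress", http.StatusTooManyRequests)
		}
	})
}

// simulateOverlap starts one slow scrape, issues a second scrape while the
// first is still running, and returns both HTTP status codes.
func simulateOverlap() (first, second int) {
	started := make(chan struct{})
	release := make(chan struct{})
	slow := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		close(started) // the "queries" have begun
		<-release      // block, like a query on an overloaded server
		fmt.Fprintln(w, "# metrics would go here")
	})
	h := limitInFlight(slow)

	firstRec := httptest.NewRecorder()
	done := make(chan struct{})
	go func() {
		defer close(done)
		h.ServeHTTP(firstRec, httptest.NewRequest("GET", "/metrics", nil))
	}()

	<-started // the first scrape is now in flight
	secondRec := httptest.NewRecorder()
	h.ServeHTTP(secondRec, httptest.NewRequest("GET", "/metrics", nil))

	close(release) // let the slow scrape finish
	<-done
	return firstRec.Code, secondRec.Code
}

func main() {
	first, second := simulateOverlap()
	fmt.Println(first, second)
}
```

With a wrapper like this, the second scrape fails fast instead of opening more queries against an already overloaded server, and the first scrape still returns its results once it completes.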

What did you see instead?

When the query time exceeded the scrape interval, we saw an unbounded set of mysqld_exporter queries running at the same time. These queries led to a meltdown in which mysqld_exporter eventually ran out of memory and was OOM-killed. In the meantime, the excess mysqld_exporter queries added to the MySQL server overload.
