http_server/health: Implement throughput health check #5773

Open
wants to merge 3 commits into base: master

Conversation

@tarruda tarruda commented Jul 22, 2022

Signed-off-by: Thiago Padilha [email protected]

This enhances the /api/v1/health endpoint with an optional throughput check.

hc_throughput enables the check (disabled by default). Note that the other options have no defaults: if hc_throughput is provided, all of them must also be configured.

  • hc_throughput_input_plugins: Comma-separated list of input plugins used to calculate the input rate.
  • hc_throughput_output_plugins: Comma-separated list of output plugins used to calculate the output rate.
  • hc_throughput_ratio_threshold: Ratio of output rate to input rate below which the throughput is considered to have problems.
  • hc_throughput_min_failures: Minimum number of consecutive checks with the ratio below the threshold before the health check returns an error.
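The "no defaults" rule above could be enforced with a check along these lines. This is only a sketch: `struct hc_throughput_opts`, `is_unset` and `hc_throughput_missing_option` are hypothetical names for illustration, not the structures or functions used in this PR, and the option values are assumed to arrive as raw strings that are NULL when not set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the raw option values (NULL means "not set"). */
struct hc_throughput_opts {
    bool enabled;                 /* hc_throughput */
    const char *input_plugins;    /* hc_throughput_input_plugins */
    const char *output_plugins;   /* hc_throughput_output_plugins */
    const char *ratio_threshold;  /* hc_throughput_ratio_threshold */
    const char *min_failures;     /* hc_throughput_min_failures */
};

static bool is_unset(const char *value)
{
    return value == NULL || value[0] == '\0';
}

/*
 * Returns NULL when the options are consistent, otherwise the name of the
 * first option that is missing while hc_throughput is enabled.
 */
const char *hc_throughput_missing_option(const struct hc_throughput_opts *opts)
{
    assert(opts != NULL);

    if (!opts->enabled) {
        /* Check disabled: the other options are not required. */
        return NULL;
    }
    if (is_unset(opts->input_plugins)) {
        return "hc_throughput_input_plugins";
    }
    if (is_unset(opts->output_plugins)) {
        return "hc_throughput_output_plugins";
    }
    if (is_unset(opts->ratio_threshold)) {
        return "hc_throughput_ratio_threshold";
    }
    if (is_unset(opts->min_failures)) {
        return "hc_throughput_min_failures";
    }
    return NULL;
}
```

A caller could log the returned name and refuse to start when it is not NULL.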

Example:

    hc_throughput                  On
    hc_throughput_input_plugins    tail.0
    hc_throughput_output_plugins   http.0
    hc_throughput_ratio_threshold  0.1
    hc_throughput_min_failures     60

In the above example, http.0 and tail.0 are used for the output and input rate calculation, and the health check returns an error if the output/input ratio stays below 0.1 for 60 consecutive checks. Note that checks run once per health check period (configured by hc_period), which has a default value of 1 second, so in this case an error is only returned if the ratio stays below the threshold for 1 minute.
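The consecutive-failure rule described above can be sketched roughly as follows. The names `struct throughput_check` and `throughput_check_update` are hypothetical and not taken from this PR, and treating a period with no input as a passing check is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state for one throughput check (not the PR's actual struct). */
struct throughput_check {
    double ratio_threshold;     /* hc_throughput_ratio_threshold */
    int min_failures;           /* hc_throughput_min_failures */
    int consecutive_failures;   /* checks in a row below the threshold */
};

/*
 * Called once per health check period with the rates measured for that
 * period. Returns true while the check is considered healthy.
 */
bool throughput_check_update(struct throughput_check *check,
                             double input_rate, double output_rate)
{
    bool below = false;

    assert(check != NULL);

    /* Assumption: with no input there is nothing to compare, so pass. */
    if (input_rate > 0.0) {
        below = (output_rate / input_rate) < check->ratio_threshold;
    }

    if (below) {
        /* Saturate at min_failures so the counter cannot overflow. */
        if (check->consecutive_failures < check->min_failures) {
            check->consecutive_failures++;
        }
    }
    else {
        /* A single good check resets the streak. */
        check->consecutive_failures = 0;
    }

    return check->consecutive_failures < check->min_failures;
}
```

With a threshold of 0.1 and 60 minimum failures, the function keeps returning true for the first 59 bad checks and returns false on the 60th. At one check per second that is 60 × 1 s = 1 minute.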

@agup006 @lecaros


Enter [N/A] in the box if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

tarruda pushed a commit to tarruda/fluent-bit-docs that referenced this pull request Jul 22, 2022
Author

tarruda commented Jul 22, 2022

Docs PR: fluent/fluent-bit-docs#850

Contributor

@leonardo-albertovich leonardo-albertovich left a comment

Everything except the missing flb_free calls is only relevant if we plan to upstream this code.

@tarruda tarruda force-pushed the implement-throughput-health-check branch 2 times, most recently from 5860b2e to 2240019 Compare July 22, 2022 15:22
@tarruda tarruda force-pushed the implement-throughput-health-check branch from a9b0cfb to 447be26 Compare July 23, 2022 15:26
@github-actions
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Oct 22, 2022
Use a ring buffer for storing samples as per Leonardo's suggestion.

Signed-off-by: Thiago Padilha <[email protected]>
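For readers unfamiliar with the approach named in the commit message above, a fixed-size ring buffer of samples might look roughly like this. It is only an illustrative sketch, not the code from this PR: `struct sample_ring`, `SAMPLE_CAPACITY` and both functions are hypothetical names, and it assumes each sample is a cumulative record counter taken once per check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_CAPACITY 8   /* hypothetical window size */

/* Fixed-size ring buffer: the newest sample overwrites the oldest. */
struct sample_ring {
    uint64_t samples[SAMPLE_CAPACITY];
    size_t head;    /* index where the next sample is written */
    size_t count;   /* number of valid samples, at most SAMPLE_CAPACITY */
};

void sample_ring_push(struct sample_ring *rb, uint64_t value)
{
    assert(rb != NULL);

    rb->samples[rb->head] = value;
    rb->head = (rb->head + 1) % SAMPLE_CAPACITY;
    if (rb->count < SAMPLE_CAPACITY) {
        rb->count++;
    }
}

/*
 * Average counter increase per check over the stored window.
 * Assumes samples are cumulative counters that never decrease.
 */
double sample_ring_rate(const struct sample_ring *rb)
{
    size_t newest;
    size_t oldest;

    assert(rb != NULL);

    if (rb->count < 2) {
        return 0.0;   /* not enough samples to compute a rate */
    }
    newest = (rb->head + SAMPLE_CAPACITY - 1) % SAMPLE_CAPACITY;
    oldest = (rb->head + SAMPLE_CAPACITY - rb->count) % SAMPLE_CAPACITY;

    return (double) (rb->samples[newest] - rb->samples[oldest]) /
           (double) (rb->count - 1);
}
```

The buffer uses constant memory and needs no allocation per sample, which is the usual reason to prefer it over a growing list for periodic measurements.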
Move struct definitions to header.

Signed-off-by: Thiago Padilha <[email protected]>
@tarruda tarruda force-pushed the implement-throughput-health-check branch from 447be26 to c280b8d Compare December 7, 2022 09:04
@tarruda tarruda temporarily deployed to pr December 7, 2022 09:04 — with GitHub Actions Inactive
Author

tarruda commented Dec 7, 2022

@lecaros rebase done

@tarruda tarruda temporarily deployed to pr December 7, 2022 09:22 — with GitHub Actions Inactive
@patrick-stephens patrick-stephens temporarily deployed to integration December 7, 2022 19:53 — with GitHub Actions Inactive
@patrick-stephens patrick-stephens temporarily deployed to integration December 7, 2022 19:59 — with GitHub Actions Inactive
Contributor

github-actions bot commented Mar 8, 2023

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Mar 8, 2023
@github-actions github-actions bot removed the Stale label Aug 15, 2024
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Nov 27, 2024
@github-actions github-actions bot removed the Stale label Mar 21, 2025
4 participants