Skip to content

filter_grep: respect all rules even when a match is found #3345

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

atheriel
Copy link
Contributor

@atheriel atheriel commented Apr 8, 2021

Currently the grep filter will keep data when the first positive match for a rule is found, regardless of whether the data would be excluded by another rule later on. An example is given in #3259.

This leads to surprising behaviour when combining Regex rules with one another or when combining Regex rules with Exclude rules:

  • The order of Regex rules matters, but the order of Exclude rules does not.

  • Mixing Regex and Exclude rules will result in the Exclude rules being ignored if any of the Regex rules match.

This commit changes the existing behaviour so that inputs must pass all rules.

I've also included new runtime tests for this behaviour.

This is a breaking change for users with multiple Regex rules in a single block that rely on the existing short-circuiting behaviour. However, I think the new behaviour is consistent with the existing documentation, so I'm not sure why anyone would be doing this.

Fixes #3259.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • [N/A] Example configuration file for the change
  • [N/A] Debug log output from testing the change
  • [N/A] Attached Valgrind output that shows no leaks or memory corruption was found

Documentation

  • [N/A] Documentation required for this feature

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Currently the filter will keep data when the first positive match for a
rule is found, regardless of whether the data would be excluded by
another rule later on.

This leads to surprising behaviour when combining Regex rules with one
another or when combining Regex rules with Exclude rules:

* The order of Regex rules matters, but the order of Exclude rules does
  not.

* Mixing Regex and Exclude rules will result in the Exclude rules being
  ignored if *any* of the Regex rules match.

This commit changes the existing behaviour so that inputs must
pass *all* rules. It includes runtime tests for this behaviour.

This is a **breaking change** for users with multiple Regex rules in a
single block that rely on the existing short-circuiting behaviour.

Fixes fluent#3259.

Signed-off-by: Aaron Jacobs <[email protected]>
@github-actions
Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label May 14, 2021
@atheriel
Copy link
Contributor Author

Not stale, still fixes an active issue.

@github-actions github-actions bot removed the Stale label May 15, 2021
@github-actions
Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Jun 15, 2021
@atheriel
Copy link
Contributor Author

Still an active issue.

@github-actions github-actions bot removed the Stale label Jun 16, 2021
@nokute78
Copy link
Collaborator

@atheriel Thank you for contribution and sorry for late reply.

Can you add a configuration parameter to select current/this behavior ?
I think it is better to be selectable by user.

I prefer the default is current behavior to prevent breaking change.
However, this new behavior may be better as a default.. Hmm....

@atheriel
Copy link
Contributor Author

@nokute78 I'm happy to add a configuration value if needed, but I'm not sure I understand your comment:

However, this new behavior may be better as a default.. Hmm....

Do you want this to be the default or not? I'd prefer to avoid implementing the configuration option if it will not be used.

@nokute78
Copy link
Collaborator

@atheriel
In my opinion, the configuration parameter is needed to prevent breaking change, so default is current behavior.
On the other hand, this new behavior is proper.
I think a gradual transition is good.

  1. Add configuration parameter and warning message to notify user uses default value and it will be changed in the future.
  2. In future release (e.g. next major version) , change default of configuration parameter. Then, this new behavior (respecting all rules) will be default.

This PR is to support 1.
How about this ?

@github-actions
Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Nov 26, 2024
@github-actions github-actions bot removed the Stale label Mar 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiple Regex entries for grep filter not working
3 participants