MDEV-34836: TOI on parent table must BF abort SR in progress on a child #3534
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Applied SR transaction on the child table was not BF aborted by TOI running on the parent table for several reasons:
Although SR correctly collected FK-referenced keys to parent, TOI in Galera disregards common certification index and simply sets itself to depend on the latest certified write set seqno.
Since this write set was the fragment of SR transaction, TOI was allowed to run in parallel with SR presuming it would BF abort the latter.
At the same time, DML transactions in the server don't grab MDL locks on FK-referenced tables, thus parent table wasn't protected by an MDL lock from SR and it couldn't provoke MDL lock conflict for TOI to BF abort SR transaction. In InnoDB, DDL transactions grab shared MDL locks on child tables, which is not enough to trigger MDL conflict in Galera.
InnoDB-level Wsrep patch didn't contain correct conflict resolution logic due to the fact that it was believed MDL locking should always produce conflicts correctly.
The fix brings conflict resolution rules similar to MDL-level checks to InnoDB, thus accounting for the problematic case.
Apart from that, wsrep_thd_is_SR() is patched to return true only for executing SR transactions. It should be safe as any other SR state is either the same as for any single write set (thus making the two logically equivalent), or it reflects an SR transaction as being aborting or prepared, which is handled separately in BF-aborting logic, and for regular execution path it should not matter at all.
Release Notes
How can this PR be tested?
MTR test is provided.
Basing the PR against the correct MariaDB version
main
branch.PR quality check