
MDEV-34986: storage/innobase/dict/dict0dict: add a RAII class to freeze a dict_sys_t #3529

Open
wants to merge 2 commits into main
Conversation

@MaxKellermann (Contributor) commented Sep 19, 2024

https://jira.mariadb.org/browse/MDEV-34986

Description

This replaces a lot of manual freeze() and unfreeze() calls. Doing it with RAII is safer and easier.

I did not replace all freeze()/unfreeze() pairs because some callers unfreeze and re-freeze in the middle of a scope. Sometimes a new scope can be added just for such a RAII object. That refactoring can be done later.
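To make the idea concrete, here is a minimal sketch of what such a freeze guard could look like, assuming the existing dict_sys_t::freeze()/unfreeze() interface and the SRW_LOCK_ARGS/SRW_LOCK_CALL convention for the instrumentation-only file/line parameters; the class name and details are illustrative, not the exact code in this PR:

  /* RAII guard: freezes (shared-locks) a dict_sys_t for the current scope. */
  class dict_sys_freeze
  {
    dict_sys_t &m_dict;  /* the dictionary instance passed by the caller */
  public:
    explicit dict_sys_freeze(dict_sys_t &dict
                             SRW_LOCK_ARGS2(const char *file, unsigned line))
      : m_dict(dict)
    { m_dict.freeze(SRW_LOCK_ARGS(file, line)); }
    ~dict_sys_freeze() { m_dict.unfreeze(); }
    /* non-copyable: the guard owns exactly one freeze()/unfreeze() pair */
    dict_sys_freeze(const dict_sys_freeze &)= delete;
    dict_sys_freeze &operator=(const dict_sys_freeze &)= delete;
  };

  /* usage: */
  {
    dict_sys_freeze freeze{dict_sys SRW_LOCK_CALL2};
    /* ... read-only access to the dictionary cache ... */
  }  /* unfreeze() runs here, even on an early return or an exception */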

Notes:

  • Instead of using the global variable dict_sys, I decided to pass a
    dict_sys_t reference parameter, because I believe it will be
    necessary to eliminate that global variable eventually (in order to
    have a per-catalog instance). Hard-coding this global variable here
    would generate identical (not better) machine code and would be a
    step in the wrong direction.

  • The new macros SRW_LOCK_ARGS2 and SRW_LOCK_CALL2 were necessary
    because this is the first time those debug-only parameters are
    forwarded from a function that has more parameters (i.e. the
    dict_sys_t reference).

Release Notes

Nothing. Internal code change only with no runtime effect.

How can this PR be tested?

No runtime effect.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@CLAassistant commented Sep 19, 2024

CLA assistant check
All committers have signed the CLA.

Commit message excerpt: These macros (SRW_LOCK_ARGS2 and SRW_LOCK_CALL2) shall be used if there are other arguments. Unlike the old variants, they start with a comma.
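For illustration, one way such comma-prefixed forwarding macros could be defined, mirroring the existing SRW_LOCK_ARGS/SRW_LOCK_CALL pair; the preprocessor guard shown here is an assumption, not taken from the patch:

  #ifdef UNIV_PFS_RWLOCK
  # define SRW_LOCK_ARGS2(file, line) , file, line
  # define SRW_LOCK_CALL2             , __FILE__, __LINE__
  #else
  # define SRW_LOCK_ARGS2(file, line)
  # define SRW_LOCK_CALL2
  #endif

With the leading comma, the instrumentation-only file/line parameters can be appended to a parameter list that already contains the dict_sys_t reference, and they vanish cleanly when the instrumentation is compiled out.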
@MaxKellermann (Contributor, Author)

CI failures due to a dumb typo; I force-pushed an amended version.

@dr-m self-assigned this Sep 20, 2024
@dr-m (Contributor) commented Sep 20, 2024

Thank you for your contribution. It was great to hear your related talk in Berlin last Tuesday. It is a pity that we did not have a chance to talk after that. I did not have a chance to check the related iovisor/bcc#5108 yet.

With @montywi’s current (not yet published) implementation of catalogs, there will continue to be a single global InnoDB infrastructure, which would include the singleton dict_sys object for which I removed pointer indirection in 5fd7502.

For catalogs or multi-tenancy to make any sense, I think that at the minimum, the buffer pool must be shared. I understood that implementations in other RDBMS may have a per-tenant file structure, such as a separate write-ahead log, transaction metadata and data dictionary for each tenant. That kind of solution would require pushing down a catalog or tenant ID to the lower levels, such as the buffer pool and mini-transactions. It could incur a considerable overhead for the case that there is only a single tenant or catalog. It would also be more complex to implement than what @montywi has currently shared with your organization.

Long story short, I think that the singleton dict_sys object is there to stay.

The bug that was featured in your talk would not have been prevented by this refactoring. The innodb_sync_debug that was removed in db006a9 and ff5d306 might have. 5f2dcd1 improved the situation a little, by allowing us to keep track of all srw_lock::rd_lock() holders and to add debug assertions that a particular latch is not being held in shared (or any) mode by the current thread.

The fix b2654ba moves some logic that used to be duplicated into a non-inline function lock_table_children(). The problem was not that we'd forget to release a latch; the problem was that we were holding a latch while entering a wait for a transactional lock, which may block for a very long time (50 seconds by default), even indefinitely. All this because FOREIGN KEY constraints are being handled at too low a level, or because there are two metadata caches that are covered by slightly different locks: TABLE_SHARE and dict_table_t.

I think that there can be some benefit of having RAII around synchronization primitives. We are using it in b08448d and a variant of it in mtr_t::m_memo. But I am not fully convinced about the usefulness of this change. If we went for it, I would like to assess if it has any impact on the generated code.

@MaxKellermann (Contributor, Author)

The bug that was featured in your talk would not have been prevented by this refactoring.

Of course not! This is just a source-code level change with no runtime effect. It helps keep the code readable; and of course RAII is safer and less prone to human mistakes.

While MariaDB is formally written in C++, it could go much further in taking advantage of C++'s features to make it safer. When I read the code to try to understand the dict_sys.latch problem last week, I kept wondering why it was written that way (so many raw pointers, so much manual resource/lock management): too error-prone for my taste.
This PR is a tiny step in that direction; my suggestion is to use more of C++.

@MaxKellermann (Contributor, Author)

If we went for it, I would like to assess if it has any impact on the generated code.

The generated code is mostly identical. In a perfect world, all of this would collapse to exactly identical machine code; but there are compiler imperfections.

For example, GCC is a bit clumsy about the trivial method ha_innobase::referenced_by_foreign_key() and adds one extra instruction for no obvious reason.

Other functions may grow larger because RAII gives you stronger guarantees; i.e. the generated code is a bit longer, but actually better than before. RAII guarantees that the destructor will be called even if an exception was thrown. That requires a tiny bit of extra code for stack unwinding.

People have different opinions about C++ exceptions. For my own code, I'm more than willing to pay the tiny price for this stack unwinding code (with overhead that is hardly even measurable). But the fact is that MariaDB does not disable C++ exceptions, therefore you are already paying that price. Exceptions are indeed used in very few places, but 99% of the code lacks noexcept declarations, which forces the compiler to make everything stack-unwinding-safe, even where that is never necessary. (Of course, adding noexcept has disadvantages of its own. Oh yes, that perfect world, I'd like to have it.)
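A small illustration of that point (not MariaDB code): a local object with a non-trivial destructor forces the compiler to emit a cleanup path whenever it calls something that might throw, and noexcept on the callee removes that obligation.

  struct guard { ~guard(); };       // non-trivial destructor, like a RAII lock guard
  void may_throw();                 // not noexcept: the compiler must assume it can throw
  void cannot_throw() noexcept;     // noexcept: no exceptional path to cover

  void with_unwind()    { guard g; may_throw(); }    // needs a cleanup path to run ~guard() on unwind
  void without_unwind() { guard g; cannot_throw(); } // only the normal path needs to call ~guard()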

My point is: the code is not 100% identical, but it is either

  • better, due to stronger guarantees (that may have actual runtime effects, because you now get exception-safety for free), or
  • slightly larger, with negligible overhead, due to compiler clumsiness.

@MaxKellermann (Contributor, Author)

With this PR, the executable grows by 572 bytes.

But if I apply #3531 first, then this PR adds only 256 bytes, because the compiler can then omit some of the unnecessary stack-unwinding code.

@dr-m (Contributor) commented Sep 20, 2024

Would the executable size grow less if you did not add any data member to your RAII guard, but instead made it refer directly to the singleton dict_sys, similar to how LockMutexGuard and LockGuard are referring to lock_sys.latch?

@MaxKellermann (Contributor, Author) commented Sep 20, 2024

Would the executable size grow less if you did not add any data member to your RAII guard, but instead made it refer directly to the singleton dict_sys, similar to how LockMutexGuard and LockGuard are referring to lock_sys.latch?

In all cases I analyzed in the disassembly, this reference was optimized away by GCC, because GCC knew it was guaranteed to always point to the global variable.

There could be optimization pitfalls in other, similar cases, though: if this were not a global object but a global pointer to an object, GCC would not be allowed to optimize the reference away, because the pointer value could change in the meantime. But even in that case the generated code would still be better, because GCC would (likely) cache the pointer in a register instead of having to access RAM again to reload it.

Edit: I just verified - the binary is exactly identical whether I use the global dict_sys variable or store a dict_sys_t reference in the RAII class.
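For reference, the two variants under comparison look roughly like this (names hypothetical; as noted, GCC emits identical code for both because the reference is known to bind to the global dict_sys):

  /* Variant A: the guard stores the dict_sys_t reference passed by the caller. */
  class freeze_guard_ref
  {
    dict_sys_t &m_dict;
  public:
    explicit freeze_guard_ref(dict_sys_t &d) : m_dict(d) { m_dict.freeze(SRW_LOCK_CALL); }
    ~freeze_guard_ref() { m_dict.unfreeze(); }
  };

  /* Variant B: the guard refers to the global singleton directly,
     similar to how LockMutexGuard refers to lock_sys.latch. */
  class freeze_guard_global
  {
  public:
    freeze_guard_global() { dict_sys.freeze(SRW_LOCK_CALL); }
    ~freeze_guard_global() { dict_sys.unfreeze(); }
  };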

@dr-m (Contributor) commented Sep 20, 2024

Edit: I just verified - the binary is exactly identical whether I use the global dict_sys variable or store a dict_sys_t reference in the RAII class.

Thank you. With clang (which is the compiler on FreeBSD and macOS) it might be different, or maybe they have improved in recent releases.

By the way, while the dict_sys.latch is global, it is being used much less than its precursors dict_sys.mutex and dict_operation_lock used to be. In most cases, table metadata is being protected by reference counts or metadata locks (MDL, a lock on the name). Similarly, InnoDB data dictionary tables are being protected by transactional table locks.

The purpose of MDL or the old use of "global MDL" as in dict_sys.latch is to prevent a race condition between DDL and any concurrent operations on the table. Because the layer above InnoDB is largely ignorant about FOREIGN KEY, InnoDB is forced to acquire dict_sys.latch. Things could be much cleaner if the metadata storage was moved closer to the MDL logic, but implementing that is far from trivial.

@MaxKellermann changed the title from "storage/innobase/dict/dict0dict: add a RAII class to freeze a dict_sys_t" to "MDEV-34986: storage/innobase/dict/dict0dict: add a RAII class to freeze a dict_sys_t" on Sep 23, 2024
@MaxKellermann (Contributor, Author)

I added a JIRA ticket. Do you want me to rebase this PR on a different branch?

@cvicentiu added the External Contribution label Nov 26, 2024
@svoj (Contributor) commented Dec 3, 2024

FWIW...

Apparently, a while ago I wrote to Marko:

I have mixed feelings about the RAII approach in mutex locking; I hope it won't expand further.

I was asked to elaborate on this quote here.

There certainly are good things that RAII locking has to offer:

  1. it lets you not bother with unlocking the primitive
  2. consequently, it saves one line of code that unlocks the primitive
  3. and it probably saves a few more lines on handling the variable that stores the result of the critical section

So here are the problems that made me come to this conclusion back then...

  1. they encourage the bad habit of not controlling the critical section size, which may cause deadlocks (e.g. if it accidentally spans another primitive's lock) and worsen scalability
  2. yes, in theory this can be avoided by adding a code block {}, which (I think) doesn't make the code prettier; and should you need to unlock the mutex conditionally, the only option you have is a goto out of the block
  3. should I need to find all primitive locks in a set of files, I can't just grep for e.g. rd_lock; I have to cover all the wrappers (and yes, I consider dict_sys.freeze() another evil)
  4. should I need to find all dict_sys.freeze() calls, I can't just grep for it; I'll have to find out that it has another wrapper and expand the expression accordingly
  5. also it may have to call destructors for objects declared after the scoped lock itself, again a negative scalability effect

All in all (IMHO) it causes more headache than it does good.

The only use case for RAII locking that I'd grudgingly agree with must meet the following requirements:

  1. the locking primitive must be a private member of a class
  2. no class method is allowed to return with the primitive locked (of course, with RAII there's no other option)
  3. RAII should be used consistently, with no direct access to the locking primitive
  4. the method protected by the RAII lock should be reasonably trivial, with a controllable critical section size
  5. in most cases there should be no other primitive locks within the critical section

If you ask me, I'd put effort into getting rid of dict_sys.freeze() in the first place. E.g.

inline void UndorecApplier::apply_undo_rec(const trx_undo_rec_t *rec)
{
  ...
  dict_sys.freeze(SRW_LOCK_CALL);
  dict_table_t *table= dict_sys.find_table(table_id);
  dict_sys.unfreeze();
  ...
}

should either move freeze() inside find_table() or become a dict_sys_t::find_table_with_lock() wrapper. Either option fits my grudging requirements for RAII.
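A sketch of that wrapper, assuming find_table() keeps its current shape (the name and signature are only illustrative):

  dict_table_t *dict_sys_t::find_table_with_lock(table_id_t table_id)
  {
    freeze(SRW_LOCK_CALL);
    dict_table_t *table= find_table(table_id);
    unfreeze();
    return table;
  }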

Another example:

dict_sys.freeze(SRW_LOCK_CALL);
table_stats= dict_acquire_mdl_shared<false>(table_stats,
                                            thd, &mdl_table);
dict_sys.unfreeze();

This should become a member of dict_sys_t.
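Likewise, a hypothetical dict_sys_t member for this second example could look as follows, assuming dict_acquire_mdl_shared() keeps its current signature:

  dict_table_t *dict_sys_t::acquire_mdl_shared(dict_table_t *table, THD *thd,
                                               MDL_ticket **mdl)
  {
    freeze(SRW_LOCK_CALL);
    table= dict_acquire_mdl_shared<false>(table, thd, mdl);
    unfreeze();
    return table;
  }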

Good hunting!

@DaveGosselin-MariaDB (Member)

Hi @svoj, just a few thoughts to share:

There certainly are good things that RAII locking has to offer:

  1. it lets you not bother with unlocking the primitive

It saves the programmer from explicitly writing code to unlock a primitive, yet the primitive is still unlocked as intended by the programmer.

  2. consequently, it saves one line of code that unlocks the primitive

This isn't the most important part of RAII; what really matters is that the consistency of lock-unlock pairs is inherently preserved as part of the system design.

  3. and it probably saves a few more lines on handling the variable that stores the result of the critical section

I'm not sure what you meant here, can you please clarify?

So here are the problems that made me come to this conclusion back then...

  1. they encourage the bad habit of not controlling the critical section size, which may cause deadlocks (e.g. if it accidentally spans another primitive's lock) and worsen scalability
  2. yes, in theory this can be avoided by adding a code block {}, which (I think) doesn't make the code prettier; and should you need to unlock the mutex conditionally, the only option you have is a goto out of the block

Unfortunately, all programming constructs may be abused; nothing is foolproof. I'm not sure that this is a counterargument to using RAII. Code blocks are an intentional way of indicating the critical section, and IIRC Facebook's folly library, for example, requires them to make the critical section explicit. We could model that approach in our solution, and it would be clear at a glance if critical sections grew too large. That's not possible with our current approach today.

  3. should I need to find all primitive locks in a set of files, I can't just grep for e.g. rd_lock; I have to cover all the wrappers (and yes, I consider dict_sys.freeze() another evil)
  4. should I need to find all dict_sys.freeze() calls, I can't just grep for it; I'll have to find out that it has another wrapper and expand the expression accordingly

Both of these points are more of a tooling question as LSP + vim or LSP/eglot + emacs (and other tools) can easily find such references.

  5. also it may have to call destructors for objects declared after the scoped lock itself, again a negative scalability effect

This sounds like a correctness problem in the original code; shouldn't destructors be called when objects go out of scope?

The only use case for RAII locking that I'd grudgingly agree with must meet the following requirements:

  1. the locking primitive must be a private member of a class
  2. no class method is allowed to return with the primitive locked (of course, with RAII there's no other option)
  3. RAII should be used consistently, with no direct access to the locking primitive
  4. the method protected by the RAII lock should be reasonably trivial, with a controllable critical section size
  5. in most cases there should be no other primitive locks within the critical section

These are reasonable to me, with flexibility on (5) as you allowed already.

@dr-m (Contributor) commented Dec 10, 2024

I would also like to point out that I am working on reducing contention on dict_sys.latch, and therefore the code around this is going to change. I am currently working on MDEV-35000. In MDEV-34999 you can find other tickets in this area. MDEV-34988 is what I believe originally motivated @MaxKellermann to look at this. I’m very thankful for that analysis and insight. At the moment, I don’t think that implementing a RAII helper will help with any of those fixes. It is basically syntactic sugar. We can return to this after those performance problems have been fixed.

One point that @svoj already made is that a RAII helper easily leads to releasing latches in LIFO order. Sometimes we try to release some easily contended latches in FIFO order in order to minimize the size of a critical section for a popular latch, such as dict_sys.latch, log_sys.latch, buf_pool.mutex. It is more likely that another thread is waiting for one of those, instead of a latch on a small object, such as buf_page_t::lock or dict_table_t::lock_latch. After all, other threads could be operating on entirely different tables or data pages.
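To illustrate the ordering concern (everything except dict_sys.freeze()/unfreeze() is hypothetical):

  /* Explicit calls let us drop the popular latch first, even though it was acquired first: */
  dict_sys.freeze(SRW_LOCK_CALL);
  lock_small_object();            /* hypothetical latch on a small object */
  /* ... work that needs both ... */
  dict_sys.unfreeze();            /* release the contended dict_sys.latch as early as possible */
  /* ... work that only needs the small latch ... */
  unlock_small_object();

  /* Two RAII guards in one scope are destroyed in LIFO order, so whatever was
     acquired first is released last: */
  {
    dict_sys_freeze df{dict_sys SRW_LOCK_CALL2};   /* hypothetical guard along the lines of this PR */
    small_object_guard g;                          /* hypothetical guard for the small latch */
    /* ... */
  }  /* g is destroyed first; dict_sys.latch stays frozen until the end of the scope */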

@svoj (Contributor) commented Dec 10, 2024

@DaveGosselin-MariaDB,

...skip...

  3. and it probably saves a few more lines on handling the variable that stores the result of the critical section

I'm not sure what you meant here, can you please clarify?

What I meant was:

  dict_index_t*   index;
  dict_sys.freeze(SRW_LOCK_CALL);
  index = dict_index_get_if_in_cache_low(index_id);
  dict_sys.unfreeze();
  return(index);

Here index stores the result of the critical section; with RAII it is not needed, one can just return dict_index_get_if_in_cache_low().
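For comparison, a sketch of the RAII form being described here, using a hypothetical guard along the lines of this PR:

  {
    dict_sys_freeze guard{dict_sys SRW_LOCK_CALL2};   /* hypothetical guard */
    return dict_index_get_if_in_cache_low(index_id);  /* unfreeze() runs in the guard's destructor */
  }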

  3. should I need to find all primitive locks in a set of files, I can't just grep for e.g. rd_lock; I have to cover all the wrappers (and yes, I consider dict_sys.freeze() another evil)
  4. should I need to find all dict_sys.freeze() calls, I can't just grep for it; I'll have to find out that it has another wrapper and expand the expression accordingly

Both of these points are more of a tooling question as LSP + vim or LSP/eglot + emacs (and other tools) can easily find such references.

Probably they can find such references, and probably they will be more bullet-proof compared to grep. OTOH, it can happen that in the real world simplicity beats advanced constructs.

  5. also it may have to call destructors for objects declared after the scoped lock itself, again a negative scalability effect

This sounds like a correctness problem in the original code; shouldn't destructors be called when objects go out of scope?

For contended locking primitives we may want to make the critical section as small as possible. If destructors don't strictly need to be executed under protection, they should be moved out of the critical section. In this case we have no control over which destructor is called first. This is very similar to what @dr-m explained above.

Another question is whether we can make use of std::lock_guard for locking srw_lock and avoid adding custom classes.

@DaveGosselin-MariaDB (Member) commented Dec 10, 2024

What I meant was:

  dict_index_t*   index;
  dict_sys.freeze(SRW_LOCK_CALL);
  index = dict_index_get_if_in_cache_low(index_id);
  dict_sys.unfreeze();
  return(index);

Here index stores the result of the critical section; with RAII it is not needed, one can just return dict_index_get_if_in_cache_low().

Oh yes, this makes sense, thank you for clarifying.

For contended locking primitives we may want to make the critical section as small as possible. If destructors don't strictly need to be executed under protection, they should be moved out of the critical section. In this case we have no control over which destructor is called first. This is very similar to what @dr-m explained above.

Understood; yes, in general as little as necessary should be done in the critical section.

Another question is whether we can make use of std::lock_guard for locking srw_lock and avoid adding custom classes.

I would prefer this, but I'm not sure how far back in terms of gcc/clang versions we need to support. Put another way, will all of our target toolchains support the C++ standard required for std::lock_guard and friends? (If not, then we'd have to provide our own RAII, of course.)
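On the toolchain question: std::lock_guard is already C++11 and only requires a type with lock()/unlock() (BasicLockable), so using it for the shared "freeze" of dict_sys would need a small adapter; a hypothetical sketch:

  #include <mutex>   /* std::lock_guard */

  /* Hypothetical adapter exposing BasicLockable over dict_sys_t::freeze()/unfreeze(). */
  struct dict_freeze_lockable
  {
    void lock() { dict_sys.freeze(SRW_LOCK_CALL); }
    void unlock() { dict_sys.unfreeze(); }
  };

  void example()
  {
    dict_freeze_lockable lockable;
    std::lock_guard<dict_freeze_lockable> guard(lockable);
    /* ... read-only access to the dictionary cache ... */
  }   /* unfreeze() runs here */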

@dr-m (Contributor) commented Mar 14, 2025

Meanwhile, #3729 was merged and #3889 should soon remove some code duplication around dict_sys.latch. There are several other improvements around dict_sys.latch planned, as I explained in my previous comment. I’m thankful that @MaxKellermann brought the contention on dict_sys.latch to my attention. I hadn’t realized that it can be so severe in I/O bound workloads, because InnoDB always misbehaved in that way. In fact, before MariaDB Server 10.6 did away with dict_sys.mutex and replaced some use of dict_sys.latch (originally known as dict_operation_lock) with finer-grained metadata locks (MDL), it was even worse.

Exclusive dict_sys.latch acquisition should be so rare that it probably is not worth the effort to make dict_sys.latch compatible with std::lock_guard. C++14 introduced the more appropriate std::shared_lock, but unfortunately our code base is still stuck with C++11.

We can keep this ticket open as a reminder to revisit the topic, once we no longer hold dict_sys.latch during any buffer pool access. Or we could close this pull request now and review the merit of MDEV-34986 later.

@svoj (Contributor) commented Mar 14, 2025

@dr-m if the pull request is not intended to be accepted, it should be closed. The MDEV ticket should be the reminder to revisit the topic.

Labels: External Contribution
6 participants