
MDEV-34986: storage/innobase/dict/dict0dict: add a RAII class to freeze a dict_sys_t #3529

Open
wants to merge 2 commits into main
Conversation

@MaxKellermann (Contributor) commented Sep 19, 2024

https://jira.mariadb.org/browse/MDEV-34986

Description

This replaces a lot of manual freeze() and unfreeze() calls. Doing it with RAII is safer and easier.

I did not replace all freeze()/unfreeze() pairs because some callers unfreeze and re-freeze in the middle of a scope. Sometimes a new scope can be added just for such a RAII object. That refactoring can be done later.
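To make the idea concrete, here is a minimal sketch of what such a freeze guard could look like, assuming the existing dict_sys_t::freeze()/unfreeze() interface and the SRW_LOCK_ARGS/SRW_LOCK_CALL convention for the instrumentation-only file/line parameters; the class name and details are illustrative, not the exact code in this PR:

  /* RAII guard: freezes (shared-locks) a dict_sys_t for the current scope. */
  class dict_sys_freeze
  {
    dict_sys_t &m_dict;  /* the dictionary instance passed by the caller */
  public:
    explicit dict_sys_freeze(dict_sys_t &dict
                             SRW_LOCK_ARGS2(const char *file, unsigned line))
      : m_dict(dict)
    { m_dict.freeze(SRW_LOCK_ARGS(file, line)); }
    ~dict_sys_freeze() { m_dict.unfreeze(); }
    /* non-copyable: the guard owns exactly one freeze()/unfreeze() pair */
    dict_sys_freeze(const dict_sys_freeze &)= delete;
    dict_sys_freeze &operator=(const dict_sys_freeze &)= delete;
  };

  /* usage: */
  {
    dict_sys_freeze freeze{dict_sys SRW_LOCK_CALL2};
    /* ... read-only access to the dictionary cache ... */
  }  /* unfreeze() runs here, even on an early return or an exception */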

Notes:

  • Instead of using the global variable dict_sys, I decided to pass a
    dict_sys_t reference parameter, because I believe it will be
    necessary to eliminate that global variable eventually (in order to
    have a per-catalog instance). Hard-coding this global variable here
    would generate identical (not better) machine code and would be a
    step in the wrong direction.

  • The new macros SRW_LOCK_ARGS2 and SRW_LOCK_CALL2 were necessary
    because this is the first time those debug-only parameters are
    forwarded from a function that has more parameters (i.e. the
    dict_sys_t reference).

Release Notes

Nothing. Internal code change only with no runtime effect.

How can this PR be tested?

No runtime effect.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@CLAassistant commented Sep 19, 2024

CLA assistant check
All committers have signed the CLA.

Commit message excerpt: These macros (SRW_LOCK_ARGS2 and SRW_LOCK_CALL2) shall be used if there are other arguments. Unlike the old variants, they start with a comma.
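For illustration, one way such comma-prefixed forwarding macros could be defined, mirroring the existing SRW_LOCK_ARGS/SRW_LOCK_CALL pair; the preprocessor guard shown here is an assumption, not taken from the patch:

  #ifdef UNIV_PFS_RWLOCK
  # define SRW_LOCK_ARGS2(file, line) , file, line
  # define SRW_LOCK_CALL2             , __FILE__, __LINE__
  #else
  # define SRW_LOCK_ARGS2(file, line)
  # define SRW_LOCK_CALL2
  #endif

With the leading comma, the instrumentation-only file/line parameters can be appended to a parameter list that already contains the dict_sys_t reference, and they vanish cleanly when the instrumentation is compiled out.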
@MaxKellermann (Contributor, Author)

CI failures due to a dumb typo; I force-pushed an amended version.

@dr-m self-assigned this Sep 20, 2024
@dr-m (Contributor) commented Sep 20, 2024

Thank you for your contribution. It was great to hear your related talk in Berlin last Tuesday. It is a pity that we did not have a chance to talk after that. I did not have a chance to check the related iovisor/bcc#5108 yet.

With @montywi’s current (not yet published) implementation of catalogs, there will continue to be a single global InnoDB infrastructure, which would include the singleton dict_sys object for which I removed pointer indirection in 5fd7502.

For catalogs or multi-tenancy to make any sense, I think that at the minimum, the buffer pool must be shared. I understood that implementations in other RDBMS may have a per-tenant file structure, such as a separate write-ahead log, transaction metadata and data dictionary for each tenant. That kind of solution would require pushing down a catalog or tenant ID to the lower levels, such as the buffer pool and mini-transactions. It could incur a considerable overhead for the case that there is only a single tenant or catalog. It would also be more complex to implement than what @montywi has currently shared with your organization.

Long story short, I think that the singleton dict_sys object is there to stay.

The bug that was featured in your talk would not have been prevented by this refactoring. The innodb_sync_debug that was removed in db006a9 and ff5d306 might have. 5f2dcd1 improved the situation a little, by allowing us to keep track of all srw_lock::rd_lock() holders and to add debug assertions that a particular latch is not being held in shared (or any) mode by the current thread.

The fix b2654ba moves some logic that used to be duplicated into a non-inline function lock_table_children(). The problem was not that we'd forget to release a latch; the problem was that we were holding a latch while entering a wait for a transactional lock, which may block for a very long time (50 seconds by default), even indefinitely. All this because FOREIGN KEY constraints are being handled at too low a level, or because there are two metadata caches that are covered by slightly different locks: TABLE_SHARE and dict_table_t.

I think that there can be some benefit of having RAII around synchronization primitives. We are using it in b08448d and a variant of it in mtr_t::m_memo. But I am not fully convinced about the usefulness of this change. If we went for it, I would like to assess if it has any impact on the generated code.

@MaxKellermann (Contributor, Author)

The bug that was featured in your talk would not have been prevented by this refactoring.

Of course not! This is just a source-code level change with no runtime effect. It helps keep the code readable; and of course RAII is safer and less prone to human mistakes.

While MariaDB is formally written in C++, it could go much further in taking advantage of C++'s features to make it safer. When I read the code to try to understand the dict_sys.latch problem last week, I kept wondering why it was written that way (so many raw pointers, so much manual resource/lock management): too error-prone for my taste.
This PR is a tiny step in that direction; my suggestion is to use more of C++.

@MaxKellermann (Contributor, Author)

If we went for it, I would like to assess if it has any impact on the generated code.

The generated code is mostly identical. In a perfect world, all of this would collapse to exactly identical machine code; but there are compiler imperfections.

For example, GCC is a bit clumsy about the trivial method ha_innobase::referenced_by_foreign_key() and adds one extra instruction for no obvious reason.

Other functions may grow larger because RAII gives you stronger guarantees; i.e. the generated code is a bit longer, but actually better than before. RAII guarantees that the destructor will be called even if an exception was thrown. That requires a tiny bit of extra code for stack unwinding.

People have different opinions about C++ exceptions. For my own code, I'm more than willing to pay the tiny price for this stack unwinding code (with overhead that is hardly even measurable). But the fact is that MariaDB does not disable C++ exceptions, therefore you are already paying that price. Exceptions are indeed used in very few places, but 99% of the code lacks noexcept declarations, which forces the compiler to make everything stack-unwinding-safe, even where that is never necessary. (Of course, adding noexcept has disadvantages of its own. Oh yes, that perfect world, I'd like to have it.)
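A small illustration of that point (not MariaDB code): a local object with a non-trivial destructor forces the compiler to emit a cleanup path whenever it calls something that might throw, and noexcept on the callee removes that obligation.

  struct guard { ~guard(); };       // non-trivial destructor, like a RAII lock guard
  void may_throw();                 // not noexcept: the compiler must assume it can throw
  void cannot_throw() noexcept;     // noexcept: no exceptional path to cover

  void with_unwind()    { guard g; may_throw(); }    // needs a cleanup path to run ~guard() on unwind
  void without_unwind() { guard g; cannot_throw(); } // only the normal path needs to call ~guard()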

My point is: the code is not 100% identical, but it is either

  • better, due to stronger guarantees (that may have actual runtime effects, because you now get exception-safety for free), or
  • slightly larger, with negligible overhead, due to compiler clumsiness.

@MaxKellermann (Contributor, Author)

With this PR, the executable grows by 572 bytes.

But if I apply #3531 first, then this PR adds only 256 bytes, because the compiler can then omit some of the unnecessary stack-unwinding code.

@dr-m (Contributor) commented Sep 20, 2024

Would the executable size grow less if you did not add any data member to your RAII guard, but instead made it refer directly to the singleton dict_sys, similar to how LockMutexGuard and LockGuard are referring to lock_sys.latch?

@MaxKellermann (Contributor, Author) commented Sep 20, 2024

Would the executable size grow less if you did not add any data member to your RAII guard, but instead made it refer directly to the singleton dict_sys, similar to how LockMutexGuard and LockGuard are referring to lock_sys.latch?

In all cases I analyzed in the disassembly, this reference was optimized away by GCC, because GCC knew it was guaranteed to always point to the global variable.

There could be optimization pitfalls in other, similar cases, though: if this were not a global object but a global pointer to an object, GCC would not be allowed to optimize the reference away, because the pointer value could change in the meantime. But even in that case the generated code would still be better, because GCC would (likely) cache the pointer in a register instead of having to access RAM again to reload it.

Edit: I just verified - the binary is exactly identical whether I use the global dict_sys variable or store a dict_sys_t reference in the RAII class.
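For reference, the two variants under comparison look roughly like this (names hypothetical; as noted, GCC emits identical code for both because the reference is known to bind to the global dict_sys):

  /* Variant A: the guard stores the dict_sys_t reference passed by the caller. */
  class freeze_guard_ref
  {
    dict_sys_t &m_dict;
  public:
    explicit freeze_guard_ref(dict_sys_t &d) : m_dict(d) { m_dict.freeze(SRW_LOCK_CALL); }
    ~freeze_guard_ref() { m_dict.unfreeze(); }
  };

  /* Variant B: the guard refers to the global singleton directly,
     similar to how LockMutexGuard refers to lock_sys.latch. */
  class freeze_guard_global
  {
  public:
    freeze_guard_global() { dict_sys.freeze(SRW_LOCK_CALL); }
    ~freeze_guard_global() { dict_sys.unfreeze(); }
  };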

@dr-m (Contributor) commented Sep 20, 2024

Edit: I just verified - the binary is exactly identical whether I use the global dict_sys variable or store a dict_sys_t reference in the RAII class.

Thank you. With clang (which is the compiler on FreeBSD and macOS) it might be different, or maybe they have improved in recent releases.

By the way, while the dict_sys.latch is global, it is being used much less than its precursors dict_sys.mutex and dict_operation_lock used to be. In most cases, table metadata is being protected by reference counts or metadata locks (MDL, a lock on the name). Similarly, InnoDB data dictionary tables are being protected by transactional table locks.

The purpose of MDL or the old use of "global MDL" as in dict_sys.latch is to prevent a race condition between DDL and any concurrent operations on the table. Because the layer above InnoDB is largely ignorant about FOREIGN KEY, InnoDB is forced to acquire dict_sys.latch. Things could be much cleaner if the metadata storage was moved closer to the MDL logic, but implementing that is far from trivial.

@MaxKellermann changed the title from "storage/innobase/dict/dict0dict: add a RAII class to freeze a dict_sys_t" to "MDEV-34986: storage/innobase/dict/dict0dict: add a RAII class to freeze a dict_sys_t" on Sep 23, 2024
@MaxKellermann (Contributor, Author)

I added a JIRA ticket. Do you want me to rebase this PR on a different branch?

@cvicentiu added the External Contribution label Nov 26, 2024
@svoj (Contributor) commented Dec 3, 2024

FWIW...

Apparently, a while ago I wrote to Marko:

I have mixed feelings about the RAII approach in mutex locking; I hope it won't expand further.

I was asked to elaborate on this quote here.

There certainly are good things that RAII locking has to offer:

  1. it lets you not bother with unlocking the primitive
  2. consequently, it saves one line of code that unlocks the primitive
  3. and it probably saves a few more lines on handling the variable that stores the result of the critical section

So here are the problems that made me come to this conclusion back then...

  1. they encourage the bad habit of not controlling the critical section size, which may cause deadlocks (e.g. if it accidentally spans another primitive's lock) and worsen scalability
  2. yes, in theory this can be avoided by adding a code block {}, which (I think) doesn't make the code prettier; and should you need to unlock the mutex conditionally, the only option you have is a goto out of the block
  3. should I need to find all primitive locks in a set of files, I can't just grep for e.g. rd_lock; I have to cover all the wrappers (and yes, I consider dict_sys.freeze() another evil)
  4. should I need to find all dict_sys.freeze() calls, I can't just grep for it; I'll have to find out that it has another wrapper and expand the expression accordingly
  5. also it may have to call destructors for objects declared after the scoped lock itself, again a negative scalability effect

All in all (IMHO) it causes more headache than it does good.

The only use case for RAII locking that I'd grudgingly agree with must meet the following requirements:

  1. the locking primitive must be a private member of a class
  2. no class method is allowed to return with the primitive locked (of course, with RAII there's no other option)
  3. RAII should be used consistently, with no direct access to the locking primitive
  4. the method protected by the RAII lock should be reasonably trivial, with a controllable critical section size
  5. in most cases there should be no other primitive locks within the critical section

If you ask me, I'd put effort into getting rid of dict_sys.freeze() in the first place. E.g.

inline void UndorecApplier::apply_undo_rec(const trx_undo_rec_t *rec)
{
  ...
  dict_sys.freeze(SRW_LOCK_CALL);
  dict_table_t *table= dict_sys.find_table(table_id);
  dict_sys.unfreeze();
  ...
}

should either move freeze() inside find_table() or become a dict_sys_t::find_table_with_lock() wrapper. Either option fits my grudging requirements for RAII.
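A sketch of that wrapper, assuming find_table() keeps its current shape (the name and signature are only illustrative):

  dict_table_t *dict_sys_t::find_table_with_lock(table_id_t table_id)
  {
    freeze(SRW_LOCK_CALL);
    dict_table_t *table= find_table(table_id);
    unfreeze();
    return table;
  }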

Another example:

dict_sys.freeze(SRW_LOCK_CALL);
table_stats= dict_acquire_mdl_shared<false>(table_stats,
                                            thd, &mdl_table);
dict_sys.unfreeze();

This should become a member of dict_sys_t.
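Likewise, a hypothetical dict_sys_t member for this second example could look as follows, assuming dict_acquire_mdl_shared() keeps its current signature:

  dict_table_t *dict_sys_t::acquire_mdl_shared(dict_table_t *table, THD *thd,
                                               MDL_ticket **mdl)
  {
    freeze(SRW_LOCK_CALL);
    table= dict_acquire_mdl_shared<false>(table, thd, mdl);
    unfreeze();
    return table;
  }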

Good hunting!

@DaveGosselin-MariaDB (Member)

Hi @svoj, just a few thoughts to share:

There certainly are good things that RAII locking has to offer:

  1. it lets you not bother with unlocking the primitive

It saves the programmer from explicitly writing code to unlock a primitive, yet the primitive is still unlocked as intended by the programmer.

  2. consequently, it saves one line of code that unlocks the primitive

This isn't the most important part of RAII; what really matters is that the consistency of lock-unlock pairs is inherently preserved as part of the system design.

  3. and it probably saves a few more lines on handling the variable that stores the result of the critical section

I'm not sure what you meant here, can you please clarify?

So here are the problems that made me come to this conclusion back then...

  1. they encourage the bad habit of not controlling the critical section size, which may cause deadlocks (e.g. if it accidentally spans another primitive's lock) and worsen scalability
  2. yes, in theory this can be avoided by adding a code block {}, which (I think) doesn't make the code prettier; and should you need to unlock the mutex conditionally, the only option you have is a goto out of the block

Unfortunately, all programming constructs may be abused; nothing is foolproof. I'm not sure that this is a counterargument to using RAII. Code blocks are an intentional way of indicating the critical section, and IIRC Facebook's folly library, for example, requires them to make the critical section explicit. We could model that approach in our solution, and it would be clear at a glance if critical sections grew too large. That's not possible with our current approach today.

  3. should I need to find all primitive locks in a set of files, I can't just grep for e.g. rd_lock; I have to cover all the wrappers (and yes, I consider dict_sys.freeze() another evil)
  4. should I need to find all dict_sys.freeze() calls, I can't just grep for it; I'll have to find out that it has another wrapper and expand the expression accordingly

Both of these points are more of a tooling question as LSP + vim or LSP/eglot + emacs (and other tools) can easily find such references.

  5. also it may have to call destructors for objects declared after the scoped lock itself, again a negative scalability effect

This sounds like a correctness problem in the original code; shouldn't destructors be called when objects go out of scope?

The only use case for RAII locking that I'd grudgingly agree with must meet the following requirements:

  1. the locking primitive must be a private member of a class
  2. no class method is allowed to return with the primitive locked (of course, with RAII there's no other option)
  3. RAII should be used consistently, with no direct access to the locking primitive
  4. the method protected by the RAII lock should be reasonably trivial, with a controllable critical section size
  5. in most cases there should be no other primitive locks within the critical section

These are reasonable to me, with flexibility on (5) as you allowed already.

@dr-m (Contributor) commented Dec 10, 2024

I would also like to point out that I am working on reducing contention on dict_sys.latch, and therefore the code around this is going to change. I am currently working on MDEV-35000. In MDEV-34999 you can find other tickets in this area. MDEV-34988 is what I believe originally motivated @MaxKellermann to look at this. I’m very thankful for that analysis and insight. At the moment, I don’t think that implementing a RAII helper will help with any of those fixes. It is basically syntactic sugar. We can return to this after those performance problems have been fixed.

One point that @svoj already made is that a RAII helper easily leads to releasing latches in LIFO order. Sometimes we try to release some easily contended latches in FIFO order in order to minimize the size of a critical section for a popular latch, such as dict_sys.latch, log_sys.latch, buf_pool.mutex. It is more likely that another thread is waiting for one of those, instead of a latch on a small object, such as buf_page_t::lock or dict_table_t::lock_latch. After all, other threads could be operating on entirely different tables or data pages.
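To illustrate the ordering concern (everything except dict_sys.freeze()/unfreeze() is hypothetical):

  /* Explicit calls let us drop the popular latch first, even though it was acquired first: */
  dict_sys.freeze(SRW_LOCK_CALL);
  lock_small_object();            /* hypothetical latch on a small object */
  /* ... work that needs both ... */
  dict_sys.unfreeze();            /* release the contended dict_sys.latch as early as possible */
  /* ... work that only needs the small latch ... */
  unlock_small_object();

  /* Two RAII guards in one scope are destroyed in LIFO order, so whatever was
     acquired first is released last: */
  {
    dict_sys_freeze df{dict_sys SRW_LOCK_CALL2};   /* hypothetical guard along the lines of this PR */
    small_object_guard g;                          /* hypothetical guard for the small latch */
    /* ... */
  }  /* g is destroyed first; dict_sys.latch stays frozen until the end of the scope */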

@svoj (Contributor) commented Dec 10, 2024

@DaveGosselin-MariaDB,

...skip...

  3. and it probably saves a few more lines on handling the variable that stores the result of the critical section

I'm not sure what you meant here, can you please clarify?

What I meant was:

  dict_index_t*   index;
  dict_sys.freeze(SRW_LOCK_CALL);
  index = dict_index_get_if_in_cache_low(index_id);
  dict_sys.unfreeze();
  return(index);

Here index stores the result of the critical section; with RAII it is not needed, one can just return dict_index_get_if_in_cache_low().
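For comparison, a sketch of the RAII form being described here, using a hypothetical guard along the lines of this PR:

  {
    dict_sys_freeze guard{dict_sys SRW_LOCK_CALL2};   /* hypothetical guard */
    return dict_index_get_if_in_cache_low(index_id);  /* unfreeze() runs in the guard's destructor */
  }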

  3. should I need to find all primitive locks in a set of files, I can't just grep for e.g. rd_lock; I have to cover all the wrappers (and yes, I consider dict_sys.freeze() another evil)
  4. should I need to find all dict_sys.freeze() calls, I can't just grep for it; I'll have to find out that it has another wrapper and expand the expression accordingly

Both of these points are more of a tooling question as LSP + vim or LSP/eglot + emacs (and other tools) can easily find such references.

Probably they can find such references, and probably they will be more bullet-proof compared to grep. OTOH, it can happen that in the real world simplicity beats advanced constructs.

  5. also it may have to call destructors for objects declared after the scoped lock itself, again a negative scalability effect

This sounds like a correctness problem in the original code; shouldn't destructors be called when objects go out of scope?

For contended locking primitives we may want to make the critical section as small as possible. If destructors don't strictly need to be executed under protection, they should be moved out of the critical section. In this case we have no control over which destructor is called first. This is very similar to what @dr-m explained above.

Another question is whether we can make use of std::lock_guard for locking srw_lock and avoid adding custom classes.

@DaveGosselin-MariaDB (Member) commented Dec 10, 2024

What I meant was:

  dict_index_t*   index;
  dict_sys.freeze(SRW_LOCK_CALL);
  index = dict_index_get_if_in_cache_low(index_id);
  dict_sys.unfreeze();
  return(index);

Here index stores the result of the critical section; with RAII it is not needed, one can just return dict_index_get_if_in_cache_low().

Oh yes, this makes sense, thank you for clarifying.

For contended locking primitives we may want to make the critical section as small as possible. If destructors don't strictly need to be executed under protection, they should be moved out of the critical section. In this case we have no control over which destructor is called first. This is very similar to what @dr-m explained above.

Understood; yes, in general as little as necessary should be done in the critical section.

Another question is whether we can make use of std::lock_guard for locking srw_lock and avoid adding custom classes.

I would prefer this, but I'm not sure how far back in terms of gcc/clang versions we need to support. Put another way, will all of our target toolchains support the C++ standard required for std::lock_guard and friends? (If not, then we'd have to provide our own RAII, of course.)
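On the toolchain question: std::lock_guard is already C++11 and only requires a type with lock()/unlock() (BasicLockable), so using it for the shared "freeze" of dict_sys would need a small adapter; a hypothetical sketch:

  #include <mutex>   /* std::lock_guard */

  /* Hypothetical adapter exposing BasicLockable over dict_sys_t::freeze()/unfreeze(). */
  struct dict_freeze_lockable
  {
    void lock() { dict_sys.freeze(SRW_LOCK_CALL); }
    void unlock() { dict_sys.unfreeze(); }
  };

  void example()
  {
    dict_freeze_lockable lockable;
    std::lock_guard<dict_freeze_lockable> guard(lockable);
    /* ... read-only access to the dictionary cache ... */
  }   /* unfreeze() runs here */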

@dr-m (Contributor) commented Mar 14, 2025

Meanwhile, #3729 was merged and #3889 should soon remove some code duplication around dict_sys.latch. There are several other improvements around dict_sys.latch planned, as I explained in my previous comment. I’m thankful that @MaxKellermann brought the contention on dict_sys.latch to my attention. I hadn’t realized that it can be so severe in I/O bound workloads, because InnoDB always misbehaved in that way. In fact, before MariaDB Server 10.6 did away with dict_sys.mutex and replaced some use of dict_sys.latch (originally known as dict_operation_lock) with finer-grained metadata locks (MDL), it was even worse.

Exclusive dict_sys.latch acquisition should be so rare that it probably is not worth the effort to make dict_sys.latch compatible with std::lock_guard. C++14 introduced the more appropriate std::shared_lock, but unfortunately our code base is still stuck with C++11.

We can keep this ticket open as a reminder to revisit the topic, once we no longer hold dict_sys.latch during any buffer pool access. Or we could close this pull request now and review the merit of MDEV-34986 later.

@svoj (Contributor) commented Mar 14, 2025

@dr-m if the pull request is not intended to be accepted, it should be closed. The MDEV ticket should be the reminder to revisit the topic.

Labels: External Contribution
6 participants