Skip to content

feat: make logging machinery thread safe#3050

Open
bruno-dasilva wants to merge 2 commits into
beyond-all-reason:masterfrom
bruno-dasilva:bruno/thread-safe-logging
Open

feat: make logging machinery thread safe#3050
bruno-dasilva wants to merge 2 commits into
beyond-all-reason:masterfrom
bruno-dasilva:bruno/thread-safe-logging

Conversation

@bruno-dasilva

@bruno-dasilva bruno-dasilva commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

Current logging machinery is only threadsafe because all init is done single-threaded at startup, and threads ONLY collide on the POSIX fprintf (which locks internally).

With #2954, init machinery can now run at any time, so we need to guard logger callsites.

This would be very easy, except there's some complexity with thread freezing to print stacktraces since the thread could be holding the lock while frozen.

I ran some basic testing in running scenarios/skirmish in both headful and headless and everything seemed fine.

AI Disclosure

My fingers/brain + Claude Code for some finer details

@bruno-dasilva bruno-dasilva requested a review from sprunk June 21, 2026 00:31
@bruno-dasilva bruno-dasilva marked this pull request as ready for review June 21, 2026 11:58
@sprunk

sprunk commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

How does this relate to #2840? That one also claims to make logging threadsafe but is a 1-liner and has tests.

@bruno-dasilva

Copy link
Copy Markdown
Collaborator Author

How does this relate to #2840? That one also claims to make logging threadsafe but is a 1-liner and has tests.

ivand's PR is much simpler but:

  • doesn't protect file sink which is relevant to Add ClearLog functionality for log management #2954
  • doesn't protect and other config fns which aren't relevant to today's implementations but makes it fully thread safe
  • doesn't protect against the logging-on-error-path/crash handler problem where we can freeze a thread mid-logging and then be blocked forever

We don't need to protect every logging function with a guard for #2954. Just the file sink + log write. So we could eg. remove the guard setMinLevel(). But again, this adds proper thread safety across the board.

  • You'd still need the rest of the complexity regardless 🙃

tests are a good idea.

Add three Catch2 cases to TestILog.cpp that stress the logging
machinery concurrently:
- ConcurrentRecordDispatch: N threads flooding records at mixed
  levels/sections
- ConcurrentSinkRegistryChurn: writers logging while a mutator churns
  the shared logFiles
- ConcurrentFilterChanges: writers logging on a section while another
  thread changes its filter level and re-registers it

Wire ThreadSanitizer:
- new USE_TSAN CMake option (whole-engine TSan). Similar to USE_ASAN.
- test_ILog is built with -fsanitize=thread by default on Linux
  so the cases can be run with `./test/test_ILog "[multithreaded]"`
  without a separate build tree.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bruno-dasilva bruno-dasilva force-pushed the bruno/thread-safe-logging branch from a37b20e to a78fea0 Compare June 22, 2026 07:27
@bruno-dasilva

Copy link
Copy Markdown
Collaborator Author

added tests in a new commit.

If we have a desire to simplify this we'd need to:

  • cut whats protected, or
  • redesign of the logging api so logger has a single public interface that doesn't call itself (ie. a single locking surface)

@sprunk

sprunk commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

My main useful feedback is that AFAIK the current design for logging thread-safety is supposed to make LOG(x) not require a mutex to be safe, and all the overhead to make that possible is put in the other functions. This is because LOG(x) is called potentially all the time while the others are called almost never. I am not sure to what extent the current implementation actually follows this or if the constraints that allow it will remain with #2954, but it would be nice to keep if possible.

For reviewing the actual thread-safety it would probably best for @lostsquirrel1 to look.

@WybrenKoelmans

Copy link
Copy Markdown
Contributor

I'm no expert here, but have you considered instead of applying mutex, I was thinking maybe a thread-safe queue that will buffer logs. Then a log writer on the main thread that will read from the queue and actually do the file writes. Wouldn't that make it thread safe logging without having mutexes all over the place?

@bruno-dasilva

bruno-dasilva commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

I'm no expert here, but have you considered instead of applying mutex, I was thinking maybe a thread-safe queue that will buffer logs. Then a log writer on the main thread that will read from the queue and actually do the file writes. Wouldn't that make it thread safe logging without having mutexes all over the place?

yes what you're describing is async logging, where one thread owns the actual writes and other threads just append to a concurrent queue.

Genuine tradeoff there in different kind of complexity (you have to then deal with: what happens during a crash because you probably need to flush/print fatal logs synchronously, what happens if loggers add so many logs the queue outpaces how fast the writer thread can write, etc.). But that could be preferable to just tacking on a bunch of concurrency complexity onto the existing logger

in-discord discussion here: https://discord.com/channels/549281623154229250/724924957074915358/1518918926069928088

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants