Deadlock during reload #10274

@TimeToogo

We encountered a case where Fluent Bit deadlocked during a reload (SIGHUP via the API).

Backtrace

(gdb) bt
#0  futex_wait (private=0, expected=2, futex_word=0x7fd3d5a00740 <tzset_lock>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait_private (futex=futex@entry=0x7fd3d5a00740 <tzset_lock>) at lowlevellock.c:35
#2  0x00007fd3d5909c1c in __tz_convert (timer=1745935171, use_localtime=1, tp=0x7ffc9a3aefd0) at tzset.c:572
#3  0x000000000054108d in flb_log_construct ()
#4  0x000000000054140c in flb_log_print ()
#5  0x00000000005787b2 in flb_reload ()
#6  0x00000000004b152f in flb_main ()
#7  0x00007fd3d583fee0 in __libc_start_call_main (main=main@entry=0x4af110 <main>, argc=argc@entry=3, argv=argv@entry=0x7ffc9a3b03d8) at ../sysdeps/nptl/libc_start_call_main.h:58
#8  0x00007fd3d583ff90 in __libc_start_main_impl (main=0x4af110 <main>, argc=3, argv=0x7ffc9a3b03d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc9a3b03c8) at ../csu/libc-start.c:389
#9  0x00000000004af155 in _start ()

Environment

  • Amazon Linux 2023 6.1.132-147.221.amzn2023.x86_64
  • Glibc 2.34
  • Fluent Bit v4.0.1

After post-mortem debugging an instance of this, it seems a possible culprit is the use of localtime (AS-Unsafe) in a signal handler:

cur = localtime(&now);
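
For illustration, one common way to keep AS-Unsafe calls out of signal context is to have the handler only set a flag and do the logging and reload work later from the main loop. The following is a minimal sketch of that pattern, not Fluent Bit's actual code or fix: `reload_requested`, `on_sighup`, `install_reload_handler`, `format_timestamp`, and `handle_pending_reload` are all hypothetical names. Note that `localtime_r` is still not async-signal-safe; the sketch only moves the call out of the handler.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical flag: written by the handler, polled by the main loop. */
static volatile sig_atomic_t reload_requested = 0;

/* The handler does only async-signal-safe work: one store to a sig_atomic_t.
 * No logging, no localtime(), no locks. */
static void on_sighup(int signo)
{
    (void) signo;
    reload_requested = 1;
}

/* Install the SIGHUP handler. Returns 0 on success, -1 on error. */
int install_reload_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;

    return sigaction(SIGHUP, &sa, NULL);
}

/* Format a timestamp in normal (non-signal) context.
 * Returns the number of characters written, or 0 on failure. */
size_t format_timestamp(time_t now, char *buf, size_t size)
{
    struct tm tm_buf;

    if (localtime_r(&now, &tm_buf) == NULL) {
        return 0;
    }
    return strftime(buf, size, "%Y/%m/%d %H:%M:%S", &tm_buf);
}

/* Called from the main loop. Returns 1 if a reload was pending, 0 otherwise.
 * The timestamped log line is produced here, outside the signal handler. */
int handle_pending_reload(void)
{
    char ts[32];

    if (!reload_requested) {
        return 0;
    }
    reload_requested = 0;

    if (format_timestamp(time(NULL), ts, sizeof(ts)) > 0) {
        fprintf(stderr, "[%s] reload requested\n", ts);
    }
    /* An actual reload would be triggered here (assumption). */
    return 1;
}
```

In a sketch like this, the main loop would call `install_reload_handler()` once at startup and `handle_pending_reload()` on each iteration. The flag approach was chosen for brevity; a self-pipe or `signalfd` would serve the same purpose in an event-loop design.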
