Skip to content

[LLDB] Deterministic module order in Target::SetExecutableModule #10746

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: stable/20240723
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 44 additions & 0 deletions lldb/include/lldb/Target/Target.h
Original file line number Diff line number Diff line change
Expand Up @@ -725,6 +725,50 @@ class Target : public std::enable_shared_from_this<Target>,
lldb::ModuleSP GetOrCreateModule(const ModuleSpec &module_spec, bool notify,
Status *error_ptr = nullptr);

/// Find a binary on the system and return its Module,
/// or return an existing Module that is already in the Target.
///
/// Given a ModuleSpec, find a binary satisifying that specification,
/// or identify a matching Module already present in the Target,
/// and return a shared pointer to it.
///
/// \param[in] module_spec
/// The criteria that must be matched for the binary being loaded.
/// e.g. UUID, architecture, file path.
///
/// \param[in] notify
/// If notify is true, and the Module is new to this Target,
/// Target::ModulesDidLoad will be called.
/// If notify is false, it is assumed that the caller is adding
/// multiple Modules and will call ModulesDidLoad with the
/// full list at the end.
/// ModulesDidLoad must be called when a Module/Modules have
/// been added to the target, one way or the other.
///
/// \param[in] scheduleSymbolsPreload
/// A function that schedules symbol preloading tasks. This allows
/// the caller to customize how symbol preloading is scheduled,
/// for example to run it in parallel. It's called with a callback that
/// will perform the actual symbol loading. The other overload of this
/// method without this parameter will perform symbol preloading
/// immediately in the calling thread.
///
/// \param[out] error_ptr
/// Optional argument, pointing to a Status object to fill in
/// with any results / messages while attempting to find/load
/// this binary. Many callers will be internal functions that
/// will handle / summarize the failures in a custom way and
/// don't use these messages.
///
/// \return
/// An empty ModuleSP will be returned if no matching file
/// was found. If error_ptr was non-nullptr, an error message
/// will likely be provided.
lldb::ModuleSP GetOrCreateModule(
const ModuleSpec &module_spec, bool notify,
llvm::function_ref<void(std::function<void()>)> scheduleSymbolsPreload,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a separate overload for this or could we use a default value of nullptr for scheduleSymbolsPreload? It looks like llvm::function_ref has a suitable operator bool() defined.

I suppose it might require updating more callsites that currently pass the Status output pointer.

Status *error_ptr = nullptr);

// Settings accessors

static TargetProperties &GetGlobalProperties();
Expand Down
102 changes: 37 additions & 65 deletions lldb/source/Target/Target.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1538,73 +1538,35 @@ void Target::SetExecutableModule(ModuleSP &executable_sp,
break;
}

if (executable_objfile && load_dependents) {
// FileSpecList is not thread safe and needs to be synchronized.
FileSpecList dependent_files;
std::mutex dependent_files_mutex;

// ModuleList is thread safe.
ModuleList added_modules;

auto GetDependentModules = [&](FileSpec dependent_file_spec) {
FileSpec platform_dependent_file_spec;
if (m_platform_sp)
m_platform_sp->GetFileWithUUID(dependent_file_spec, nullptr,
platform_dependent_file_spec);
else
platform_dependent_file_spec = dependent_file_spec;

ModuleSpec module_spec(platform_dependent_file_spec, m_arch.GetSpec());
ModuleSP image_module_sp(
GetOrCreateModule(module_spec, false /* notify */));
if (image_module_sp) {
added_modules.AppendIfNeeded(image_module_sp, false);
ObjectFile *objfile = image_module_sp->GetObjectFile();
if (objfile) {
// Create a local copy of the dependent file list so we don't have
// to lock for the whole duration of GetDependentModules.
FileSpecList dependent_files_copy;
{
std::lock_guard<std::mutex> guard(dependent_files_mutex);
dependent_files_copy = dependent_files;
}

// Remember the size of the local copy so we can append only the
// modules that have been added by GetDependentModules.
const size_t previous_dependent_files =
dependent_files_copy.GetSize();

objfile->GetDependentModules(dependent_files_copy);

{
std::lock_guard<std::mutex> guard(dependent_files_mutex);
for (size_t i = previous_dependent_files;
i < dependent_files_copy.GetSize(); ++i)
dependent_files.AppendIfUnique(
dependent_files_copy.GetFileSpecAtIndex(i));
}
}
}
};

executable_objfile->GetDependentModules(dependent_files);
if (!executable_objfile || !load_dependents)
return;

llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool());
for (uint32_t i = 0; i < dependent_files.GetSize(); i++) {
// Process all currently known dependencies in parallel in the innermost
// loop. This may create newly discovered dependencies to be appended to
// dependent_files. We'll deal with these files during the next
// iteration of the outermost loop.
{
std::lock_guard<std::mutex> guard(dependent_files_mutex);
for (; i < dependent_files.GetSize(); i++)
task_group.async(GetDependentModules,
dependent_files.GetFileSpecAtIndex(i));
}
task_group.wait();
llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool());
ModuleList added_modules;
FileSpecList dependent_files;
executable_objfile->GetDependentModules(dependent_files);

for (uint32_t i = 0; i < dependent_files.GetSize(); i++) {
auto dependent_file_spec = dependent_files.GetFileSpecAtIndex(i);
FileSpec platform_dependent_file_spec;
if (m_platform_sp)
m_platform_sp->GetFileWithUUID(dependent_file_spec, nullptr,
platform_dependent_file_spec);
else
platform_dependent_file_spec = dependent_file_spec;

ModuleSpec module_spec(platform_dependent_file_spec, m_arch.GetSpec());
auto image_module_sp = GetOrCreateModule(
module_spec, false /* notify */,
[&](auto preloadSymbols) { task_group.async(preloadSymbols); });
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to gate the parallel loading of modules here based on the with the target.parallel-module-load setting.

if (image_module_sp) {
added_modules.AppendIfNeeded(image_module_sp, false);
if (auto *objfile = image_module_sp->GetObjectFile())
objfile->GetDependentModules(dependent_files);
}
ModulesDidLoad(added_modules);
}
task_group.wait();
ModulesDidLoad(added_modules);
}
}

Expand Down Expand Up @@ -2249,6 +2211,15 @@ bool Target::ReadPointerFromMemory(const Address &addr, Status &error,

ModuleSP Target::GetOrCreateModule(const ModuleSpec &module_spec, bool notify,
Status *error_ptr) {
return GetOrCreateModule(module_spec, notify, [](auto preloadSymbols) {
preloadSymbols();
}, error_ptr);
}

ModuleSP Target::GetOrCreateModule(
const ModuleSpec &module_spec, bool notify,
llvm::function_ref<void(std::function<void()>)> scheduleSymbolsPreload,
Status *error_ptr) {
ModuleSP module_sp;

Status error;
Expand Down Expand Up @@ -2408,7 +2379,8 @@ ModuleSP Target::GetOrCreateModule(const ModuleSpec &module_spec, bool notify,
// Preload symbols outside of any lock, so hopefully we can do this for
// each library in parallel.
if (GetPreloadSymbols())
module_sp->PreloadSymbols();
scheduleSymbolsPreload([module_sp] { module_sp->PreloadSymbols(); });

llvm::SmallVector<ModuleSP, 1> replaced_modules;
for (ModuleSP &old_module_sp : old_modules) {
if (m_images.GetIndexForModule(old_module_sp.get()) !=
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
C_SOURCES := main.c
LD_EXTRAS := -L. -la -lb -lc -ld -le
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into a few problems when running this test on linux.

  1. The -lc was confusing the locally build libc.so with the "real" libc and causing linking errors.
  2. The linked shared libraries did not use a full path so only the basename is stored in the DT_NEEDED section. This does not resolve to a unique file location so it they are skipped by the load dependent code.

I worked around the first problem by not linking libc.so into a.out. But then when running the test with trace enabled I see the following output.

First run module list: ['a.out']
Run 2 module list: ['a.out']
...

Obviously this is not going to detect a non-deterministic module order for linux. But if we change it to link to the full path of the shared objects

--- a/lldb/test/API/functionalities/deterministic_module_order/Makefile
+++ b/lldb/test/API/functionalities/deterministic_module_order/Makefile
@@ -1,9 +1,12 @@
 C_SOURCES := main.c
-LD_EXTRAS := -L. -la -lb -lc -ld -le
+include Makefile.rules
+
+# Use full path to shared libraries to make test effective on linux.
+LD_EXTRAS := $(BUILDDIR)/liba.so $(BUILDDIR)/libb.so $(BUILDDIR)/libc.so $(BUILDDIR)/libd.so $(BUILDDIR)/libe.so

 a.out: liba libb libc libd libe

-include Makefile.rules
+

Then the test looks like it would work

First run module list: ['a.out', 'liba.so', 'libb.so', 'libc.so', 'libd.so', 'libe.so']
Run 2 module list: ['a.out', 'liba.so', 'libb.so', 'libc.so', 'libd.so', 'libe.so']
...


a.out: liba libb libc libd libe

include Makefile.rules

liba:
"$(MAKE)" -f $(MAKEFILE_RULES) \
DYLIB_ONLY=YES DYLIB_C_SOURCES=lib.c DYLIB_NAME=a

libb:
"$(MAKE)" -f $(MAKEFILE_RULES) \
DYLIB_ONLY=YES DYLIB_C_SOURCES=lib.c DYLIB_NAME=b

libc:
"$(MAKE)" -f $(MAKEFILE_RULES) \
DYLIB_ONLY=YES DYLIB_C_SOURCES=lib.c DYLIB_NAME=c

libd:
"$(MAKE)" -f $(MAKEFILE_RULES) \
DYLIB_ONLY=YES DYLIB_C_SOURCES=lib.c DYLIB_NAME=d

libe:
"$(MAKE)" -f $(MAKEFILE_RULES) \
DYLIB_ONLY=YES DYLIB_C_SOURCES=lib.c DYLIB_NAME=e
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
"""
Test that module loading order is deterministic.
"""

import lldb
from lldbsuite.test.decorators import *
from lldbsuite.test.lldbtest import *
from lldbsuite.test import lldbutil


class DeterministicModuleOrderTestCase(TestBase):
NO_DEBUG_INFO_TESTCASE = True

def setUp(self):
TestBase.setUp(self)

def test_create_target_module_load_order(self):
"""Test that module loading order is deterministic across multiple runs."""

self.build()
exe = self.getBuildArtifact("a.out")

# First run: Create target and save module list
target = self.dbg.CreateTarget(exe)
self.assertTrue(target, VALID_TARGET)

first_run_modules = []
for i in range(target.GetNumModules()):
module = target.GetModuleAtIndex(i)
module_name = module.GetFileSpec().GetFilename()
first_run_modules.append(module_name)

self.dbg.DeleteTarget(target)

if self.TraceOn():
print(f"First run module list: {first_run_modules}")

if not first_run_modules:
self.fail("No modules were found during the first run")

# Subsequent runs: Create target and compare module list to first run
num_additional_runs = 9 # Total of 10 runs including the first one
for run in range(num_additional_runs):
target = self.dbg.CreateTarget(exe)
self.assertTrue(target, VALID_TARGET)
current_modules = []
for i in range(target.GetNumModules()):
module = target.GetModuleAtIndex(i)
module_name = module.GetFileSpec().GetFilename()
current_modules.append(module_name)

if self.TraceOn():
print(f"Run {run+2} module list: {current_modules}")

self.assertEqual(
first_run_modules,
current_modules,
f"Module list in run {run+2} differs from the first run"
)

self.dbg.DeleteTarget(target)
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
void dummy_func(void) {}
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
int main() {
return 0;
}