clayborg
Collaborator

@clayborg clayborg commented Sep 8, 2025

llvm::StripTemplateParameters is used when adding entries to the DWARF accelerator tables: for a templated name, it supplies the name without its template parameters so that an extra entry can be added for it. There is a bug where, if the string starts with a '<' character, the function returns a std::optional<StringRef> holding an empty string. This causes entries for the empty string to be added to the __apple_XXXX accelerator tables, and those entries were causing issues with the AppleAcceleratorTable::Iterator before the fix submitted in #157538.
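
To make the failure mode concrete, here is a minimal standalone sketch. It is not LLVM's implementation: `stripTemplateParamsSketch` is a hypothetical helper that uses `std::string_view` in place of `StringRef` and assumes the template parameter list begins at the first `<`. It only illustrates how a name starting with `<` can produce an empty stripped name unless it is rejected up front.

```cpp
#include <optional>
#include <string_view>

// Hypothetical, simplified stand-in for llvm::StripTemplateParameters.
// Assumption: the template parameter list starts at the first '<'. The real
// function also has to deal with operator<, operator<< and operator<=>.
std::optional<std::string_view>
stripTemplateParamsSketch(std::string_view Name, bool RejectLeadingAngle) {
  // Names without a trailing '>' or without any '<' have nothing to strip.
  if (Name.empty() || Name.back() != '>' ||
      Name.find('<') == std::string_view::npos)
    return std::nullopt;

  // The kind of guard this PR adds: "<int>" has no base name at all.
  if (RejectLeadingAngle && Name.front() == '<')
    return std::nullopt;

  // Everything before the first '<' is treated as the base name.
  return Name.substr(0, Name.find('<'));
}
```

With `RejectLeadingAngle` set to `false`, a name such as `"<int>"` comes back as an engaged optional that holds an empty string, which is the kind of value that ended up in the accelerator tables.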

@llvmbot
Member

llvmbot commented Sep 8, 2025

@llvm/pr-subscribers-debuginfo

Author: Greg Clayton (clayborg)

Changes

llvm::StripTemplateParameters is used when adding entries to the DWARF accelerator tables: for a templated name, it supplies the name without its template parameters so that an extra entry can be added for it. There is a bug where, if the string starts with a '<' character, the function returns a std::optional<StringRef> holding an empty string. This causes entries for the empty string to be added to the __apple_XXXX accelerator tables, and those entries were causing issues with the AppleAcceleratorTable::Iterator before the fix submitted in #157538.


Full diff: https://github.com/llvm/llvm-project/pull/157553.diff

2 Files Affected:

  • (modified) llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp (+8-6)
  • (modified) llvm/unittests/DebugInfo/DWARF/DWARFAcceleratorTableTest.cpp (+14)
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
index ea336378bebb3..5e862e0c233e9 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
@@ -248,8 +248,8 @@ LLVM_DUMP_METHOD void AppleAcceleratorTable::dump(raw_ostream &OS) const {
     }
 
     for (unsigned HashIdx = Index; HashIdx < Hdr.HashCount; ++HashIdx) {
-      uint64_t HashOffset = HashesBase + HashIdx*4;
-      uint64_t OffsetsOffset = OffsetsBase + HashIdx*4;
+      uint64_t HashOffset = HashesBase + HashIdx * 4;
+      uint64_t OffsetsOffset = OffsetsBase + HashIdx * 4;
       uint32_t Hash = AccelSection.getU32(&HashOffset);
 
       if (Hash % Hdr.BucketCount != Bucket)
@@ -443,7 +443,7 @@ void DWARFDebugNames::Header::dump(ScopedPrinter &W) const {
 }
 
 Error DWARFDebugNames::Header::extract(const DWARFDataExtractor &AS,
-                                             uint64_t *Offset) {
+                                       uint64_t *Offset) {
   auto HeaderError = [Offset = *Offset](Error E) {
     return createStringError(errc::illegal_byte_sequence,
                              "parsing .debug_names header at 0x%" PRIx64 ": %s",
@@ -830,8 +830,9 @@ bool DWARFDebugNames::NameIndex::dumpEntry(ScopedPrinter &W,
   uint64_t EntryId = *Offset;
   auto EntryOr = getEntry(Offset);
   if (!EntryOr) {
-    handleAllErrors(EntryOr.takeError(), [](const SentinelError &) {},
-                    [&W](const ErrorInfoBase &EI) { EI.log(W.startLine()); });
+    handleAllErrors(
+        EntryOr.takeError(), [](const SentinelError &) {},
+        [&W](const ErrorInfoBase &EI) { EI.log(W.startLine()); });
     return false;
   }
 
@@ -1117,7 +1118,8 @@ std::optional<StringRef> llvm::StripTemplateParameters(StringRef Name) {
   //
   // We look for > at the end but if it does not contain any < then we
   // have something like operator>>. We check for the operator<=> case.
-  if (!Name.ends_with(">") || Name.count("<") == 0 || Name.ends_with("<=>"))
+  if (Name.starts_with("<") || !Name.ends_with(">") || Name.count("<") == 0 ||
+      Name.ends_with("<=>"))
     return {};
 
   // How many < until we have the start of the template parameters.
diff --git a/llvm/unittests/DebugInfo/DWARF/DWARFAcceleratorTableTest.cpp b/llvm/unittests/DebugInfo/DWARF/DWARFAcceleratorTableTest.cpp
index dedcf816cf63f..2265ff9dd42be 100644
--- a/llvm/unittests/DebugInfo/DWARF/DWARFAcceleratorTableTest.cpp
+++ b/llvm/unittests/DebugInfo/DWARF/DWARFAcceleratorTableTest.cpp
@@ -299,4 +299,18 @@ TEST(DWARFDebugNames, UnsupportedForm) {
       Sections,
       FailedWithMessage("unsupported Form for YAML debug_names emitter"));
 }
+
+TEST(DWARFDebugNames, TestStripTemplateParameters) {
+
+  std::optional<StringRef> stripped_name;
+  // Make sure we can extract the name "foo" from the template parameters.
+  stripped_name = StripTemplateParameters("foo<int>");
+  ASSERT_TRUE(stripped_name.has_value());
+  ASSERT_EQ(*stripped_name, StringRef("foo"));
+  // Make sure that we don't get an empty name back when the string starts with
+  // '<'.
+  stripped_name = StripTemplateParameters("<int>");
+  ASSERT_FALSE(stripped_name.has_value());
+}
+
 } // end anonymous namespace

Member

@Michael137 Michael137 left a comment

What would cause us to try to insert things like <int> into the accelerator table?

@clayborg
Collaborator Author

clayborg commented Sep 9, 2025

> What would cause us to try to insert things like <int> into the accelerator table?

We have a Kotlin compiler that is JITing C code, and it emits a ton of names that start with "<", like <get-size> and many more. The problem is that this causes dsymutil to emit a __apple_names table with an entry for the empty string. The iterator (fixed in #157538) wasn't using the offsets table to access each item, so when it reached that entry it would start parsing, read a 32-bit string offset pointing at the empty string, and stop. It would then continue trying to parse entries, reading the number of entries as the string offset and the first DIE offset as the count. For large DWARF files this made the iterator try to parse millions of entries every time we fetched frame variables and requested global variables, costing minutes of delay each time.
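
The mis-parse described in this comment can be reproduced with a toy walker. This is a sketch under simplifying assumptions, not the LLVM iterator: `walkLinearly` and `ParsedRecord` are hypothetical names, each name is assumed to be stored as `[StrOffset][Count][Count x DieOffset]` in 32-bit words, and a zero `StrOffset` is assumed to be read as an end marker, as the comment describes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One record as the walker understood it.
struct ParsedRecord {
  uint32_t StrOffset; // offset of the name in the string table
  uint32_t Count;     // number of DIE offsets the walker expects to follow
};

// Hypothetical linear walker over simplified hash data. It never consults an
// offsets table; it just reads one 32-bit word after another.
std::vector<ParsedRecord> walkLinearly(const std::vector<uint32_t> &Data) {
  std::vector<ParsedRecord> Records;
  size_t I = 0;
  while (I < Data.size()) {
    uint32_t StrOffset = Data[I++];
    if (StrOffset == 0)
      continue; // Read as an end marker; resume at the very next word.
    if (I >= Data.size())
      break;
    uint32_t Count = Data[I++];
    Records.push_back({StrOffset, Count});
    // Skip the DIE offsets. A bogus Count walks off the end of Data here.
    I += Count;
  }
  return Records;
}
```

Feeding it `{5, 1, 0x1234, 0}` (a name at string offset 5 with one DIE) yields one sensible record. Feeding it `{0, 1, 0x1234, 0}` (the empty string at offset 0 with one DIE) yields a record with `StrOffset == 1` and `Count == 0x1234`: the count was taken for a string offset and the DIE offset for a count, which is the shape of the runaway parsing described here.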

@Michael137
Member

Thanks for the context

Can we instead just avoid inserting empty names into the index? That seems like a generally useful thing to do. I find it slightly awkward to support names that aren't legal in C++ in a function called StripTemplateParameters.
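
A rough sketch of what such a guard could look like, purely for illustration: `NameIndexSketch` and `addName` are hypothetical and do not correspond to the dsymutil or LLVM code that builds the accelerator tables.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical in-memory index mapping names to DIE offsets; not an LLVM type.
class NameIndexSketch {
  std::map<std::string, std::vector<uint32_t>> Entries;

public:
  // Returns false and records nothing when Name is empty, so the index can
  // never contain an entry for the empty string.
  bool addName(std::string_view Name, uint32_t DieOffset) {
    if (Name.empty())
      return false;
    Entries[std::string(Name)].push_back(DieOffset);
    return true;
  }

  // Number of distinct names currently stored.
  size_t size() const { return Entries.size(); }
};
```

Placing the check at the insertion point would cover every producer of names, not only the template-stripping path.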

Contributor

@felipepiovezan felipepiovezan left a comment

I think you ran clang-format on the entire file by accident? Please revert the other changes

stripped_name = StripTemplateParameters("foo<int>");
ASSERT_TRUE(stripped_name.has_value());
ASSERT_EQ(*stripped_name, StringRef("foo"));
// Make sure that we don't get an empty name back when the string starts with
Contributor

the comment says "we DON'T get an empty name", but the assert seems to be the opposite?

Collaborator Author

I reworded the comment to indicate we expect to not get a string back. Previously we were getting an empty string back from this call.

@Michael137
Member

Instead of adding this special case, could we just check if the stripped name is empty, and if so return std::nullopt? That way we guarantee that this function returns either std::nullopt or a non-empty string.
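
One way to picture this suggestion is as a post-check on the result. The helper below is a hypothetical sketch using `std::string_view` in place of `StringRef`; in the actual function the check would presumably sit just before the final return.

```cpp
#include <optional>
#include <string_view>

// Hypothetical post-check: turn an engaged-but-empty result into std::nullopt
// so callers only ever see "no value" or a non-empty name.
std::optional<std::string_view>
nonEmptyOrNullopt(std::optional<std::string_view> Stripped) {
  if (Stripped && Stripped->empty())
    return std::nullopt;
  return Stripped;
}
```

Compared with a prefix test on the input, this states the guarantee in terms of the output, so it does not depend on knowing which inputs happen to strip down to nothing.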

// We look for > at the end but if it does not contain any < then we
// have something like operator>>. We check for the operator<=> case.
if (!Name.ends_with(">") || Name.count("<") == 0 || Name.ends_with("<=>"))
if (Name.starts_with("<") || !Name.ends_with(">") || Name.count("<") == 0 ||
Member

We don't need this Name.starts_with("<") check anymore right?

Collaborator Author

correct. I can remove it.

@RKSimon RKSimon removed their request for review September 19, 2025 17:03