Description
Error.h and the base Error classes within have a long-standing fundamental design issue that results in a constant stream of ODR issues linking llvm in different ways that often blow in subtle runtime bugs. Some of these are also caught in ASAN builds of LLVM.
There is a 10-year-old smelly design pattern used across much of Error.h for comparing the memory address of a static char ID
as a system of unique identity for error types.
Here is the current API design:
class LLVM_ABI ErrorInfoBase {
public:
...
// Returns the class ID for this type.
static const void *classID() { return &ID; }
...
// Check whether this instance is a subclass of the class identified by
// ClassID.
virtual bool isA(const void *const ClassID) const {
return ClassID == classID();
}
// Check whether this instance is a subclass of ErrorInfoT.
template <typename ErrorInfoT> bool isA() const {
return isA(ErrorInfoT::classID());
}
private:
...
static char ID;
The problem here is depending on how you compile, link, and use LLVM in various configurations, using static data members in a header, especially for address-based identity, often fails. The compiler is allowed to inline this static member into every implementation it compiles. It's why C++17 added static inline for this case but here that would be the opposite of what we want. While this pattern is fast and probably fine if you are fully statically compiling (in most cases) against LLVM, it is pretty brittle in certain static linking setups and very broken in most dynamic link setups against llvm.
In our use cases in modular, this breaks with how we statically link llvm/Support across multiple dylibs and use them together resulting in runtime ODR bugs comparing Error types, but it's also broken by how lldb builds upstream with how liblldb.so/dylib (which links llvm/Support) and how frontend tools (like lldb and lldb-dap) and lldb plugins all link the liblldb dso and also link llvm/Support directly themselves (statically or dynamically).
As LLVM becomes more friendly with dynamic linking, these Error classes should probably be redesigned around how they handle unique identity comparisons.
I can't think of a simple fix because whatever unique identity is used, it has to be safe across different linking setups and there are many and downstream users are doing unique things as well (like us) so it shouldn't prescribe a specific setup required with how it's linked to be able to function properly. If you want to use a memory comparison, any dynamic or static linker has to always have visibility across different linker setups and has to be able to de-dup. It might make sense to move a string comparison, maybe containing a GUID or some namespace-qualified string for global uniqueness of error subclasses?