-
Notifications
You must be signed in to change notification settings - Fork 30
add mutex per interpreter per thread #698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #698 +/- ##
==========================================
+ Coverage 79.74% 80.29% +0.55%
==========================================
Files 9 9
Lines 3955 4081 +126
==========================================
+ Hits 3154 3277 +123
- Misses 801 804 +3
... and 1 file with indirect coverage changes
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
There were too many comments to post at once. Showing the first 10 out of 13. Check the log or trigger a new build to see more.
lib/CppInterOp/CppInterOp.cpp
Outdated
namespace { | ||
struct LockRAII { | ||
LockRAII(std::mutex& resource_lock, std::mutex& thread_id_lock, | ||
std::optional<std::thread::id>& current_thread_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "std::optional" is directly included [misc-include-cleaner]
lib/CppInterOp/CppInterOp.cpp:40:
- #if CLANG_VERSION_MAJOR >= 19
+ #include <optional>
+ #if CLANG_VERSION_MAJOR >= 19
lib/CppInterOp/CppInterOp.cpp
Outdated
namespace { | ||
struct LockRAII { | ||
LockRAII(std::mutex& resource_lock, std::mutex& thread_id_lock, | ||
std::optional<std::thread::id>& current_thread_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "std::thread" is directly included [misc-include-cleaner]
lib/CppInterOp/CppInterOp.cpp:40:
- #if CLANG_VERSION_MAJOR >= 19
+ #include <thread>
+ #if CLANG_VERSION_MAJOR >= 19
lib/CppInterOp/CppInterOp.cpp
Outdated
ThreadIdLock.lock(); | ||
if (!ResourceLock.try_lock()) { | ||
if (CurrentThreadId) { | ||
if ((*CurrentThreadId) != std::this_thread::get_id()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "std::this_thread::get_id" is directly included [misc-include-cleaner]
if ((*CurrentThreadId) != std::this_thread::get_id()) {
^
lib/CppInterOp/CppInterOp.cpp
Outdated
if ((*CurrentThreadId) != std::this_thread::get_id()) { | ||
ThreadIdLock.unlock(); | ||
ResourceLock.lock(); // blocking | ||
bool res = ThreadIdLock.try_lock(); // should be free |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: Value stored to 'res' during its initialization is never read [clang-analyzer-deadcode.DeadStores]
bool res = ThreadIdLock.try_lock(); // should be free
^
Additional context
lib/CppInterOp/CppInterOp.cpp:106: Value stored to 'res' during its initialization is never read
bool res = ThreadIdLock.try_lock(); // should be free
^
lib/CppInterOp/CppInterOp.cpp
Outdated
if ((*CurrentThreadId) != std::this_thread::get_id()) { | ||
ThreadIdLock.unlock(); | ||
ResourceLock.lock(); // blocking | ||
bool res = ThreadIdLock.try_lock(); // should be free |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: unused variable 'res' [clang-diagnostic-unused-variable]
bool res = ThreadIdLock.try_lock(); // should be free
^
lib/CppInterOp/CppInterOp.cpp
Outdated
if (!owned) | ||
return; | ||
ThreadIdLock.lock(); | ||
CurrentThreadId = std::nullopt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "std::nullopt" is directly included [misc-include-cleaner]
CurrentThreadId = std::nullopt;
^
@@ -125,8 +193,17 @@ | |||
|
|||
// std::deque avoids relocations and calling the dtor of InterpreterInfo. | |||
static llvm::ManagedStatic<std::deque<InterpreterInfo>> sInterpreters; | |||
static std::mutex InterpreterStackLock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: variable 'InterpreterStackLock' is non-const and globally accessible, consider making it const [cppcoreguidelines-avoid-non-const-global-variables]
static std::mutex InterpreterStackLock;
^
lib/CppInterOp/CppInterOp.cpp
Outdated
@@ -448,8 +540,9 @@ | |||
return nullptr; | |||
} | |||
|
|||
std::vector<TCppScope_t> GetEnumConstants(TCppScope_t handle) { | |||
auto* D = (clang::Decl*)handle; | |||
std::vector<TCppScope_t> GetEnumConstants(TCppScope_t scope) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: no header providing "std::vector" is directly included [misc-include-cleaner]
lib/CppInterOp/CppInterOp.cpp:40:
- #if CLANG_VERSION_MAJOR >= 19
+ #include <vector>
+ #if CLANG_VERSION_MAJOR >= 19
lib/CppInterOp/CppInterOp.cpp
Outdated
auto* D = (clang::Decl*)handle; | ||
std::vector<TCppScope_t> GetEnumConstants(TCppScope_t scope) { | ||
LOCK(getInterpInfo()); | ||
auto* D = (clang::Decl*)scope; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: do not use C-style cast to convert between unrelated types [cppcoreguidelines-pro-type-cstyle-cast]
auto* D = (clang::Decl*)scope;
^
lib/CppInterOp/CppInterOp.cpp
Outdated
return nullptr; | ||
|
||
auto* D = (clang::Decl*)handle; | ||
auto* D = (clang::Decl*)scope; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: do not use C-style cast to convert between unrelated types [cppcoreguidelines-pro-type-cstyle-cast]
auto* D = (clang::Decl*)scope;
^
lib/CppInterOp/CppInterOp.cpp
Outdated
struct LockRAII { | ||
LockRAII(std::mutex& resource_lock, std::mutex& thread_id_lock, | ||
std::optional<std::thread::id>& current_thread_id) | ||
: ResourceLock(resource_lock), ThreadIdLock(thread_id_lock), | ||
CurrentThreadId(current_thread_id) { | ||
ThreadIdLock.lock(); | ||
if (!ResourceLock.try_lock()) { | ||
if (CurrentThreadId) { | ||
if ((*CurrentThreadId) != std::this_thread::get_id()) { | ||
ThreadIdLock.unlock(); | ||
ResourceLock.lock(); // blocking | ||
bool res = ThreadIdLock.try_lock(); // should be free | ||
assert(res && "Internal Error: Interpreter lock not held, but " | ||
"ThreadId lock held.\nPlease report this as a bug.\n"); | ||
ThreadIdLock.unlock(); | ||
return; | ||
} | ||
owned = false; | ||
ThreadIdLock.unlock(); | ||
return; | ||
} | ||
assert(false && | ||
"Internal Error: A thread holds lock for the current " | ||
"interpreter, but no information available on which thread " | ||
"holds the lock.\nPlease report this as a bug.\n"); | ||
} | ||
assert(!CurrentThreadId && | ||
"Internal Error: Lock released but thread id not cleared.\nPlease " | ||
"report this as a bug.\n"); | ||
CurrentThreadId = std::this_thread::get_id(); | ||
ThreadIdLock.unlock(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This complicated code allows us to lock the same mutex multiple times, provided it is the same thread.
Example:
void foo() {
RAII lock(mutex);
clang::WriteToAST();
}
void bar() {
RAII lock(mutex);
clang::WriteToAST_2();
foo(); // foo will try to lock mutex that is already locked. blocking the thread completely.
}
int main() { bar(); }
With this change, we check which thread is trying to lock the mutex, and if this thread previously locked the mutex, then it is safe to use without the second lock.
lib/CppInterOp/CppInterOp.cpp
Outdated
if (GpuArgs.empty()) | ||
I->declare(R"( | ||
namespace __internal_CppInterOp { | ||
template <typename Signature> | ||
struct function; | ||
template <typename Res, typename... ArgTypes> | ||
struct function<Res(ArgTypes...)> { | ||
typedef Res result_type; | ||
}; | ||
} // namespace __internal_CppInterOp | ||
)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to myself: Did not mean to commit this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I remember asking why in the non-GPU version... Please move the fix into another separate PR.
include/CppInterOp/CppInterOp.h
Outdated
@@ -643,7 +643,7 @@ CPPINTEROP_API TCppType_t GetCanonicalType(TCppType_t type); | |||
|
|||
/// Used to either get the built-in type of the provided string, or | |||
/// use the name to lookup the actual type. | |||
CPPINTEROP_API TCppType_t GetType(const std::string& type); | |||
CPPINTEROP_API TCppType_t GetType(const std::string& name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move these seemingly unrelated changes in a separate PR?
@@ -224,11 +301,18 @@ std::string Demangle(const std::string& mangled_name) { | |||
return demangle; | |||
} | |||
|
|||
void EnableDebugOutput(bool value /* =true*/) { llvm::DebugFlag = value; } | |||
void EnableDebugOutput(bool value /* =true*/) { | |||
LOCK(getInterpInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a different mental model for this. I was thinking that we can have a StateMutatingSection
RAII object which in turn has a mutex-based locking (if multithreading was enabled) and a 'PushTransactionRAII` (for pch/modules). The object should take a parameter if the change is in clang AST which might trigger a derserialiazation or llvm-only. In the latter case we can skip pushing a transaction.
There will be cases where we need one of the two concepts and not both. We need to design a flexible solution addressing these needs without having to introduce both concepts at the same time when they are not needed.
@@ -125,8 +193,17 @@ struct InterpreterInfo { | |||
|
|||
// std::deque avoids relocations and calling the dtor of InterpreterInfo. | |||
static llvm::ManagedStatic<std::deque<InterpreterInfo>> sInterpreters; | |||
static std::mutex InterpreterStackLock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No statics. This needs to go in the InterpreterInfo class as we will lock per interpreter. That won't be sufficient if the interfaces are changing the llvm global state as done in EnableDebugOutput
. Fortunately the llvm global state is not a lot and maybe we can figure out how to address that on a later stage...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the lock for sInterpreters
. We will need to lock this sInterpreters
stack whenever we access it. This should be a global lock. Cpp::CreateInterpreter
pushes the Cpp::Interpreter
into sInterpreters
. And if two threads call Cpp::CreateInterpreter
at the same time, then the insertion should be atomic.
Similarly, we can have another global LLVM lock to take care of the EnableDebugOutput
cases.
Any comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point -- did not realize that...
lib/CppInterOp/CppInterOp.cpp
Outdated
if (auto* CXXRD = llvm::dyn_cast_or_null<CXXRecordDecl>(D)) | ||
return CXXRD->hasDefaultConstructor(); | ||
|
||
return false; | ||
} | ||
|
||
TCppFunction_t GetDefaultConstructor(compat::Interpreter& interp, | ||
TCppScope_t scope) { | ||
static TCppFunction_t GetDefaultConstructor(compat::Interpreter& interp, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should be a separate PR.
if (result.empty()) | ||
Cpp::Declare("#include <utility>"); | ||
{ | ||
LOCK(getInterpInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lock should be done in Declare
. The next 2-3 lines should not be changing the compiler state (unless modules are involved)...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we should lock the AST when we do any named lookups. What if some other thread is writing to the AST at the same time? These things should be atomic. Either we finish the read and perform the write, or finish the write and then perform the read. We cannot be writing and reading the symbol table at the same time.
But say we want to check the number of arguments of a function, that would not change once the function is parsed. So we can read that data without any locks.
DeclarationName DMove = &getASTContext().Idents.get("move"); | ||
auto result = getSema().getStdNamespace()->lookup(DMove); | ||
if (result.empty()) | ||
Cpp::Declare("#include <utility>"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should go in the special runtime header we will include in the interpreter initialization time... Which in turn will end up forcing us to implement our our std::move-like construct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do this in the same PR, but as a different commit. We can rebase and merge.
The thing is: We need to compute the path to this initialization file/header. And this should succeed with make install
and without the install
step too. So developers can use it without the install
step.
Let me know if you want a different PR, or if you are fine with me doing it in this PR as a different commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is pretty similar to the clang resouce header files, eg. stddef.h
. It is okay to require such a runtime header file -- we can delay that and hardcode it but ideally still needs to happen at the initialization time. The real challenge is testing when/if this goes to llvm -- we cannot have a runtime that depends on stl.
Let's move that discussion in a separate PR.
lib/CppInterOp/CppInterOp.cpp
Outdated
if (GpuArgs.empty()) | ||
I->declare(R"( | ||
namespace __internal_CppInterOp { | ||
template <typename Signature> | ||
struct function; | ||
template <typename Res, typename... ArgTypes> | ||
struct function<Res(ArgTypes...)> { | ||
typedef Res result_type; | ||
}; | ||
} // namespace __internal_CppInterOp | ||
)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I remember asking why in the non-GPU version... Please move the fix into another separate PR.
b689053
to
7659fcb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
@@ -125,8 +133,17 @@ struct InterpreterInfo { | |||
|
|||
// std::deque avoids relocations and calling the dtor of InterpreterInfo. | |||
static llvm::ManagedStatic<std::deque<InterpreterInfo>> sInterpreters; | |||
static std::mutex InterpreterStackLock; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: variable 'sInterpreters' is non-const and globally accessible, consider making it const [cppcoreguidelines-avoid-non-const-global-variables]
static llvm::ManagedStatic<std::deque<InterpreterInfo>> sInterpreters;
^
@@ -125,8 +133,17 @@ | |||
|
|||
// std::deque avoids relocations and calling the dtor of InterpreterInfo. | |||
static llvm::ManagedStatic<std::deque<InterpreterInfo>> sInterpreters; | |||
static std::mutex InterpreterStackLock; | |||
|
|||
static InterpreterInfo& getInterpInfo() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: variable 'InterpreterStackLock' is non-const and globally accessible, consider making it const [cppcoreguidelines-avoid-non-const-global-variables]
static std::mutex InterpreterStackLock;
^
@@ -350,8 +379,10 @@ | |||
|
|||
bool IsAbstract(TCppType_t klass) { | |||
auto* D = (clang::Decl*)klass; | |||
if (auto* CXXRD = llvm::dyn_cast_or_null<clang::CXXRecordDecl>(D)) | |||
if (auto* CXXRD = llvm::dyn_cast_or_null<clang::CXXRecordDecl>(D)) { | |||
LOCK(getInterpInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: do not use C-style cast to convert between unrelated types [cppcoreguidelines-pro-type-cstyle-cast]
auto* D = (clang::Decl*)klass;
^
@@ -451,6 +483,7 @@ | |||
std::vector<TCppScope_t> GetEnumConstants(TCppScope_t handle) { | |||
auto* D = (clang::Decl*)handle; | |||
|
|||
LOCK(getInterpInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: do not use C-style cast to convert between unrelated types [cppcoreguidelines-pro-type-cstyle-cast]
auto* D = (clang::Decl*)handle;
^
@@ -822,6 +861,9 @@ | |||
return -1; | |||
CXXRecordDecl* DCXXRD = cast<CXXRecordDecl>(DD); | |||
CXXRecordDecl* BCXXRD = cast<CXXRecordDecl>(BD); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: use auto when initializing with a template cast to avoid duplicating the type name [modernize-use-auto]
auto* DCXXRD = cast<CXXRecordDecl>(DD); |
@@ -822,6 +861,9 @@ | |||
return -1; | |||
CXXRecordDecl* DCXXRD = cast<CXXRecordDecl>(DD); | |||
CXXRecordDecl* BCXXRD = cast<CXXRecordDecl>(BD); | |||
|
|||
LOCK(getInterpInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: use auto when initializing with a template cast to avoid duplicating the type name [modernize-use-auto]
LOCK(getInterpInfo()); | |
auto* BCXXRD = cast<CXXRecordDecl>(BD); |
@@ -893,6 +936,7 @@ | |||
bool HasDefaultConstructor(TCppScope_t scope) { | |||
auto* D = (clang::Decl*)scope; | |||
|
|||
LOCK(getInterpInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: do not use C-style cast to convert between unrelated types [cppcoreguidelines-pro-type-cstyle-cast]
auto* D = (clang::Decl*)scope;
^
@@ -1862,6 +1927,7 @@ | |||
if (!builtin.isNull()) | |||
return builtin.getAsOpaquePtr(); | |||
|
|||
LOCK(getInterpInfo()); | |||
auto* D = (Decl*)GetNamed(name, /* Within= */ 0); | |||
if (auto* TD = llvm::dyn_cast_or_null<TypeDecl>(D)) { | |||
return QualType(TD->getTypeForDecl(), 0).getAsOpaquePtr(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: argument name 'Within' in comment does not match parameter name 'parent' [bugprone-argument-comment]
auto* D = (Decl*)GetNamed(name, /* Within= */ 0);
^
Additional context
include/CppInterOp/CppInterOp.h:411: 'parent' declared here
TCppScope_t parent = nullptr);
^
lib/CppInterOp/CppInterOp.cpp:718: actual callee ('GetNamed') is declared here
TCppScope_t GetNamed(const std::string& name,
^
@@ -3863,6 +3973,7 @@ | |||
|
|||
void GetOperator(TCppScope_t scope, Operator op, | |||
std::vector<TCppFunction_t>& operators, OperatorArity kind) { | |||
LOCK(getInterpInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: The value '4294967295' provided to the cast expression is not in the valid range of values for 'OperatorArity' [clang-analyzer-optin.core.EnumCastOutOfRange]
return (OperatorArity)~0U;
^
Additional context
include/CppInterOp/CppInterOp.h:94: enum declared here
enum OperatorArity : unsigned char { kUnary = 1, kBinary, kBoth };
^
lib/CppInterOp/CppInterOp.cpp:3957: Assuming 'D' is not a 'CastReturnType'
if (auto* FD = llvm::dyn_cast<FunctionDecl>(D)) {
^
lib/CppInterOp/CppInterOp.cpp:3957: 'FD' is null
if (auto* FD = llvm::dyn_cast<FunctionDecl>(D)) {
^
lib/CppInterOp/CppInterOp.cpp:3957: Taking false branch
if (auto* FD = llvm::dyn_cast<FunctionDecl>(D)) {
^
lib/CppInterOp/CppInterOp.cpp:3975: The value '4294967295' provided to the cast expression is not in the valid range of values for 'OperatorArity'
return (OperatorArity)~0U;
^
Big changes in this PR. Some parts of the explanation might be like a conversation.
Mutex lock per interpreter.
Question: Do we need a per-interpreter lock?
Answer: Yes. Ideally, that would improve the performance in multi-threaded cases.
Question: Does the InterOp API fully support dispatching queries (some kind of lookup) or compiling code across multiple interpreters at the same time?
Answer: Actually, (I guess) No. Our API only supports stack-like access of multiple interpreters.
Example code that will not work:
But you can make the above example work, if the user rotates the
sInterpreters
.Example:
Question: Ok. So is there a way for InterOp is maintain the rotation thing?
Answer: Would not be a easy thing to achieve. From my initial view on the matter, it might be possible. But a more important question; Should we support such usecases?
Question: Ok. So given the limitation of using multiple interpreter but only as a stack-access. Do we need a mutex lock per interpreter?
Answer: No, the user can only access the top most interpreter at a time. We don't need a mutex per interpreter. (Let me know if my reasoning is wrong somewhere)
Testing this by running in parallel
gtest does not support running in parallel (I could not find anything with my google search). But there is this project, gtest-parallel, that can split the tasks into separate isolated processes. But that is not what we want. We want to split the tests to run in parallel threads that share the same interpreter instance.
Code recovery RAII
@vgvassilev, I need some pointers on incorporating the codegen error recovery RAII. I don't think we should combine the two into the same class, but let me know your opinion. And how should I go about doing it?