-
Notifications
You must be signed in to change notification settings - Fork 2.2k
type_caster_generic: add set_foreign_holder method for subclasses to implement #5862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 6 commits
17a1ff9
f971060
fa66511
8c760c4
524d54d
68c8f3a
1a90708
1ba6d64
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -863,6 +863,71 @@ struct holder_helper { | |||||||||||
| static auto get(const T &p) -> decltype(p.get()) { return p.get(); } | ||||||||||||
| }; | ||||||||||||
|
|
||||||||||||
| struct holder_caster_foreign_helpers { | ||||||||||||
| struct py_deleter { | ||||||||||||
| void operator()(const void *) const noexcept { | ||||||||||||
| // Don't run the deleter if the interpreter has been shut down | ||||||||||||
| if (Py_IsInitialized() == 0) { | ||||||||||||
|
||||||||||||
| #ifdef GRAALVM_PYTHON | |
| if (!Py_IsInitialized() || _Py_IsFinalizing()) { | |
| return; | |
| } | |
| #endif |
That's GraalPy-specific, but I was thinking it's correct in general.
I also assumed _Py_IsFinalizing() still works with all Python versions, although only Py_IsFinalizing() is actually a public API, since Python 3.9; but we are still supporting Python 3.8.
Everything else in this PR looks great to me, it's just this one line that I'm still worried about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I misread your suggestion as trying to allow decref during finalization. As written, the IsFinalizing term is redundant because IsFinalizing becomes true exactly when IsInitialized becomes false. From cpython/Python/pylifecycle.c:
_PyInterpreterState_SetFinalizing(tstate->interp, tstate);
_PyRuntimeState_SetFinalizing(runtime, tstate);
runtime->initialized = 0;
runtime->core_initialized = 0;
IsFinalizing checks _PyRuntimeState_GetFinalizing while IsInitialized checks runtime->initialized.
I assumed it's not OK. — How can we gain certainty?
The best documentation I know of what happens during finalization is hudson-trading/pymetabind#2 (comment) -- this is not an API guarantee / could change, but as the comment itself notes, that doesn't mean we don't have to interact with it. Pablo is a CPython core developer who has done a lot of work on the finalization process.
The higher-level reason to believe it's OK to decref during finalization is that Python itself decrefs during finalization all over the place! Python actually goes to fairly great lengths to attempt to ensure that there are no objects left alive when the interpreter exits. That requires a lot of individual references to be broken. There is an entire GC pass that occurs after IsFinalizing becomes true.
We can assume that it's still OK to decref when our capsule destructor is invoked because our capsule destructor was invoked by a decref (the one took the refcount of the capsule to zero).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have time to comb through the cpython code, so I fell back to asking ChatGPT again (same URL, one extra prompt/response):
https://chatgpt.com/share/68ed515d-82e4-8008-8954-784843385bf9
It explained why it still recommends also checking Py_IsFinalizing() although only after acquiring the GIL (it suggested that before but I overlooked the specific order before).
Up to you. — In my code, I'd definitely want to use the suggested pattern, it's a very easy and inexpensive way to (maybe) err on the safer side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ChatGPT is mostly wrong here. I will respond to its points in detail.
The order in CPython is “mark finalizing” → “clear initialized.”
True.
On a foreign thread without the GIL, you can observe finalizing == true while initialized == true for a moment.
True but irrelevant. When finalizing is set, non-daemon threads have already been joined, and daemonic threads are detached (unable to execute Python code). If this is running from a daemonic thread, it doesn't matter which value we observe for IsInitializing, because if we're finalizing then the thread will either freeze or exit when we attempt to acquire the GIL immediately below. (Since GIL acquisition can't fail, it freezes if you try to do it during finalization. This is documented and I believe there's a PEP that's trying to fix it.) We can only wind up actually executing the DECREF if we're not finalizing yet or if we're running in the main thread. But in the latter case, the race does not exist.
Subinterpreters: Py_IsInitialized() is a process-wide flag; it says nothing about whether a particular interpreter (or the one owning your object) is already tearing down. Py_IsFinalizing() is also global today, so neither API tells you “this object’s interpreter is safe.” Treat them as coarse signals, not per-interpreter guarantees.
This does not point at any distinction between the validity of the logic with the IsFinalizing check vs without.
CPython does a lot of decref/cleanup during shutdown, but under tight control:
This is FUD. It's a safe approximation but doesn't understand the details.
It holds the GIL, knows exactly which objects it’s touching, and often orders teardown to avoid arbitrary callbacks (or at least accepts the risk in well-understood places).
The final GC pass executes well after IsFinalizing becomes true and can destroy arbitrary user-created objects that do arbitrary things.
Your extension’s Py_DECREF from a foreign thread or a C++ destructor can invoke: [...]
Foreign thread is specious as explained above. Thus, any shared_ptr destruction that occurs during finalization must be indirectly executing from within a finalizer of a user-provided Python object. Thus it's OK for us to potentially invoke other finalizers of user-provided Python objects.
shared_ptr destruction after finalization is common if a shared_ptr is held in a C++-level global, and it is important that we not call into Python there. That's the main reason for the IsInitialized check.
Capsule destructor case: the fact that a decref inside CPython dropped the capsule to zero (calling your destructor) doesn’t make your subsequent Py_DECREFs safe—your destructor may touch objects tied to interpreters or modules already mid-teardown.
This is confusing the question about whether we can decref an arbitrary object during finalization (answer: yes because we're only doing it from within another decref of an arbitrary object) with the question about what we can do in the capsule destructor. The capsule destructor executes much later - clearing the interpreter state dict is one of the last things done during finalization. We should not run arbitrary Python code there. But we're not: I'm just proposing we have a C++ function that sets a C++ flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're lightyears ahead of me with your understanding of the cpython internals, especially threading and cleanup mechanics. Thanks for the explanation!
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,8 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import sys | ||
| from contextlib import suppress | ||
|
|
||
| import pytest | ||
|
|
||
| from pybind11_tests import local_bindings as m | ||
|
|
@@ -20,6 +23,31 @@ def test_load_external(): | |
| assert m.load_external1(cm.ExternalType2(12)) == 12 | ||
| assert "incompatible function arguments" in str(excinfo.value) | ||
|
|
||
| def test_shared(val, ctor, loader): | ||
| obj = ctor(val) | ||
| with suppress(AttributeError): # non-cpython VMs don't have getrefcount | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @msimacek I just looked: Do you think we could enable more test coverage for GraalPy by adopting Joshua's approach here more widely? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The problem of gc.collect() not reliably GCing things is broader than the problem of not having refcounts. PyPy doesn't have refcounts but its gc.collect() works (if you call it a few times). |
||
| rc_before = sys.getrefcount(obj) | ||
| wrapper = loader(obj) | ||
| # wrapper holds a shared_ptr that keeps obj alive | ||
| assert wrapper.use_count == 1 | ||
| assert wrapper.value == val | ||
| with suppress(AttributeError): | ||
| rc_after = sys.getrefcount(obj) | ||
| assert rc_after > rc_before | ||
|
|
||
| test_shared(110, cm.ExternalType1, m.load_external1_shared) | ||
| test_shared(220, cm.ExternalType2, m.load_external2_shared) | ||
|
|
||
| with pytest.raises(TypeError, match="incompatible function arguments"): | ||
| test_shared(210, cm.ExternalType1, m.load_external2_shared) | ||
| with pytest.raises(TypeError, match="incompatible function arguments"): | ||
| test_shared(120, cm.ExternalType2, m.load_external1_shared) | ||
|
|
||
| with pytest.raises( | ||
| RuntimeError, match="Foreign instance cannot be converted to std::unique_ptr" | ||
| ): | ||
| m.load_external2_unique(cm.ExternalType2(2200)) | ||
|
|
||
|
|
||
| def test_local_bindings(): | ||
| """Tests that duplicate `py::module_local` class bindings work across modules""" | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.