[WIP] Interoperability with other Python binding frameworks #1140
6dc5a7e to d9ee371
  
I am so excited! Thank you for your hard work on this @oremanj. I will do a thorough review in the coming week. Two quick questions just based on the summary: do you plan to also make such a PR for pybind11? Would the idea there be to remove the "conduit" feature and replace it with pymetabind?
Yes; I've started on it already. It's more awkward and less zero-cost than this one due to the presumed need to avoid a pybind11 ABI version bump, but I haven't hit any blockers yet. (An additional type map lookup will be needed on every failed load if there are any imported foreign bindings, rather than knowing whether the particular type of interest has foreign bindings.)
I don't know if that would be palatable, since the conduit feature has already been released and might need to be supported for a long time. Unfortunately I missed the window to be included in pybind11's most recent ABI break, which occurred with the 3.0 release in July; I'm guessing the next one might not be for a long time after that. The two features don't clash, though of course there's some cost to doing the attribute lookup for the conduit method when it's not needed.
Dear @oremanj,
this is a very impressive piece of work. I did a first pass over the PR, please see my feedback and questions attached.
Besides these, here are three additional high level questions:
This PR lacks tests. I am sure you have them on your end as part of the development effort. What would be a good way to test this feature in practice (via CI) so that we can ensure it runs and keeps on running? Would it make sense to have a separate test repository (to avoid duplication) that gets pulled into the CI matrices of both nanobind and pybind11 so that any breaking changes of either project can be caught before shipping a new revision?
Are there features of nanobind that are not supported by pymetabind? Any caveats?
Is there anything to watch out for regarding leak tracking when multiple frameworks are involved? I noticed that pymetabind's leak_safe flag is not used in the actual implementation.
Thanks,
Wenzel
| function(nanobind_add_module name) | ||
| cmake_parse_arguments(PARSE_ARGV 1 ARG | ||
| "STABLE_ABI;FREE_THREADED;NB_STATIC;NB_SHARED;PROTECT_STACK;LTO;NOMINSIZE;NOSTRIP;MUSL_DYNAMIC_LIBCPP;NB_SUPPRESS_WARNINGS" | ||
| "STABLE_ABI;FREE_THREADED;NB_STATIC;NB_SHARED;PROTECT_STACK;LTO;NOMINSIZE;NOSTRIP;MUSL_DYNAMIC_LIBCPP;NB_SUPPRESS_WARNINGS;NO_INTEROP" | 
A general question is whether this feature should be opt-in or opt-out. Given that it adds overheads (even if small), my tendency would be to make it opt-in. (e.g. INTEROP instead of NO_INTEROP)
If the feature becomes opt-in, would you reverse the polarity of the macro as well?  In other words, NB_DISABLE_FOREIGN becomes NB_ENABLE_FOREIGN.
Obviously, other build systems do not use nanobind-config.cmake.  By default, any macros you add would not be defined.  Developers would opt-in by defining the new macro.
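A minimal sketch of the opt-in polarity discussed in this thread, assuming the macro ends up being called NB_ENABLE_FOREIGN as suggested above. The names `interop_compiled_in` and `describe_build` are hypothetical and exist only for illustration; they are not part of nanobind.

```cpp
#include <string>

// Hypothetical opt-in switch: interop support is compiled in only when the
// build system defines NB_ENABLE_FOREIGN (for example via -DNB_ENABLE_FOREIGN).
// A build system that knows nothing about the macro gets the "off" branch.
#if defined(NB_ENABLE_FOREIGN)
constexpr bool interop_compiled_in = true;
#else
constexpr bool interop_compiled_in = false;
#endif

// Illustrative helper (not a nanobind function) that reports the build mode.
inline std::string describe_build() {
    return interop_compiled_in ? "interop enabled" : "interop disabled";
}
```

With this polarity, a project that never defines the macro pays nothing, which matches the observation that macros added by nanobind-config.cmake are undefined by default in other build systems.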
Just chiming in with another vote for opt-in. I imagine that most projects don't need to pay the cost (as the bindings will be self-contained), and the ones that do would probably just use it during a transition period and then turn it off again.
The authors of a particular extension module don't generally know when they build it whether anyone will want to use its types from a different framework (or a different ABI version of the same framework). I think this is what pybind11 was referring to in their rationale for adding the _pybind11_conduit_v1_ methods unconditionally -- "to avoid "oh, too late!" situations" (pybind/pybind11#5296). I'm happy to switch the default, but I wonder if we might want to leave this question open until we have a better quantification of the cost? Speaking of which, @wjakob if you still have a copy of the benchmark that you used to obtain the performance comparison numbers in the nanobind docs, I think that might be useful here.
        
          
include/nanobind/stl/unique_ptr.h (Outdated)
          
        
      | // Stash source python object | ||
| src = src_; | ||
|  | ||
| // Don't accept foreign types; they can't relinquish ownership | 
Should this be guarded with an #ifdef to only compile in the case of interop support being enabled?
Minor: in the nanobind codebase, braces are omitted for if statements with a simple 1-line body.
I only put the new #ifdefs in libnanobind, because I wanted to avoid "infecting" every piece of client code with a new flag dependency. One way to avoid the extra inst_check overhead without adding an #ifdef here would be to add a new cast flag that disables use of foreign types; how would you feel about that?
There's a larger question here of - are we requiring an entire nanobind domain to be interop-capable vs not, or are we allowing different extension modules in the same domain to make different choices on that front? I went for the latter since I didn't want a situation where enabling interop for module A would break its previously-working sharing of types with module B.
| const std::type_info *type; | ||
| PyTypeObject *type_py; | ||
| nb_alias_chain *alias_chain; | ||
| void *foreign_bindings; | 
The purpose of this field deserves a comment given that it's unconditionally present (even if interop support is disabled).
In what way is the role of the original alias_chain subsumed?
The interop-disabled flag doesn't change the ABI version string, so we can't conditionally include fields based on its presence. Will add a comment.
The original alias_chain functionality is now served by the types_in_c2p_fast map in nb_internals, so that we can track aliases for both our types and foreign types.
        
          
include/nanobind/nb_class.h (Outdated)
          
        
      | detail::nb_type_set_foreign_defaults(export_all, import_all); | ||
| } | ||
| template <class T = void> | ||
| inline void import_foreign_type(handle type) { | 
These will require documentation. I am not sure why a foreign type would need to be explicitly imported/exported through this API in user code. Isn't this something that the framework will do automatically for us?
Yep, I understand docs are needed and just hadn't gotten to them yet.
The user can decide whether or not to import everything ABI-compatible by default (set_foreign_type_defaults). If they don't, they can import specific types using this function. Even if they do, this function is useful for types from a different language, such as pure-C types that don't have a type_info. The user provides the mapping between type_info and Python type by calling this function, and asserts that they have verified ABI compatibility.
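To make the idea of a user-supplied mapping concrete, here is a toy model in plain C++. It is only a sketch: `ForeignTypeTable`, `import_type`, `find`, `c_point`, and the Python type name used below are all hypothetical, and nanobind stores binding structures rather than strings. It shows a caller explicitly associating a C++ type with an externally bound Python type, which is the role described for import_foreign_type above.

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Toy stand-in for a framework's type table: C++ type -> name of the Python
// type that wraps it. Purely illustrative, not nanobind's data structure.
class ForeignTypeTable {
public:
    // Explicit import: the caller asserts that `python_type_name` wraps a T
    // with a compatible ABI. Returns false if T already has a mapping.
    template <typename T>
    bool import_type(const std::string &python_type_name) {
        return table_.emplace(std::type_index(typeid(T)), python_type_name).second;
    }

    // Lookup as a caster might perform it; nullptr means "nothing registered".
    template <typename T>
    const std::string *find() const {
        auto it = table_.find(std::type_index(typeid(T)));
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::type_index, std::string> table_;
};

// A C-style struct of the kind a pure-C binding framework might expose.
// Such a framework has no std::type_info, so only the C++ side can name it.
struct c_point { double x, y; };
```

A caller would write something like `table.import_type<c_point>("c_ext.Point")` once at module initialization; automatic import cannot cover this case because the other framework has no C++ type identity to offer.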
| return detail::nb_type_lookup(&typeid(detail::intrinsic_t<T>)); | ||
| return detail::nb_type_lookup(&typeid(detail::intrinsic_t<T>), false); | ||
| } | ||
| template <typename T> handle maybe_foreign_type() noexcept { | 
What is the purpose of this function? I don't think it is called anywhere? The alternative would be to add a bool parameter to type().
If we need to have a function, then I would prefer the name type_maybe_foreign.
It's like nb::type() but it can return a foreign type also. I found it useful in client code. I'm indifferent between a bool parameter and separate function, so much so that I seem to have made different choices for two adjacent functions - regardless of which direction we go, we can pick one scheme and use it for both.
| if (!internals->foreign_registry) | ||
| register_with_pymetabind(internals); | ||
| pymb_framework* foreign_self = internals->foreign_self; | ||
| pymb_binding* binding = pymb_get_binding(pytype); | 
So this is a naïve question. Why would we need to look up __pymetabind_binding__? Won't we be notified of new frameworks/bindings using the hooks?
Or is the idea that the metabind feature is enabled lazily, and if we join the party late then that registry is empty to start with? Still then, I am wondering why we can't populate our own tables from the list of types in the registry, without having to touch the __pymetabind_binding__ member.
Via the hooks, each type registered by another framework will be checked using our should_autoimport_foreign function (at the top of this file) and we'll add it to our tables if that returns true. nb_type_import allows manual handling of cases that don't get handled automatically: importing individual types if we aren't importing everything, and importing types where we don't automatically know the C++ equivalent (such as types defined by pure-C frameworks that don't have C++ RTTI -- I figured trying to autogenerate a fake std::type_info for these would be too complex).
We use the pymb_binding capsule because it's the quickest way to get the binding structure when we already have a type object, and the type object is the most obvious way for the user to name a specific binding. We could instead trawl through the registry for a binding whose Python type matches what we were given, but that would require a linear search.
> Or is the idea that the metabind feature is enabled lazily, and if we join the party late then that registry is empty to start with?
There's no issue around when we join; we'll be notified of all existing registrations (using the same hooks that would be called for new registrations) from inside our call to pymb_add_framework.
> Still then, I am wondering why we can't populate our own tables from the list of types in the registry, without having to touch the __pymetabind_binding__ member.
In order to register a type, we need a C++ type_info structure; types bound in other languages don't have those.
| * - 0b10: pymb_binding*: This C++ type is not bound in the current nanobind | ||
| * domain but is bound by a single other framework. | ||
| * | ||
| * - 0b11: nb_foreign_seq*: This C++ type is not bound in the current nanobind | 
Naïve question: what is the purpose of mapping a type to multiple frameworks?
Previously, from-python or to-python conversions might not work out, because a type is not registered at all. With pymetabind, there is now a way out because we can use the other framework to do the conversion for us. If multiple frameworks bind the same type, then this adds complexity. (e.g. the alloca() / complex locking code path in nb_foreign.cpp). I am wondering if we can end up with a simpler solution when scrapping this.
Multiple extension modules can independently bind the same C++ type. For example, maybe one extension module binds T, and two others each have a need for a bound (rather than type-cast) std::vector<T>. If they're all in the same domain, then the second attempt to bind the vector will notice that such a binding already exists, and reuse it. But if they're in different domains, then each domain might already have its own std::vector<T> binding by the time the separate domains become aware of each other. For proper interoperability, a function that takes std::vector<T>& should be able to accept a pyobject that wraps std::vector<T> regardless of which domain it comes from. Supporting that in generality requires allowing multiple bindings for a type.
Note that we only need to consider the possibility of multiple frameworks for from-Python conversions. For to-Python, we expect that the first framework we try will succeed, since we already know the type is registered. So it's possible to imagine an alternative where we store a single binding plus a flag that means "there are multiple bindings for this type, check the capsule when doing from-Python conversions". But this runs into the problem that the capsule could be shadowed by inheritance: you want a Base, but type(arg).__pymetabind_binding__ is a capsule that wraps the binding for Derived. Without a good cross-language way to specify that we want the Base subobject, we can't pass the Derived binding to its framework->from_python() and expect to get a valid Base pointer. Relying on an attribute of the incoming object also would break implicit conversions.
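The two tag values quoted in the diff comment (0b10 for a single pymb_binding*, 0b11 for an nb_foreign_seq*) together with the "try each binding" behavior for from-Python conversions can be sketched as follows. This is an illustrative model, not the PR's code: `Binding`, `BindingSeq`, and `foreign_from_python` are hypothetical stand-ins, the toy `from_python` hook simply compares framework IDs, and the remaining tag values are deliberately left unmodelled because the quoted snippet does not show them.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real pymb_binding and nb_foreign_seq differ.
struct Binding {
    int framework_id;
    // Toy from-Python hook: succeeds only for objects owned by this framework.
    void *from_python(int obj_framework_id, void *payload) const {
        return obj_framework_id == framework_id ? payload : nullptr;
    }
};
struct BindingSeq { std::vector<const Binding *> items; };

// The two low bits are free for tags because the pointees are 4-byte aligned.
static_assert(alignof(Binding) >= 4, "low bits needed for the tag");
static_assert(alignof(BindingSeq) >= 4, "low bits needed for the tag");

constexpr std::uintptr_t kTagMask = 0b11;
constexpr std::uintptr_t kTagSingle = 0b10; // one foreign binding
constexpr std::uintptr_t kTagSeq = 0b11;    // several foreign bindings

inline std::uintptr_t tag_single(const Binding *b) {
    return reinterpret_cast<std::uintptr_t>(b) | kTagSingle;
}
inline std::uintptr_t tag_seq(const BindingSeq *s) {
    return reinterpret_cast<std::uintptr_t>(s) | kTagSeq;
}

// From-Python conversion: ask each foreign binding in turn until one accepts.
inline void *foreign_from_python(std::uintptr_t entry, int obj_framework_id,
                                 void *payload) {
    std::uintptr_t ptr = entry & ~kTagMask;
    switch (entry & kTagMask) {
        case kTagSingle:
            return reinterpret_cast<const Binding *>(ptr)
                ->from_python(obj_framework_id, payload);
        case kTagSeq:
            for (const Binding *b : reinterpret_cast<const BindingSeq *>(ptr)->items)
                if (void *result = b->from_python(obj_framework_id, payload))
                    return result;
            return nullptr;
        default:
            return nullptr; // other tag values are not modelled in this sketch
    }
}
```

Because the lookup is keyed by the requested C++ type rather than by an attribute of the incoming object, the sketch sidesteps the shadowing problem described above: a request for Base never consults the capsule attached to Derived.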
Thank you for the thorough review! I responded to some of the inlines; still working on the others.
 I haven't had a chance to write self-contained tests yet (most of my testing has been in the context of a large production system that uses both binding frameworks, which is useful but not really exportable), but should be able to get to that this week. I was planning to go the route that pybind11 used for the tests of their conduit feature, where the repository test suite contains a separate small extension module that demonstrates/exercises the API. I think there is some value in having both pybind11 and nanobind have self-contained tests that the functionality they advertise works, without either having any dependence on the other. Since pybind11/nanobind interop is the "headline feature", though, it probably also makes sense to have some specific tests of that. Maybe those should even live in the pymetabind repository so that they can be deduplicated and used by both clients. 
 pymetabind doesn't expose the operation of relinquishing ownership from Python to C++ by passing a pyobject to a C++ function that takes a unique_ptr. That is the only missing piece I'm aware of from nanobind's perspective. Now that pybind11 also supports this, it might make sense to allow cross-framework relinquishment; curious for your thoughts there. 
 My plan was for nanobind to suppress its own leak warnings if any other framework sets leak_safe to false. My practical experience was that nanobind would otherwise issue lots of warnings once any nanobind default arguments had pybind11-bound types. Looks like I initially wrote this in terms of  | 
I think I've responded to all your comments, and have updated both this PR and the pymetabind repository with the changes to pymetabind.h. Will continue to work on tests and docs this week.
- Update pymetabind
- Complete nanobind documentation of the new feature
- Change "foreign" to "interop" in some places so that the word "foreign" is more consistently used for the other framework rather than the information exchange between them
- Allow enum types to participate in interop
- Allow nanobind to register implicit conversions from foreign types to nanobind types
Great news: since type objects use deferred reference counting on the FT build, they can only be freed during garbage collection. GC clears weakrefs with all threads stopped, at a time when the referents of the weakrefs are still fully usable. A thread can only be stopped with its cooperation, typically when executing Python code (it uses the same eval-breaker mechanism for handling signals or yielding the GIL on GILful builds). If we create a weakref to a binding's type object, and that weakref is unexpired when we begin to use the binding, we can rely on it remaining unexpired until we call into arbitrary Python or blocking code (anything that could release the GIL if we had one).
So I'm pretty sure it will be possible to remove all the tricky try_ref_binding and alloca stuff, without sacrificing support for multiple bindings per cpptype. This also means that with the changes in this PR to the structure of the fast c2p maps, we should be able to stop immortalizing nanobind types on FT pretty easily.
Note that weakref callbacks run when the world is not stopped, as do tp_finalize and tp_dealloc slots. However, there is a second world-stop after weakref callbacks and tp_finalize slots run and before references are cleared to do the deallocation. (It allows the GC to tell whether anything was resurrected.) So I think all nanobind would need to do in order to drop immortalization (beyond this PR) is unregister types from the metatype's tp_finalize rather than tp_dealloc.
Still working on incorporating this realization into a simplification of pymetabind.
Hi Jason, this sounds great! Just one quick thought about immortalization. The last time I looked into this, it seemed to me that deferred reference counting was a feature that is mainly usable by the Python bytecode interpreter. C++ binding code that increases/decreases reference counts of type objects does not benefit and would still access the global counter. It's possible that this changed in the meantime, or that I am simply confused. The potential pitfall of reference counting contention on a type object is quite severe, and that problem just goes away when making them immortal. That was the rationale for the current design.
 This is absolutely true. I'm pointing out a side effect of the fact that deferred reference counting is used for these types. Since DRC means the reference counts in the object header are not up-to-date, the only way to tell the true number of references is by scanning every thread's bytecode interpreter stack. This can only be done in a consistent way if all threads are stopped. If we can guarantee that we are not able to take a thread-stop in the middle of some operation of interest, which is pretty easy if we're not calling arbitrary Python code, then we can guarantee that any deferred-refcount objects that were fully alive (weakrefs not cleared, finalizers not called) at the beginning of that operation remain safe to access and incref until the end of it. Even if the last reference to it is in fact dropped during our operation, the GC won't be able to prove it without a world-stop. 
 Free-threaded Python actually has a separate refcounting optimization for type objects and code objects to avoid this contention. Each one gets assigned a small-integer ID, unique among all objects of that type that are simultaneously alive, and every thread state carries a vector of refcounts for these objects. When you create a new instance of a type, you grab the type's unique ID (stored inline within the PyTypeObject) and increment the corresponding slot in your own thread state's type-refcount vector - no contention. The types that use this scheme are required to also use deferred reference counting (since the distributed per-thread refcounts carry the same implication where you can't tell for sure how many references exist without stopping the world), and thus obtain the same result where type/code objects can only be deallocated during GC. The only contention on the shared refcount in the PyObject header of a type object comes from direct calls to Py_INCREF/Py_DECREF, which do exist but are avoided by the most common paths to create and deallocate instances. Unfortunately the API to directly perform the optimized incref/decref for type objects is hidden in a private pycore header. (Search for  | 
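The per-thread refcount scheme described in this comment can be modelled roughly as below. This is a toy model written for illustration under stated assumptions, not CPython's implementation (which lives in private headers): `ThreadState`, `incref`, `decref`, and `true_refcount` are hypothetical names, and threads are represented as plain objects rather than real OS threads.

```cpp
#include <cstddef>
#include <vector>

// Toy model: each thread keeps its own vector of reference-count deltas,
// indexed by the small integer ID assigned to a type object.
struct ThreadState {
    std::vector<long> type_refs; // slot i = this thread's net refs to type i

    void incref(std::size_t type_id) {
        ensure_slot(type_id);
        ++type_refs[type_id]; // touches only this thread's memory: no contention
    }
    void decref(std::size_t type_id) {
        ensure_slot(type_id);
        --type_refs[type_id]; // in this model a slot may go negative; only the sum matters
    }

private:
    void ensure_slot(std::size_t type_id) {
        if (type_id >= type_refs.size())
            type_refs.resize(type_id + 1, 0);
    }
};

// The true count is the shared count in the object header plus every thread's
// local delta. It is only meaningful while all threads are stopped, which is
// why such objects can only be freed during garbage collection.
inline long true_refcount(long shared_count,
                          const std::vector<ThreadState> &stopped_threads,
                          std::size_t type_id) {
    long total = shared_count;
    for (const ThreadState &ts : stopped_threads)
        if (type_id < ts.type_refs.size())
            total += ts.type_refs[type_id];
    return total;
}
```

The model makes the key property visible: no single thread can observe the count reaching zero, so an object that was alive at the start of an operation stays safe to use until something can stop the world.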
942e372 to 9291df9
  
72027d5 to b7e8133
  
    …can prevent mutual recursion between two frameworks failing to perform a cast. Simplify enum destruction. Clean up some things I noticed while updating the pybind11 PR.
f3fb9bf to fc33860
  
I would find it useful to get your thoughts on some of the design questions raised in the threads here, so that we can aim for harmonious semantics between pybind11 and nanobind, but no need to take a closer look at the code until the pybind11 review finishes. I'm scheduled to talk to Ralf about it this weekend. Specific places where input would be useful, if you have opinions about any of them:
@wjakob To offer a status update here, the pybind11 version of this PR has gone through some fairly extensive offline review and I will be making the relatively minor changes requested by that review in the coming week or two. Once that's reached a state of ready-to-merge-in-principle, I'll come back here and work on this one to get to a state that you're happy with.
I am currently planning to make the feature in nanobind be gated behind a build flag, but enabled by default at runtime if building in that mode. pybind11 is likely to take the feature as default-on due to the different tradeoffs they take around build size/cost vs "it just works".
There is an open question of whether CPython will accept responsibility for managing the bindings registry, rather than needing each project that uses it to carry a copy of pymetabind.h. I will consult with some CPython core devs that have expressed interest in pymetabind to see if this seems plausible to them. If it does, there will likely be a PEP-sized delay of a few months between this PR being ready-in-principle and it being able to be included in a nanobind release. (The PEP process would likely result in API changes to the functions and structures currently exported by pymetabind; pymetabind would want to accommodate those so it can be used as a backport library with low friction; it can't really do that if it's already committed to an ABI that's used in released code.)
See also pybind/pybind11#5800, the same feature for pybind11.
pymetabind is a proposed standard for Python <-> C-ish binding frameworks to be able to find and work with each other's types. For example, assuming versions of both nanobind and pybind11 that have adopted this standard, it would allow a nanobind-bound function to accept a parameter whose type is bound using pybind11, or to return such a type, or vice versa. Interoperability between different ABI versions or different domains of the same framework is supported under the same terms as interoperability between different frameworks. Compared to pybind11's _pybind11_conduit_v1_ API, this one also supports implicit conversions and to-Python conversions, and should have significantly less overhead.
The essence of this technique has been in use in production by my employer for a couple of years now to enable a large amount of pybind11 binding code to be ported to nanobind one compilation unit at a time. Almost everything that works natively works across framework boundaries too, at only a minor performance cost. Inheritance relationships and relinquishment (from-Python conversion of unique_ptr<T>) don't work cross-framework, but those are the only limitations I'm aware of.
This PR adds nanobind support for exposing nanobind types to pymetabind for other frameworks to use ("exporting") and using other frameworks' types that they have exposed to pymetabind ("importing"). Types bound by a different framework than the extension module's own nanobind domain are called "foreign". There are some internals changes to allow foreign types to be represented in the same type maps as native nanobind types; this also includes an updated version of the per-thread fast c2p map that allows safe removal of types (since we can make our own types immortal but we can't force everyone else to make their types immortal). It is possible to compile nanobind without the code to support interop, using the new cmake option NO_INTEROP.
Current status: nominally code complete and existing tests pass, but I haven't added interop-specific tests or public-facing docs yet.
Performance: I have not yet measured the performance impact of this change, but I expect it to be quite low in situations where the foreign bindings don't need to be used. The new type_c2p_fast caches negative lookups, and we note whether any foreign bindings exist for a C++ type at the same time as we look up the nanobind type for it. If any foreign bindings have been imported, we do need to look up in type_c2p_fast before failing in some cases where we previously could avoid a lookup completely. When the foreign bindings do need to be used to perform a cast, they require a second c2p_fast lookup and some likely-modest indirection overhead.
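The effect of caching negative lookups can be illustrated with a small two-level map. This is a sketch under simplifying assumptions, not nanobind's type_c2p_fast: `TwoLevelTypeMap`, `TypeEntry`, and `slow_lookups` are hypothetical names, and the keys are strings rather than std::type_info pointers held in a per-thread table.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

struct TypeEntry { std::string py_name; };

// Toy two-level lookup: a fast cache in front of a slow authoritative map.
// The cache also remembers misses, so repeated failures skip the slow map.
class TwoLevelTypeMap {
public:
    void register_type(const std::string &cpp_name, TypeEntry entry) {
        slow_[cpp_name] = std::move(entry);
        fast_.erase(cpp_name); // drop any stale (possibly negative) cache entry
    }

    const TypeEntry *lookup(const std::string &cpp_name) {
        auto cached = fast_.find(cpp_name);
        if (cached != fast_.end())
            return cached->second; // hit, including a cached "not found" (nullptr)

        ++slow_lookups; // counts how often the expensive path is taken
        auto it = slow_.find(cpp_name);
        const TypeEntry *result = it == slow_.end() ? nullptr : &it->second;
        fast_.emplace(cpp_name, result);
        return result;
    }

    int slow_lookups = 0;

private:
    std::unordered_map<std::string, TypeEntry> slow_;
    std::unordered_map<std::string, const TypeEntry *> fast_;
};
```

The important detail is the invalidation in `register_type`: a cached miss must be dropped when a binding for that type appears later, which is the same reason a cache that supports removal of foreign types needs more care than one that only ever grows.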
Memory cost: Exporting a type allocates a 56-byte structure, a capsule object to wrap it, and adds that capsule object to the type's dictionary. Importing a type adds a new entry to the type_c2p_slow map.
Code size: With NO_INTEROP, size libnanobind.a adds up to 8533 bytes smaller than baseline on my machine (an arm64 mac), probably due to reusing nb_ptr_map for the type_c2p_fast map. Without NO_INTEROP, libnanobind.a is 8983 bytes larger than baseline.
Things that need to happen before this can be released:
[x] add user-facing documentation
[x] add unit tests
[ ] test correctness of nanobind/pybind11 interop
[ ] test performance
[ ] solicit feedback from maintainers of other binding libraries
[ ] release pymetabind v1.0, incorporating said feedback