[Design] How to handle casting to Python types in the Python library? #26984

Open
lydia-duncan opened this issue Mar 25, 2025 · 2 comments

Comments

@lydia-duncan
Member

lydia-duncan commented Mar 25, 2025

Casting from a Python type to a Chapel type seems relatively straightforward so far. Casting in the other direction is more difficult (e.g. myChapelSet: owned PySet, 1: owned Value), because the initializers for these types currently rely on a Python interpreter.

Limitations that make this more challenging:

  • Having more than one Python interpreter per locale will result in a segfault
    • This is a problem if we try to create one to use in the cast and the user already has their own
  • Casts are limited in the arguments they take
    • The Chapel instance to cast won't have a Python interpreter associated with it
    • The Python type to cast to is a type, not an instance, so won't have a Python interpreter instance associated with it
    • No other arguments can be specified

The main solution I can think of is for the Python library itself to create and track the interpreter for each locale. That would let our cast operator find the appropriate interpreter for the current locale without having to create a new one each time. This does bring up some other questions (courtesy of Jade):

  • Do we just always have the interpreter running?
  • When does it start?
  • When does it end?
  • Can the user control that?
  • Should the user control that?
  • If the user can control when it starts/ends, can they shoot themselves in the foot by doing t: PyTuple when the interpreter isn't running?
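To make the proposal and the questions above concrete, here is a minimal sketch of a library-owned, per-locale interpreter registry. It is written in Python purely as a language-neutral model and is not the Chapel Python library's API: InterpreterRegistry, FakeInterpreter, InterpreterNotRunning, and cast_to_py_tuple are all hypothetical names, and the "interpreter" is a stand-in object. It assumes the user controls start and stop, and that a cast made while no interpreter is running should fail with a clear error rather than segfault.

```python
import threading


class InterpreterNotRunning(RuntimeError):
    """Raised when a cast is attempted with no interpreter on this locale."""


class FakeInterpreter:
    """Hypothetical stand-in for a heavyweight embedded interpreter."""

    def __init__(self, locale_id):
        self.locale_id = locale_id
        self.running = True

    def stop(self):
        self.running = False


class InterpreterRegistry:
    """Library-owned table holding at most one interpreter per locale."""

    def __init__(self):
        self._by_locale = {}
        self._lock = threading.Lock()

    def start(self, locale_id):
        # Idempotent: a second start on the same locale reuses the first
        # interpreter instead of creating another one.
        with self._lock:
            interp = self._by_locale.get(locale_id)
            if interp is None:
                interp = FakeInterpreter(locale_id)
                self._by_locale[locale_id] = interp
            return interp

    def stop(self, locale_id):
        with self._lock:
            interp = self._by_locale.pop(locale_id, None)
        if interp is not None:
            interp.stop()

    def current(self, locale_id):
        with self._lock:
            interp = self._by_locale.get(locale_id)
        if interp is None:
            raise InterpreterNotRunning(
                f"no interpreter is running on locale {locale_id}")
        return interp


REGISTRY = InterpreterRegistry()


def cast_to_py_tuple(values, locale_id=0):
    """Model of a cast: it takes no interpreter argument and looks one up."""
    interp = REGISTRY.current(locale_id)
    return ("PyTuple", interp.locale_id, tuple(values))
```

In this model a cast only needs to know which locale it is on, which matches the constraint that casts cannot take extra arguments. The cost is that the foot-gun from the last question becomes a runtime error instead of something the type system rules out.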

I suspect having the Python library create and track the interpreter for each locale will also be necessary once multiple libraries rely on the Python library, since each of them would otherwise have no way of knowing whether another one has already created an interpreter for the current locale. But that's another issue.

For now, I'm going to focus on casts from the other direction.

Edit: note that "Python type" in this situation is referring to the Chapel representations of Python types that live in the Python library. E.g., PySet, PyList, Value, etc.

@jabraham17
Member

To add to this, the reason all Python objects originally took an interpreter was to prevent users from shooting themselves in the foot. If new Value(...) doesn't take an interpreter argument, there is nothing to stop a user from trying to use the Python API without creating an interpreter, which will segfault. Python interpreters inherently rely on opaque global state; there is no handle to the interpreter that we have to hold on to. But requiring a reference to the interpreter object forces users to use the API correctly.

However, over time the interpreter object has grown to include other state, so it's not just about that API nicety anymore. Of course, all of this could live in some global per-locale variable (as Lydia mentioned). But that has its own issues, as she outlined already.

I think it's important for users to be able to control when their interpreter starts and stops, because interpreters are heavyweight objects with a non-trivial startup cost.

One crazy thought I had: what if we could encode the interpreter as part of the type of a given Python Value, almost like a runtime type? That has its own set of problems, but it would be really nice to be able to write myChplVal: owned Value(interpreter).
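As a rough illustration of the two designs discussed in this comment, the sketch below contrasts an explicit interpreter argument with a type that has the interpreter baked in. It is a Python model only, not Chapel and not the library's API: Interpreter, Value, and value_type are hypothetical names. One caveat is that Python can only check the handle at run time, whereas a typed API would reject a missing interpreter before the program runs.

```python
class Interpreter:
    """Hypothetical stand-in for the library's interpreter object."""

    def __init__(self, name):
        self.name = name


class Value:
    """Explicit-handle design: every construction names its interpreter."""

    def __init__(self, interpreter, payload):
        if not isinstance(interpreter, Interpreter):
            raise TypeError("Value requires an Interpreter")
        self.interpreter = interpreter
        self.payload = payload


def value_type(interpreter):
    """Return a Value subclass that carries the interpreter in the type."""

    class BoundValue(Value):
        def __init__(self, payload):
            # The interpreter comes from the type, so a conversion to this
            # type needs only the payload.
            super().__init__(interpreter, payload)

    BoundValue.__name__ = f"Value[{interpreter.name}]"
    return BoundValue


interp = Interpreter("locale0")
explicit = Value(interp, 1)   # handle passed on every construction
Bound = value_type(interp)    # handle captured once, in the type
converted = Bound(1)          # models something like 1: owned Value(interpreter)
```

The bound type keeps the guarantee that a value cannot exist without an interpreter, while leaving the target of a cast as a plain type, which is the property the cast operator needs.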

@mstrout
Collaborator

mstrout commented Mar 25, 2025

How does Python multiprocessing handle multiple interpreters? Could you use Python multiprocessing? The Dragon folks here at HPE have implemented Python multiprocessing so it can operate not only between nodes but also in a federated fashion. They might have some ideas. Pete Mendygral is who you might want to ask.
