Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overriding base class constructor when subclassing builtins #4443

Open
ChayimFriedman2 opened this issue Aug 15, 2024 · 12 comments
Open

Overriding base class constructor when subclassing builtins #4443

ChayimFriedman2 opened this issue Aug 15, 2024 · 12 comments

Comments

@ChayimFriedman2
Copy link
Contributor

It appears there is no way to override the arguments to the base class constructor when subclassing builtin types. It will always be passed the same arguments as #[new].

For example:

use pyo3::prelude::*;
use pyo3::types::PyDict;

#[pyclass(extends=PyDict)]
pub struct PyDefaultDict {
    x: i32,
}

#[pymethods]
impl PyDefaultDict {
    #[new]
    #[pyo3(signature = (x, *_args, **_kwargs))]
    fn new(
        x: i32,
        _args: &Bound<'_, PyAny>,
        _kwargs: Option<&Bound<'_, PyAny>>,
    ) -> Self {
        Self { x }
    }

    fn get_x(&self) -> i32 {
        self.x
    }
}

In Python:

from my_module import PyDefaultDict
PyDefaultDict(42)

# Errors with:
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: 'int' object is not iterable

PyClassInitializer doesn't allow that either.

Originally asked on Stack Overflow: https://stackoverflow.com/q/78873598/7884305.

I'm willing to work on this, if I get input on the API.

I envision an API similar to:

impl<S: PyClass> PyClassInitializer<S> {
    fn from_native<B>(subclass: S, native: Py<B>) -> Self
    where
        S: PyClass<BaseType = B>,
        B: PyClassBaseType<Initializer = PyNativeTypeInitializer<B>>,
}

I don't know if it is actually possible (I don't know enough the Python C API).

@davidhewitt
Copy link
Member

davidhewitt commented Aug 16, 2024

Thanks for the question & offer to help. You are exactly right, and there's even a (years-old) FIXME in our code to make this happen:

// FIXME: Call __new__ with actual arguments

I think personally I'd like to see a syntax which is as close to Python as possible.

Taking the Python equivalent would be

    def __new__(cls, x, *args, **kwargs):
        return super().__new__(cls, *args, **kwargs)

My thought is, how close could we get to the Python syntax? I haven't ever loved the PyClassInitializer name, I find it quite long (and not particularly well documented). I guess something like the following might be nice:

#[pymethods]
impl PyDefaultDict {
    #[pyo3(name = "__new__", signature = (x, *args, **kwargs)]
    fn new<'py>(
        py: Python<'py>
        cls,
        x: i32,
        args: &Bound<'_, PyAny>,
        kwargs: Option<&Bound<'_, PyAny>>,
    ) -> Bound<'py, Self> {
        super::<Self>()
            .new_args(args, kwargs)
            .build(Self)
    }
}

@davidhewitt
Copy link
Member

(Sorry fat finger error in closing!)

@davidhewitt
Copy link
Member

        super::<Self>()
            .new_args(args, kwargs)
            .build(Self)

This is probably not quite the right API, but the main point to make is that what I think we'll have to do is feed the args and kwargs which will be passed into super's __new__ into some kind of builder-like form like this. The proposal you make with from_native taking e.g. Py<Dict> isn't possible because we need to ask dict.__new__ to create a PyDefaultDict instance, and we need to fill in any Rust state immediately for things to be sound.

@DavidAntliff
Copy link

@davidhewitt aside, (I was the original question-asker on SO), while this is looked into, are you aware of any workaround for wrapping a base class with extra __init__() parameters? I'm looking to write the equivalent of defaultdict (although with a default int rather than a lambda, for my simpler use-case).

@davidhewitt
Copy link
Member

I think this is a case where PyO3 needs improving; our current #[new] does nothing to handle __init__ . I think probably the simplest thing to do is for us to support __init__?

@ChayimFriedman2
Copy link
Contributor Author

Then what about:

// The name `PyNativeInitializer` is already occupied...
pub struct PyNativeArgsInitializer { ... }
impl PyNativeArgsInitializer {
    pub fn new0() -> Self { ... }
    pub fn new1(args: impl IntoPy<Py<PyTuple>>) { ... }
    pub fn new(args: impl IntoPy<Py<PyTuple>>, kwargs: Option<Py<PyDict>>) { ... }
}

pub struct PyNativeNonInitializedMarker<T> { ... }
impl<T: PyClass> PyClass for PyNativeNonInitializedMarker<T> {
    type BaseClass = T::BaseClass;
}

pub struct PyNativeMarker<T> { ... }

impl<Native> PyClassInitializer<PyNativeMarker<PyNativeTypeInitializer<Native>>> {
    pub fn from_native(init: PyNativeArgsInitializer) -> Self { ... }
}

I had to complicate the API a bit to make it working with the existing PyClassInitializer, but it works nice.

Usage is the same as with existing PyClassInitializer:

PyClassInitializer::from_native(PyNativeArgsInitializer::new0()).add_subclass(MyClass {}).add_subclass(MySecondClass {})

Then we can remove the existing infrastructure that allows you to initialize a native-derived class without from_native() in a major version.

@davidhewitt
Copy link
Member

davidhewitt commented Aug 18, 2024

Interesting idea. Those names remind me a bit of #4413, however in this case I really like the suggestion of using new0, new1 etc. This will directly map to eventually calling the super type __new__, so the name seems short and appropriate here. We also don't (for the foreseeable future) have the concern of possible vectorcall optimisations.

So overall I'm 👍 on .new(), .new1() and .new0() here to solve the missing API.

@ChayimFriedman2
Copy link
Contributor Author

Why is vectorcall a concern? Can't it work with this API?

@ChayimFriedman2
Copy link
Contributor Author

ChayimFriedman2 commented Aug 18, 2024

Am I allowed to already include the breaking change (for the next major version) in my PR? This will simplify it considerably.

@davidhewitt
Copy link
Member

I think best to open the pr assuming the breaking change and let's then work out what's best during the review process.

I assume vectorcall is a non issue because we're already inside a tp_call handler, which doesn't use vectorcall.

@ChayimFriedman2
Copy link
Contributor Author

Hmm... Even after I allow subclasses to configure their superclasses parameters, this issue still isn't fixed. This is because __init__() is not called by PyO3, so it gets called automatically by Python with the same arguments. Now I have quite a few options:

  1. I can just pass the same parameter I passed to __new__() to __init__(), in the hope they will work.
  2. I can allow the user, in their __new__() procedure, to decide what arguments the native superclass will get (that is, change PyNativeTypeInitializer to also include __init__() parameters).
    1. I can allow them to configure both __init__() and __new__(), or only __init__(), but it seems silly not to support __new__().
  3. I can allow the user override __init__() separately, and call from there to their base class' __init__() however they wish.

@davidhewitt, what do you think is the best option?

@ChayimFriedman2
Copy link
Contributor Author

Also, I realized that very much the same machinery (or very similar) we use to support subclassing builtins, we can use in the future to support subclassing arbitrary Python classes, which sounds very cool.

The design will need to be fleshed out (for example: how do we declare the Python class), but I think it can have great potential.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants