beicause
Contributor

Don't call self.to_gd in base_mut to make it work in init and predelete.

BaseMut has no runtime inaccessible guard during initialization because the instance binding is not set yet, but the Rust compiler's borrow checking for base and base_mut is still enforced.
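
To illustrate what "borrow checking is still enforced" means independently of any runtime guard, here is a minimal plain-Rust sketch. `BaseObject`, `MyMesh` and the method bodies are hypothetical stand-ins for illustration, not the godot-rust implementation; the only point is that an accessor taking `&mut self` is checked by the compiler even when no runtime guard exists.

```rust
// Hypothetical stand-ins for illustration; not the godot-rust types.
struct BaseObject {
    surfaces: Vec<&'static str>,
}

struct MyMesh {
    base: BaseObject,
}

impl MyMesh {
    // Shared access: borrows `self` immutably.
    fn base(&self) -> &BaseObject {
        &self.base
    }

    // Exclusive access: borrows `self` mutably, so the compiler rejects any
    // overlapping `base()` call while the returned reference is alive.
    fn base_mut(&mut self) -> &mut BaseObject {
        &mut self.base
    }

    // Mimics an `init` that touches the base before returning `Self`.
    fn init() -> Self {
        let mut this = MyMesh { base: BaseObject { surfaces: Vec::new() } };
        this.base_mut().surfaces.push("triangles");
        // Holding a `base()` reference across a `base_mut()` call here
        // would be a compile error, with or without a runtime guard.
        this
    }
}

// Builds an instance and reads back how many surfaces `init` added.
fn surface_count() -> usize {
    MyMesh::init().base().surfaces.len()
}
```

Calling `surface_count()` builds the object through `init` and reads the base through the shared accessor afterwards.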

@GodotRust

API docs are being generated and will be shortly available at: https://godot-rust.github.io/docs/gdext/pr-1324

@Bromeon Bromeon added quality-of-life No new functionality, but improves ergonomics/internals c: ffi Low-level components and interaction with GDExtension API labels Sep 20, 2025
@beicause beicause force-pushed the base_mut-init-predelete branch from b28af45 to 2683a6d Compare September 21, 2025 02:22
Member

@Bromeon Bromeon left a comment

Thanks a lot, nice work 👍

I am still worried a bit about two competing mechanisms to access the base object during initialization. While it does solve your problem, how do we explain this from a UX perspective?

If I change your code to the following, it still works:

#[godot_api]
impl IArrayMesh for InitArrayMeshTest {
    fn init(base: Base<ArrayMesh>) -> Self {
        base.to_init_gd()
            .add_surface_from_arrays(mesh::PrimitiveType::TRIANGLES, &make_mesh_arrays());

        Self { base }
    }
}

So yes, we can save a refcount here, but at the cost of multiple ways to do the same thing.

When should someone use to_init_gd(), and when base_mut() + base()?

What about WithBaseField::to_gd()? Should we consider making that behave like current to_init_gd() (not sure if possible)? I wonder if there are scenarios where someone needs to access the base object before inserting it into the Self struct... 🤔

@beicause
Contributor Author

I don't really like the to_init_gd hack, which is slower and prevents a RefCounted from being released immediately.

This PR is closer to how this is accessed in godot-cpp, and allows accessing base_mut in predelete. IMO we should prefer this and deprecate/remove to_init_gd.

@Bromeon
Member

Bromeon commented Sep 21, 2025

But what about cases where Gd needs to be copied?

Silly example:

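struct EntityDatabase {}
impl EntityDatabase {
    // Takes a strong reference, so the caller needs a real Gd to pass in.
    fn register(_obj: Gd<RefCounted>) { /* store the object somewhere */ }
}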

#[godot_api]
impl IRefCounted for MyEntity {
    fn init(base: Base<RefCounted>) -> Self {
        EntityDatabase::register(base.to_init_gd());
        Self { base }
    }
}

@beicause beicause force-pushed the base_mut-init-predelete branch from 2683a6d to 87a9b2f Compare September 21, 2025 09:32
@beicause
Contributor Author

But what about cases where Gd needs to be copied?

Silly example:

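struct EntityDatabase {}
impl EntityDatabase {
    // Takes a strong reference, so the caller needs a real Gd to pass in.
    fn register(_obj: Gd<RefCounted>) { /* store the object somewhere */ }
}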

#[godot_api]
impl IRefCounted for MyEntity {
    fn init(base: Base<RefCounted>) -> Self {
        EntityDatabase::register(base.to_init_gd());
        Self { base }
    }
}

I think this can store instance id. Even if we use to_init_gd, there is still no guarantee that Gd is alive. Object can be freed, and RefCounted can also be memdeleted by Godot.

@Bromeon
Member

Bromeon commented Sep 21, 2025

Even if we use to_init_gd, there is still no guarantee that Gd is alive. Object can be freed, and RefCounted can also be memdeleted by Godot.

This is the case for every Gd in existence, at any point in time. It's also well-defined and documented, resulting in a panic.

I don't see how this is an argument against the scenario here.


I think this can store instance id.

That would downgrade type-safety. Imagine the registry doesn't have RefCounted but MyEntity -- now we need a fallible runtime conversion, whereas Gd<MyEntity> could encode this type directly.
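
The type-safety point can be sketched in plain Rust without Godot. In the following illustration, `IdRegistry`, `TypedRegistry` and `MyEntity` are hypothetical names, and `Rc<dyn Any>` merely stands in for "some object behind an ID"; it is an analogy, not how godot-rust stores objects.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::rc::Rc;

struct MyEntity {
    name: String,
}

// Registry keyed by an untyped ID: every lookup needs a runtime downcast.
struct IdRegistry {
    objects: HashMap<u64, Rc<dyn Any>>,
}

impl IdRegistry {
    // Fails if the ID is unknown or refers to an object of another type.
    fn get_entity(&self, id: u64) -> Option<Rc<MyEntity>> {
        self.objects.get(&id)?.clone().downcast::<MyEntity>().ok()
    }
}

// Registry storing typed handles: the element type is known at compile time.
struct TypedRegistry {
    entities: Vec<Rc<MyEntity>>,
}

fn lookup_demo() -> (Option<String>, Option<String>, String) {
    let entity = Rc::new(MyEntity { name: "player".to_string() });

    let mut by_id = IdRegistry { objects: HashMap::new() };
    by_id.objects.insert(1, entity.clone());
    by_id.objects.insert(2, Rc::new(42_i32)); // wrong type behind ID 2

    let typed = TypedRegistry { entities: vec![entity] };

    let found = by_id.get_entity(1).map(|e| e.name.clone());
    let wrong_type = by_id.get_entity(2).map(|e| e.name.clone());
    let direct = typed.entities[0].name.clone();
    (found, wrong_type, direct)
}
```

The ID-based lookup has to return an `Option` and handle the mismatch at runtime, while the typed registry needs no conversion at all.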

Also, it's broken:

fn init(base: Base<ArrayMesh>) -> Self {
    let mut sf = Self { base };
    sf.base_mut()
        .add_surface_from_arrays(mesh::PrimitiveType::TRIANGLES, &make_mesh_arrays());

    let id = sf.base().instance_id();
    let _clone: Gd<ArrayMesh> = Gd::from_instance_id(id); // <--

    sf
}

The above code causes a double-panic and aborts the process.
The cause is debug_assert!(base.obj.is_instance_valid()); failing in from_base().

Notably, the same code works with to_init_gd(). What you call a hack is at least internally consistent 😉

fn init(base: Base<ArrayMesh>) -> Self {
    let id = base.to_init_gd().instance_id();
    let _clone: Gd<ArrayMesh> = Gd::from_instance_id(id);

    let mut sf = Self { base };
    sf.base_mut()
        .add_surface_from_arrays(mesh::PrimitiveType::TRIANGLES, &make_mesh_arrays());

    sf
}

@Bromeon
Member

Bromeon commented Sep 21, 2025

The above problem is something I have run into constantly during #1273.

The core conflict is: as soon as you hand out an object to the user, you need to assume they use it the same way objects generally work. If random operations (like instance ID, cloning, etc.) break down, that is unintuitive and will cause frustration. I'm hesitant to sacrifice soundness and logical consistency in the name of performance, especially in safe APIs.

Some things I considered:

  1. hand out a Gd with a bool flag "limited", which panics on certain operations.
    • takes extra space in Gd and punishes general performance
  2. custom wrapper type, you tried this as well in Add WeakGd #1280
    • the problem here is that it proliferates pointer types for this narrow edge case
  3. hand out only a reference to the object, i.e. &RefCounted and not Gd<RefCounted>
    • this looked most promising initially, as Godot's get_instance_id is deleted
    • however, it's trivial to work around with call("get_instance_id", &[])
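
As a rough illustration of option 1 above, the sketch below compares two hypothetical handle layouts. The field names are assumptions made up for this example and do not reflect the real fields of Gd; the comment above talks about panicking, whereas this sketch returns a `Result` so the check is easy to observe.

```rust
use std::mem::size_of;

// Hypothetical handle layouts for illustration; not Gd's real fields.
#[allow(dead_code)]
struct PlainHandle {
    raw_ptr: usize,
    instance_id: u64,
}

struct LimitedHandle {
    raw_ptr: usize,
    instance_id: u64,
    limited: bool, // set while the object is still being initialized
}

impl LimitedHandle {
    // Every operation that could touch the refcount has to check the flag.
    fn try_clone(&self) -> Result<LimitedHandle, &'static str> {
        if self.limited {
            return Err("cannot clone a limited handle during init");
        }
        Ok(LimitedHandle {
            raw_ptr: self.raw_ptr,
            instance_id: self.instance_id,
            limited: false,
        })
    }
}

// Returns (size without flag, size with flag) in bytes.
fn handle_sizes() -> (usize, usize) {
    (size_of::<PlainHandle>(), size_of::<LimitedHandle>())
}
```

The extra `bool` makes every handle larger (after padding) and adds a branch to each guarded operation, which is the cost the bullet under option 1 refers to.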

Ultimately we have two choices: (a) make things safe to use, at slight cost. (b) optimize performance but sacrifice correctness.

I chose (a), also because this cost is only relevant for RefCounted types, not for Node.

If you find that (b) is really worth it, we can consider a special API for it -- but I don't see an easy way to make base_mut() & Co. just work with base-init access, without opening the door for these edge cases. And this is a risk, because the user is used to them working a certain way. If we instead add a new, dedicated API, maybe with a good name, we can make clear that this has certain limitations and needs careful usage.

But for me to understand the background: You mentioned multiple times that godot-cpp allows this to be faster, but do you actually need to avoid this 1 refcount increment so much, that it's worth all this effort? Is it the bottleneck? Don't other parts like Godot/GDExtension object initialization overhead take up a much larger amount of time?

@beicause
Contributor Author

Godot doesn't allow incrementing the refcount in init; otherwise the object will be freed before it is actually referenced. So I think we should never create a strong Gd pointer during init.


I think the overhead of cloning a RefCounted is trivial, but it also pushes a callable to the message queue.
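
The premature-free concern can be sketched with a toy counter. This is a deliberately simplified model: it assumes, for illustration only, that the object has no owning reference yet while init runs. `ToyObject` and its methods are hypothetical, and Godot's real refcount bookkeeping differs in detail.

```rust
use std::cell::Cell;

// Toy refcounted object; `freed` records that destruction was triggered.
struct ToyObject {
    refcount: Cell<u32>,
    freed: Cell<bool>,
}

impl ToyObject {
    fn reference(&self) {
        self.refcount.set(self.refcount.get() + 1);
    }

    fn unreference(&self) {
        let remaining = self.refcount.get() - 1;
        self.refcount.set(remaining);
        if remaining == 0 {
            self.freed.set(true); // last strong reference is gone
        }
    }
}

// Simulates `init` creating and dropping a temporary strong reference
// before the eventual owner has taken its own reference.
fn temp_ref_during_init() -> bool {
    let obj = ToyObject { refcount: Cell::new(0), freed: Cell::new(false) };
    obj.reference();   // temporary strong reference: 0 -> 1
    obj.unreference(); // dropped again: 1 -> 0, destruction is triggered
    obj.freed.get()
}
```

In this model the temporary reference alone is enough to trigger destruction, before any owner exists.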

@beicause
Contributor Author

I think we've fallen into the discussion about RefCounted initialization again, but in my opinion, using base_mut in init is safer and doesn't involve a Gd pointer.

@Bromeon
Member

Bromeon commented Sep 21, 2025

I think we've fallen into the discussion about RefCounted initialization again, but in my opinion, using base_mut in init is safer and doesn't involve a Gd pointer.

I demonstrated a problem that we have with base() + base_mut(), but not with to_init_gd(). To me, the main benefit of this approach would be performance (at the cost of flexibility/correctness), so what do you mean when you say it's safer?

Or do you mean it allows using the base/base_mut pattern in more places? (Which I agree is nice, the problem is just that it works "only so far" and then suddenly breaks in non-intuitive ways).

What would you think of a solution based on current to_init_gd(), which would immediately decrement the refcount after init whenever it is constructed from Rust -- but keep the deferred workaround for construction from GDScript MyClass.new() calls?
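
For readers following along, the general idea of "extra reference during init, released later" can be modelled in plain Rust as below. This is a toy sketch under stated assumptions, not the actual to_init_gd() implementation: `ToyObject`, `Deferred` and `init_with_extra_ref` are hypothetical, the `Vec` of closures stands in for a message queue, and `Rc` is only used to share the toy object with the queued closure.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Toy object with its own manual refcount, independent of `Rc`.
struct ToyObject {
    refcount: Cell<u32>,
    freed: Cell<bool>,
}

impl ToyObject {
    fn reference(&self) {
        self.refcount.set(self.refcount.get() + 1);
    }

    fn unreference(&self) {
        let remaining = self.refcount.get() - 1;
        self.refcount.set(remaining);
        if remaining == 0 {
            self.freed.set(true);
        }
    }
}

// Deferred work, standing in for a message queue.
type Deferred = Vec<Box<dyn FnOnce()>>;

fn init_with_extra_ref(obj: &Rc<ToyObject>, queue: &mut Deferred) {
    obj.reference(); // extra reference keeps the object alive during init

    // User code may now create and drop temporary strong references.
    obj.reference();
    obj.unreference();

    // The extra reference is only released later, from the queue.
    let later = Rc::clone(obj);
    queue.push(Box::new(move || later.unreference()));
}

// Returns whether the object was freed after init, after the queue flush,
// and after the owner released its reference.
fn deferred_demo() -> (bool, bool, bool) {
    let obj = Rc::new(ToyObject { refcount: Cell::new(0), freed: Cell::new(false) });
    let mut queue: Deferred = Vec::new();

    init_with_extra_ref(&obj, &mut queue);
    let after_init = obj.freed.get();

    obj.reference(); // the eventual owner takes its reference
    for task in queue.drain(..) {
        task(); // deferred release of the extra reference
    }
    let after_flush = obj.freed.get();

    obj.unreference(); // the owner lets go
    (after_init, after_flush, obj.freed.get())
}
```

In this model the object survives init and the queue flush and is only freed once the owner releases it. Releasing the extra reference immediately after init would only be safe if the owner's reference is already in place by then, which is the open question in the proposal.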

@beicause
Contributor Author

In the demonstration users can statically hold a strong ref to achieve similar things like to_init_gd, but it just hides the fundamental problem that self.to_gd and Gd::from_instance_id are unsound in init. And the same applies in predelete and drop, where to_init_gd is unsound as well. This may be resolved if we prohibit creating strong refs during these, or provide clearer error messages. But base_mut does not have this issue.

What would you think of a solution based on current to_init_gd(), which would immediately decrement the refcount after init whenever it is constructed from Rust

I suspect this might also cause the object to be released after the reference count is decremented.

@Bromeon
Member

Bromeon commented Sep 21, 2025

And the same applies in predelete and drop, where to_init_gd is unsound as well.

Fully agree, I haven't gotten around to investigating that yet. My hope is that there's some sort of approach here, too, but I may be too optimistic 😇 That said, @Yarwin will bring this issue up with the GDExtension team and see if there are better ways to deal with RefCounted initialization/destruction.


In the demonstration users can statically hold a strong ref to achieve similar things like to_init_gd, but it just hides the fundamental problem that self.to_gd and Gd::from_instance_id are unsound in init.

But they're only unsound if there is no prior refcount increment, which is what to_init_gd() does. Of course I can't guarantee that the implementation is bug-free, but at least so far, I couldn't find any case causing UB or crashes.


This may be resolved if we prohibit creating strong refs during these, or provide clearer error messages. But base_mut does not have this issue.

The challenge is that it's very hard to identify which operations exactly break. from_instance_id and clone were just my first suspects, but it's quite possible that this extends to any APIs that affect the reference count, and this includes Godot APIs that may create internal copies which aren't documented. We can of course try to hunt them down, but it's going to be tough, especially as Godot implementations may also change over time.

Cloning is one of the fundamental operations of RefCounted, and in my opinion, relying on the user to not call it (even indirectly) is just asking for trouble. It introduces an entire class of errors that to_init_gd() simply doesn't have.

@Yarwin
Contributor

Yarwin commented Sep 22, 2025

IMO this is fundamentally a Godot problem, not ours – the whole problem comes from the requirement to perform some work in the constructor (init), while the constructor itself just wants to get a default instance and call it a day – and said default instance can be used for a multitude of things (default values in the inspector, information for placeholders, etc. – stuff for which we don't want to trigger any side effects).

Doing anything other than returning a default instance is and will be inherently cursed, no matter what hack/approach we use. base() is unsound, which is very bad; to_init_gd can do stuff with an instance which will be memdeleted by Godot instantly, which is slightly better, but still bad.

We should have a GDScript-like init for that (which is launched when the script is attached, which happens ONLY when we get a "real" instance ready to be used by the user), in which we would be able to increase the refcount and do whatever we want. Alternatively, Godot should provide a flag telling us what this instance is created for. I want to bring it up at the GDExtension meeting on 30.09.2025 – we need to tackle this issue since Godot Dotnet will run into it as well.

Yeah, I've been repeating myself with it, but this whole issue is just silly.

I think the overhead of cloning a RefCounted is trivial, but it also pushes a callable to the message queue.

IMO this is not a huge deal either – Godot uses callables internally for a lot of stuff and they are doing fine. Custom Callables are reasonably fast as well. I don't think it is good design, but it might be the best 🤷.


As for the issue itself – not increasing the refcount is fine, but using base / base_mut to hack around init/predelete problems is inherently unsound and just wrong. to_init_gd should be preferred.

@beicause
Contributor Author

All of Godot's source code implicitly uses the base class pointer in constructors and destructors, why would base()/base_mut() be considered wrong? Instead I think referencing the object itself like Ref<RefCounted> sf = this/self.to_gd in a constructor is wrong.

@Bromeon
Member

Bromeon commented Sep 22, 2025

All of Godot's source code implicitly uses the base class pointer in constructors and destructors,

Godot's source code is written in a different language, has a different audience with different goals and different design guidelines than user-facing extensions. There are a ton of things that people do in C++ that aren't acceptable in Rust, like raw alloc/free, unchecked spans, etc.


why would base()/base_mut() be considered wrong? Instead I think referencing the object itself like Ref<RefCounted> sf = this/self.to_gd in a constructor is wrong.

This discussion is starting to go in circles, but I'll gladly repeat it: 🙂

Also, it's broken: [...]

The above code causes a double-panic and aborts the process.
The cause is debug_assert!(base.obj.is_instance_valid()); failing in from_base().

Notably, the same code works with to_init_gd().

The core conflict is: as soon as you hand out an object to the user, you need to assume they use it the same way objects generally work. If random operations (like instance ID, cloning, etc.) break down, that is unintuitive and will cause frustration. I'm hesitant to sacrifice soundness and logical consistency in the name of performance, especially in safe APIs.

In other words:

Important

I consider "doesn't break in random places" an acceptance criterion for new features.

to_init_gd() allows users to work with Gd<RefCounted> like everywhere else: they can clone, store/restore instance IDs, etc. But your solution breaks those operations, with the only benefit being performance.

  • Yes, you could argue that base_mut not working during init is also breaking known features, but "not during init" is a very clear delineation that can be validated and documented. Refcount increments are often opaque to the user and them randomly breaking is surprising.
  • Performance is important, I agree. But the cost here is introducing logic errors and crashes (or worse: UB) into user code that don't exist with the current to_init_gd(). As a maintainer I find this careless, and it's also not in my interest to create more help/support tickets with people reporting "bugs", where we have to respond "oh this is broken by design". You need to see this from a long-term perspective, not as a feature in isolation.
  • Before we introduce unsoundness, I'd like every other path to be exhausted, meaning:
    • We check this with Godot developers (needs patience).
    • We can do optimizations for Rust-side calls that work without the deferred callable.

Also I doubt that this is such a big issue in practice. Not only does to_init_gd() solve the job reasonably for most purposes, but you can also define any custom constructor via #[func] static functions, and use the full object model, because it's constructed at the time. In the big picture, the thing we're trying to address here is really a niche.
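
The "custom constructor" alternative can be pictured with a plain-Rust analogy. The sketch below does not use the godot-rust API: `MyEntity::create`, `EntityDatabase` and `Rc` are stand-ins chosen for illustration, with `Rc` playing the role of a strong handle and `create` playing the role of a static constructor function that runs after the object exists.

```rust
use std::rc::Rc;

struct MyEntity {
    id: u32,
}

struct EntityDatabase {
    entities: Vec<Rc<MyEntity>>,
}

impl MyEntity {
    // Stand-in for `init`: only builds default state, no side effects.
    fn init() -> Self {
        MyEntity { id: 0 }
    }

    // Stand-in for a custom static constructor: the object is fully
    // constructed before any strong handle is shared with other code.
    fn create(id: u32, db: &mut EntityDatabase) -> Rc<MyEntity> {
        let mut entity = MyEntity::init();
        entity.id = id;

        let handle = Rc::new(entity);
        db.entities.push(Rc::clone(&handle)); // registration is now unproblematic
        handle
    }
}

// Returns (registered entities, strong handles to the entity, stored ID).
fn registered_demo() -> (usize, usize, u32) {
    let mut db = EntityDatabase { entities: Vec::new() };
    let handle = MyEntity::create(7, &mut db);
    (db.entities.len(), Rc::strong_count(&handle), db.entities[0].id)
}
```

Keeping `init` free of side effects and moving registration into a separate constructor function sidesteps the question of what is allowed during initialization.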

TLDR: #1273 is good enough for now. We can think about further improvements, but compromising correctness should be an absolute last resort.

@beicause beicause force-pushed the base_mut-init-predelete branch 3 times, most recently from 21ff544 to cc9c70c Compare September 22, 2025 22:20
@beicause
Contributor Author

I consider "doesn't break in random places" an acceptance criterion for new features.

I think we can handle some breakages better, such as an object being released in init and unreferencing a dead RefCounted.

(though I don't really think they are related to my solution, they existed already).

@beicause beicause force-pushed the base_mut-init-predelete branch from cc9c70c to a644d65 Compare September 24, 2025 00:58
Labels
c: ffi Low-level components and interaction with GDExtension API quality-of-life No new functionality, but improves ergonomics/internals
4 participants