Maybe a Practical Solution for Accessing the Inner Objects of Vector #4891

Open
Yikai-Liao opened this issue Feb 4, 2025 · 1 comment
Yikai-Liao commented Feb 4, 2025

Hi, PyO3 team,

I'm currently developing symusic, a C++ library that provides Python bindings using nanobind.

Initially, I attempted to use PyO3 for this library but ran into issues with nested structs—as mentioned in your FAQs. Allocating many small objects on the Python side would significantly slow down the program, which I wanted to avoid.

Subsequently, I switched to a C++ solution and managed to return references to objects within a vector successfully. However, this approach introduced the risk of dangling pointers when new objects are appended (since the vector might reallocate its memory if its capacity is exceeded). This problem, I suspect, is also a core concern in PyO3.
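To make the reallocation hazard concrete, here is a minimal Rust sketch of the same mechanism using the standard `Vec`. The function name `grow_past_capacity` is made up for illustration. It fills a vector to capacity, pushes one more element, and reports whether the buffer address changed. Whether the address actually changes depends on the allocator, so the code only reports it and does not assume it.

```rust
/// Hypothetical helper: fills a `Vec` to its capacity, then pushes one more
/// element. Returns (capacity before, capacity after, buffer address changed).
pub fn grow_past_capacity() -> (usize, usize, bool) {
    let mut notes: Vec<u32> = Vec::with_capacity(4);
    let cap_before = notes.capacity();

    // Use up all spare capacity; no reallocation happens in this loop.
    while notes.len() < cap_before {
        let next = notes.len() as u32;
        notes.push(next);
    }

    // Record where the buffer lives before the vector has to grow.
    let addr_before = notes.as_ptr() as usize;

    // This push exceeds the capacity, so the vector must grow its buffer.
    // The allocator is free to move it to a new address.
    notes.push(99);
    let addr_after = notes.as_ptr() as usize;

    (cap_before, notes.capacity(), addr_before != addr_after)
}
```

In C++, a raw pointer or reference taken before the last push would dangle whenever the address changes. Safe Rust rejects holding such a reference across the `push` at compile time, which is why the nested-object case needs a different storage strategy rather than a plain borrowed reference.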

Introducing pyvec

To address these challenges, I developed a new container called pyvec. This container balances performance and memory safety by storing objects in multiple chunks managed by a single shared pointer, while maintaining a separate vector of pointers to preserve their order.
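The idea can be sketched in Rust roughly as follows. This is not pyvec's actual implementation, only an illustration under simplifying assumptions: every name here (`ChunkedVec`, `ElemRef`, `extend_from_vec`, `slice`) is hypothetical, each chunk is a separate immutable `Rc<[T]>` rather than chunks behind one shared pointer, and the order vector stores `(chunk, offset)` pairs instead of raw pointers.

```rust
use std::ops::{Deref, Range};
use std::rc::Rc;

/// Hypothetical handle to one element: shares ownership of its chunk,
/// so the element stays valid even if the container grows or is dropped.
pub struct ElemRef<T> {
    chunk: Rc<[T]>,
    offset: usize,
}

impl<T> Deref for ElemRef<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.chunk[self.offset]
    }
}

/// Hypothetical chunked container: objects live in chunks that never move,
/// while `order` records the logical sequence of elements.
pub struct ChunkedVec<T> {
    chunks: Vec<Rc<[T]>>,
    order: Vec<(usize, usize)>, // (chunk index, offset inside that chunk)
}

impl<T> ChunkedVec<T> {
    pub fn new() -> Self {
        ChunkedVec { chunks: Vec::new(), order: Vec::new() }
    }

    /// Appends a batch as one new chunk. Growing `chunks` or `order` only
    /// moves small bookkeeping entries, never the stored objects.
    pub fn extend_from_vec(&mut self, items: Vec<T>) {
        let chunk_index = self.chunks.len();
        let len = items.len();
        self.chunks.push(Rc::from(items));
        self.order.extend((0..len).map(|offset| (chunk_index, offset)));
    }

    pub fn len(&self) -> usize {
        self.order.len()
    }

    /// Returns a handle to the element at `position`, if it exists.
    pub fn get(&self, position: usize) -> Option<ElemRef<T>> {
        let &(chunk_index, offset) = self.order.get(position)?;
        Some(ElemRef { chunk: Rc::clone(&self.chunks[chunk_index]), offset })
    }

    /// Shallow slice: copies order entries and bumps chunk reference counts;
    /// no element is copied. Panics if `range` is out of bounds.
    pub fn slice(&self, range: Range<usize>) -> Self {
        ChunkedVec { chunks: self.chunks.clone(), order: self.order[range].to_vec() }
    }
}
```

A handle obtained from `get` keeps pointing at the same address no matter how many batches are appended afterwards, because appends only add chunks. The trade-off in this sketch is that elements are read-only and a slice keeps every chunk alive, even chunks it does not use.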

Key Advantages of pyvec

  • Improved memory contiguity: Both the pointer array and most objects are stored contiguously in memory, enhancing cache locality and overall performance.
  • Avoidance of repeated allocations: Reduces the need for frequent small allocations on the Python side.
  • Elimination of dangling pointers: Prevents common pitfalls associated with vector reallocation.
  • Efficient vector conversion: Allows for quick creation from a standard vector (using SIMD instructions to generate a pointer vector).
  • Fast slicing with shallow copies: Enables efficient slicing operations without deep copying data.

Why pyvec Works Well for symusic

For symusic, the limitations of this design are not a concern. In most use cases, users primarily read data from the library rather than inserting or deleting elements dynamically. (They also report that the older, unsafe version without pyvec worked well in their situation.)

As a result, the potential downsides of pyvec—such as delayed deallocation and reduced memory locality after shuffling—do not significantly impact performance. Given this usage pattern, pyvec remains a highly practical solution for my needs.
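The delayed-deallocation downside can be illustrated with a small Rust sketch using only `std::rc`. It assumes a reference-counted chunk similar in spirit to the shared-pointer design described above. The function name `chunk_lifetime_demo` is hypothetical, and a cloned `Rc` stands in for a reference to a single element.

```rust
use std::rc::{Rc, Weak};

/// Hypothetical demo. Returns whether the chunk is still allocated
/// (a) while one element handle outlives the container, and
/// (b) after that last handle has been dropped as well.
pub fn chunk_lifetime_demo() -> (bool, bool) {
    // One chunk holding 1024 objects, owned by the "container".
    let chunk: Rc<[u64]> = Rc::from(vec![0u64; 1024]);

    // A weak observer lets us check liveness without keeping the chunk alive.
    let observer: Weak<[u64]> = Rc::downgrade(&chunk);

    // Stand-in for a reference to a single element handed out to Python.
    let handle = Rc::clone(&chunk);

    // The container goes away, but the handle still owns the whole chunk.
    drop(chunk);
    let alive_with_handle = observer.upgrade().is_some();

    // Only when the last handle is dropped is the chunk memory released.
    drop(handle);
    let alive_after_drop = observer.upgrade().is_some();

    (alive_with_handle, alive_after_drop)
}
```

In this sketch a single surviving handle pins all 1024 objects in memory. That is cheap for a read-mostly workload but can hold memory longer than expected when elements are frequently removed.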

Finally

I've been using this solution in symusic since version 0.5.0, and it has proven to be quite stable. I would like to introduce this approach to the PyO3 community to explore whether a container with a similar strategy could be beneficial within PyO3.

Looking forward to your thoughts!

Sincerely,
Yikai Liao

Yikai-Liao (Author) commented

After searching, I found that some third‑party libraries in Rust can provide functionality similar to shared_ptr, such as owning‑ref‑rs.
