Add direct conversion of out-of-core arrays to PyTorch and TensorFlow tensors#47
Add direct conversion of out-of-core arrays to PyTorch and TensorFlow tensors#47Copilot wants to merge 4 commits into
Conversation
Co-authored-by: j143 <53068787+j143@users.noreply.github.com>
Co-authored-by: j143 <53068787+j143@users.noreply.github.com>
Co-authored-by: j143 <53068787+j143@users.noreply.github.com>
There was a problem hiding this comment.
Solid implementation overall with thoughtful design. However, before merging, I'd like clarification on several critical questions:
API DESIGN & STABILITY
-
Public API Status: Should
to_torch()andto_tensorflow()be marked experimental in docstrings? First-cut APIs often need breaking changes. -
Zero-Copy Preconditions: Document EXACTLY when zero-copy is guaranteed:
- Writable memmap on CPU: ✓
- Read-only memmap: falls back to copy
- Non-contiguous/sliced arrays: unclear
- Alignment, endianness requirements?
PYTORCH-SPECIFIC CONCERNS
-
CUDA Safety (line 231):
cpu_tensor.to(device)can fail silently if CUDA unavailable. Need error handling BEFORE .to() call. -
Memmap Tensor Lifetime: What happens if underlying file is closed/resized while tensor is in use? Document constraints for training loops.
-
Gradient Flow: Test that zero-copy tensors support
requires_grad=True. Current tests don't verify this. -
Non-Contiguous Arrays: Add tests for transposed/sliced arrays to verify
torch.from_numpy()semantics.
TENSORFLOW CONCERNS
-
Buffer Protocol: Which TF API is used? Test both eager and graph execution modes.
-
Device Placement: Always CPU? Should match PyTorch with device parameter?
-
Dtype Coverage: Tests only check float32. Need float16, int64, bool, complex.
TESTING GAPS
-
Lazy Array Behavior: Do you cache materialized result or recompute on subsequent calls? Test both
.to_torch()on same lazy array. -
Interleaved Operations: No test for
.compute()->.to_torch()sequences. -
Error Modes: Invalid device specs, undersized tensors, unsupported dtypes.
DOCUMENTATION
-
Failure Examples: Add README example showing error when GPU unavailable.
-
Copy Semantics: Clearly document when zero-copy vs. actual copy happens.
This is good foundational work with solid error handling. These questions are mainly about clarifying API contracts before this becomes stable.
Enables seamless conversion of memory-mapped Paper arrays to device tensors for deep learning workflows. Minimizes memory usage through zero-copy conversion where possible.
Changes
Core API (
paper/numpy_api.py)ndarray.to_torch(device=None)- Converts to PyTorch tensor with optional device specificationndarray.to_tensorflow()- Converts to TensorFlow tensor via buffer protocolTesting (
tests/test_tensor_conversion.py)Documentation
examples/tensor_conversion_demo.py)Usage
Notes
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.