@justinchuby justinchuby commented Oct 3, 2025

This PR introduces a tofile method on tensors (named after the corresponding method on NumPy arrays), which allows faster writes and lower memory usage when saving external data by bypassing tobytes().

Compatibility with existing TensorProtocol implementations is maintained in the external data module by calling tofile only when the class provides it. The TorchTensor class in the PyTorch exporter should be updated accordingly to take advantage of the new logic when saving.

Note that I/O time to disk is reduced by 40% in the measurements below (total wall-clock time for the run drops from 48.08 s to 45.69 s).

Note

TensorProtocol is not updated because we do isinstance() checks on external implementations (PyTorch). Adding the method to the protocol would cause the isinstance() check to fail on implementations that have not yet added the tofile method.
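
To illustrate the isinstance() concern, here is a minimal sketch using `typing.Protocol`. The class names `TensorLike`, `TensorLikeWithToFile`, and `ExternalTensor` are hypothetical stand-ins, not the actual onnx_ir classes; the point is only that a runtime-checkable protocol requires every declared method to be present.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class TensorLike(Protocol):
    """Hypothetical protocol as it looks without ``tofile``."""

    def tobytes(self) -> bytes: ...


@runtime_checkable
class TensorLikeWithToFile(Protocol):
    """The same hypothetical protocol after adding ``tofile``."""

    def tobytes(self) -> bytes: ...

    def tofile(self, file) -> None: ...


class ExternalTensor:
    """Stands in for an external implementation that predates ``tofile``."""

    def tobytes(self) -> bytes:
        return b"\x00\x01"


tensor = ExternalTensor()
# The runtime check only passes when every protocol method exists on the object.
old_check = isinstance(tensor, TensorLike)
new_check = isinstance(tensor, TensorLikeWithToFile)
print(old_check, new_check)  # True False
```

Leaving the protocol unchanged and probing for the method with hasattr() keeps such implementations working.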

Reference: https://github.com/microsoft/onnxscript/pull/2241/files/b2381658492510a9bcc8c0a8574db7368e33bceb

Before:

________________________________________________________
Executed in   48.08 secs    fish           external
   usr time   60.54 secs    0.00 millis   60.54 secs
   sys time   23.06 secs    1.22 millis   23.06 secs

After:

________________________________________________________
Executed in   45.69 secs    fish           external
   usr time   60.68 secs  244.00 micros   60.68 secs
   sys time   22.22 secs  518.00 micros   22.22 secs

Fix #207

Signed-off-by: Justin Chu <[email protected]>

codecov bot commented Oct 3, 2025

Codecov Report

❌ Patch coverage is 80.70175% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.93%. Comparing base (feb51e5) to head (dafeaf7).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/onnx_ir/_core.py | 79.06% | 7 Missing and 2 partials ⚠️ |
| src/onnx_ir/external_data.py | 66.66% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #210      +/-   ##
==========================================
+ Coverage   76.83%   76.93%   +0.09%     
==========================================
  Files          40       40              
  Lines        4922     4994      +72     
  Branches      980      998      +18     
==========================================
+ Hits         3782     3842      +60     
- Misses        856      864       +8     
- Partials      284      288       +4     


@justinchuby justinchuby marked this pull request as ready for review October 4, 2025 00:44
@justinchuby justinchuby requested review from a team and titaiwangms as code owners October 4, 2025 00:44
@justinchuby justinchuby added this to the 0.1.11 milestone Oct 4, 2025
@justinchuby
Member Author

cc @iksnagreb

@justinchuby justinchuby changed the title from "Implement tofile on tensors" to "Implement tofile on tensors to reduce data write time by 40%" Oct 6, 2025
@justinchuby
Member Author

@titaiwangms @gramalingam this is ready for review, thanks.

file: A file-like object with a ``write`` method that accepts bytes, or one that has a ``fileno()`` method.
"""
if _supports_fileno(file) and isinstance(self._raw, np.ndarray):
# This is a duplication of tobytes() for handling special cases
Collaborator

Should we move this into a private function and document why we need it? (I would say this is very technical knowledge.)
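
For context on the `_supports_fileno(file)` check in the snippet above: its implementation is not shown in this conversation, so the following is only a sketch of what such a helper might look like, under the assumption that it needs to tell real OS-level files apart from in-memory streams. The name `supports_fileno` is hypothetical.

```python
import io
import tempfile


def supports_fileno(file) -> bool:
    """Return True if ``file`` appears to be backed by an OS-level file descriptor."""
    fileno = getattr(file, "fileno", None)
    if fileno is None:
        return False
    try:
        fileno()
    except (OSError, ValueError):
        # io.UnsupportedOperation subclasses both OSError and ValueError;
        # in-memory streams such as io.BytesIO raise it from fileno().
        return False
    return True


in_memory = supports_fileno(io.BytesIO())
with tempfile.TemporaryFile() as f:
    on_disk = supports_fileno(f)
print(in_memory, on_disk)  # False True
```

A check along these lines would let the fast path be taken only for real files, with `write()` as the fallback for everything else.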

the tensor is recommended if IO overhead and memory usage is a concern.
To obtain an array, call :meth:`numpy`. To obtain the bytes,
call :meth:`tobytes`.
Collaborator

Probably want to add tofile() here as well?

if self._offset is not None:
src.seek(self._offset)
bytes_to_copy = self._length or self.nbytes
chunk_size = 1024 * 1024 # 1MB
Collaborator

How do we know this is the most efficient chunk size? Was it chosen arbitrarily?
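
For reference, the chunked copy that the snippet above sets up might look roughly like the sketch below. Only `chunk_size = 1024 * 1024` and the seek to the offset come from the diff; the function name `copy_range`, its parameters, and the loop body are assumptions made for illustration.

```python
import io


def copy_range(src, dst, offset, length, chunk_size=1024 * 1024):
    """Copy ``length`` bytes from ``src``, starting at ``offset``, into ``dst``."""
    src.seek(offset)
    remaining = length
    while remaining > 0:
        # Never hold more than chunk_size bytes in memory at once.
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError(f"source ended with {remaining} bytes left to copy")
        dst.write(chunk)
        remaining -= len(chunk)


src = io.BytesIO(bytes(range(10)))
dst = io.BytesIO()
copy_range(src, dst, offset=2, length=5, chunk_size=2)
copied = dst.getvalue()
print(list(copied))  # [2, 3, 4, 5, 6]
```

The chunk size bounds peak memory per copy; whether 1 MB is the fastest choice depends on the storage and OS and would need to be measured.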

"""Return the bytes of the tensor."""
return self._evaluate().tobytes()

def tofile(self, file) -> None:
Collaborator

I am wondering whether tofile() makes sense for LazyTensor. Hmm.
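
One way to think about it: a lazy tensor has to evaluate before anything can be written, but it can still hand the write to the evaluated tensor so that no intermediate bytes object is built. The sketch below is an assumption about how that delegation could work, not the code in this PR; `EagerTensor` and `LazyTensorSketch` are hypothetical names.

```python
import io


class EagerTensor:
    """Hypothetical materialized tensor holding raw bytes."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def tobytes(self) -> bytes:
        return self._data

    def tofile(self, file) -> None:
        # Write through a memoryview so no extra bytes copy is made.
        file.write(memoryview(self._data))


class LazyTensorSketch:
    """Hypothetical lazy wrapper: the tensor is produced only on demand."""

    def __init__(self, func) -> None:
        self._func = func
        self.evaluations = 0

    def _evaluate(self):
        self.evaluations += 1
        return self._func()

    def tofile(self, file) -> None:
        # Evaluate once, then let the concrete tensor pick its own write path.
        self._evaluate().tofile(file)


lazy = LazyTensorSketch(lambda: EagerTensor(b"\x01\x02\x03"))
evaluations_before = lazy.evaluations
buffer = io.BytesIO()
lazy.tofile(buffer)
written = buffer.getvalue()
print(evaluations_before, lazy.evaluations, written)
```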

data_file.write(b"\0" * (current_offset - file_size))
data_file.write(raw_data)

if hasattr(tensor, "tofile"):
Collaborator

Which tensors do not have tofile()? torch? Better to document this.
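
As a sketch of the fallback being discussed, assuming the writer pads the data file up to the tensor's offset and then picks a write path based on hasattr(): `BytesOnlyTensor`, `FileTensor`, and `write_tensor` are hypothetical names; only the padding expression and the hasattr() check come from the diff above.

```python
import io


class BytesOnlyTensor:
    """Hypothetical tensor that only implements ``tobytes``."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def tobytes(self) -> bytes:
        return self._data


class FileTensor(BytesOnlyTensor):
    """Hypothetical tensor that also implements ``tofile``."""

    def tofile(self, file) -> None:
        file.write(memoryview(self._data))


def write_tensor(data_file, tensor, current_offset: int) -> None:
    """Pad ``data_file`` up to ``current_offset``, then write the tensor."""
    data_file.seek(0, io.SEEK_END)
    file_size = data_file.tell()
    if current_offset > file_size:
        data_file.write(b"\0" * (current_offset - file_size))
    if hasattr(tensor, "tofile"):
        # Preferred path: the tensor streams itself into the file.
        tensor.tofile(data_file)
    else:
        # Fallback for implementations that only provide tobytes().
        data_file.write(tensor.tobytes())


data_file = io.BytesIO()
write_tensor(data_file, BytesOnlyTensor(b"ab"), current_offset=0)
write_tensor(data_file, FileTensor(b"cd"), current_offset=4)
result = data_file.getvalue()
print(result)  # b'ab\x00\x00cd'
```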

Successfully merging this pull request may close these issues.

Create a tofile() method on Tensors that will avoid potential tobytes() call when serializing