[API Proposal]: IsDense/Sparse/Contiguous #111964
Comments
Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
We should probably list some of the other names under the […]. We should likely also note the consideration that this is a property because we expect it to be a common query, and therefore cached in the tensor rather than dynamically determined on each call.
IMO
Interesting. OnnxRuntime C# api currently just checks if the class is derived from […]. I think I would change it personally to use IsSparse instead.
Background and motivation
Tensors can be either dense (all elements are next to each other in memory and every element is represented in memory) or sparse (either the elements are not next to each other in memory, say from slicing, or the strides are manipulated to conserve memory so that the tensor represents more elements than are actually present in memory). We have a way to determine this internally, but there is no public way of doing so. When a user needs to know whether a tensor is dense or sparse, they have to work out how to calculate that themselves. We need to expose this to users.
We expect this to be a common query, so instead of calculating it on each call it should be a property.
API Proposal
namespace System.Numerics.Tensors;

public interface IReadOnlyTensor<TSelf, T> : IEnumerable<T>
    where TSelf : IReadOnlyTensor<TSelf, T>
{
    bool IsSparse { get; }
}
API Usage
Tensor<int> dense = Tensor.Create<int>([1, 2, 3, 4], [2, 2]);
// Will be false.
bool denseIsSparse = dense.IsSparse;

// Create a tensor with only 1 element in memory but actually representing a 2 x 2 tensor with all values 1.
Tensor<int> broadcast = Tensor.Create<int>([1], [2, 2], [0, 0]);
// Will be true.
bool broadcastIsSparse = broadcast.IsSparse;
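To make the dense case from the background concrete, here is a minimal sketch of how it might be detected from a tensor's lengths and strides alone. `IsDenseRowMajor` is a hypothetical helper written for this illustration, not the library's internal check, and it assumes "dense" means the strides are exactly the row-major strides implied by the lengths. It deliberately avoids the preview `Tensor<T>` API and works on plain arrays.

```csharp
using System;

// Hypothetical helper (an assumption for illustration, not part of System.Numerics.Tensors).
// Returns true when the strides are exactly the row-major strides implied by the lengths,
// i.e. the elements sit next to each other in memory with no gaps and no sharing.
static bool IsDenseRowMajor(long[] lengths, long[] strides)
{
    if (lengths.Length != strides.Length)
        throw new ArgumentException("lengths and strides must have the same rank");

    long expected = 1; // stride the innermost dimension must have
    for (int i = lengths.Length - 1; i >= 0; i--)
    {
        // A dimension of length 1 is never stepped over, so its stride is irrelevant.
        if (lengths[i] != 1 && strides[i] != expected)
            return false;
        expected *= lengths[i];
    }
    return true;
}

// The two cases from the usage example: a plain 2 x 2 tensor and a broadcast one.
Console.WriteLine(IsDenseRowMajor(new long[] { 2, 2 }, new long[] { 2, 1 })); // True
Console.WriteLine(IsDenseRowMajor(new long[] { 2, 2 }, new long[] { 0, 0 })); // False
```

Under this reading, the proposed IsSparse would simply be the negation of the check, computed once when the tensor is created and then cached.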
Alternative Designs
IsSparse is what is used by PyTorch, but we could do the inverse on our side and use something like IsDense. We would then need another property, IsContiguous, since dense and contiguous could be separate. Sparse does have some nuance about exactly what it means in other frameworks, though (see glossary below).
IsView - could be used to refer to anything that is not fully dense/contiguous. See glossary below for additional details.
Onnx Runtime does not have any property to represent this; they just check if the tensor is a DenseTensor<T>. I don't think this is a good approach for us.
IsDistinct - could be used when the data matches exactly 1 to 1 with its representation. No other framework that I could find uses this though, so it would be very different from existing frameworks.
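One way to see how these names could differ is to test the "1 to 1" property on its own. The sketch below is only one possible reading of IsDistinct: `HasDistinctElements` is a hypothetical helper, not an existing API, and it brute-forces every logical index to check that no two of them land on the same memory offset.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (an assumption for illustration): true when every logical
// index maps to its own memory offset, so the data matches its representation 1 to 1.
static bool HasDistinctElements(long[] lengths, long[] strides)
{
    var seen = new HashSet<long>();
    var index = new long[lengths.Length];
    long total = 1;
    foreach (long length in lengths) total *= length;

    for (long n = 0; n < total; n++)
    {
        long offset = 0;
        for (int d = 0; d < index.Length; d++) offset += index[d] * strides[d];
        if (!seen.Add(offset)) return false; // two indices share one element

        // Advance the multi-dimensional index like an odometer, last dimension fastest.
        for (int d = index.Length - 1; d >= 0; d--)
        {
            if (++index[d] < lengths[d]) break;
            index[d] = 0;
        }
    }
    return true;
}

// Transposed 3 x 2 view: not row-major, yet every element has its own slot.
Console.WriteLine(HasDistinctElements(new long[] { 3, 2 }, new long[] { 1, 3 })); // True
// Broadcast 2 x 2 view over a single element: all four indices share offset 0.
Console.WriteLine(HasDistinctElements(new long[] { 2, 2 }, new long[] { 0, 0 })); // False
```

A sliced view with lengths [2, 2] and strides [4, 1] also passes this check even though it leaves gaps in memory, which suggests why a single flag may not capture dense, contiguous, and distinct at the same time.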
Risks
This would be a new api in a preview object, so the risks are very minimal.
Glossary
PyTorch Sparse - https://pytorch.org/docs/stable/sparse.html. PyTorch uses Sparse to refer to a tensor where "elements are mostly zero valued." They support 5 different sparse formats (see prior link). If a tensor is sparse, they also track how many dimensions are represented in a sparse format and how many are represented in a dense format (in the same tensor).
PyTorch Views - https://pytorch.org/docs/stable/tensor_view.html. PyTorch uses a view to essentially let you know that this "Tensor" is actually pointing to the memory of another tensor, kind of like our TensorSpan. But these views don't have to be contiguous.
PyTorch IsContiguous - https://pytorch.org/docs/stable/generated/torch.Tensor.is_contiguous.html. Basically the same as what we would consider contiguous (the data is contiguous in memory), but they provide additional options/details about how the memory is laid out.
OnnxRuntime C# api doesn't have sparse tensors, but their Python API does and it matches PyTorch exactly, https://onnxruntime.ai/docs/api/python/api_summary.html#sparsetensor. In fact, you can bind PyTorch tensors directly as input/output.
TensorFlow also supports sparse tensors, https://www.tensorflow.org/guide/sparse_tensor. They are interpreted the same way as PyTorch's, but TensorFlow only supports 1 format compared to PyTorch's 5.