This RFC proposes adding an API to the specification for explicitly broadcasting a list of shapes to a single shape.
Overview
Based on array API comparison data, this API, or some variation of it, is commonly implemented across array libraries.
Currently, the Array API specification only includes broadcast_arrays and broadcast_to which both require array input. The specification lacks APIs for working directly with shapes without needing to create new array instances.
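For illustration (not part of the RFC text), the workaround this forces today looks roughly like this; any conforming namespace works in place of NumPy:

```python
import numpy as xp  # stand-in for any Array API-compatible namespace

# With only broadcast_arrays / broadcast_to available, computing a result
# shape means creating throwaway array instances first.
a = xp.zeros((8, 1, 3))
b = xp.zeros((7, 1))
result_shape = xp.broadcast_arrays(a, b)[0].shape  # (8, 7, 3)
```

This is the `xp.broadcast_arrays(*args)[0].shape` workaround discussed under Notes below.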
Prior Art
- NumPy: https://numpy.org/doc/stable/reference/generated/numpy.broadcast_shapes.html
- added in 2020: ENH: add function to get broadcast shape from a given set of shapes. numpy/numpy#17535
- returns a Tuple
 
- CuPy: simply borrows broadcast_shapes from NumPy: https://github.com/cupy/cupy/blob/a888cc94c79729cf24ebb808d15b9702c0342392/cupy/__init__.py#L302
 
- Dask: da.core.broadcast_shapes exists as a private API only. Supports Dask's bespoke nan's in the shape.
- JAX: follows NumPy
- returns a Tuple
 
- PyTorch: follows NumPy
- returns a Size
 
- TensorFlow: has two APIs for statically and dynamically known shapes
- broadcast_static_shape: https://www.tensorflow.org/api_docs/python/tf/broadcast_static_shape
- broadcast_dynamic_shape: https://www.tensorflow.org/api_docs/python/tf/broadcast_dynamic_shape
- both functions only accept two shape arguments
 
- ndonnx: no API. Shapes can contain None, so one cannot use numpy's implementation.
Proposal
This RFC proposes adding the following API to the specification:
def broadcast_shapes(*shapes: tuple[int | None, ...]) -> tuple[int | None, ...]
in which one or more shapes are broadcast together according to the broadcasting rules enumerated in the specification.
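A minimal pure-Python sketch of the proposed semantics (not normative text; the None handling propagates unknown dimensions conservatively, similar in spirit to the Dask behaviour discussed under Questions below, and is only one possible choice):

```python
def broadcast_shapes(*shapes: tuple[int | None, ...]) -> tuple[int | None, ...]:
    """Broadcast one or more shapes against each other.

    Sketch only: None marks an unknown dimension and is propagated
    conservatively (any unknown input extent makes the output extent unknown).
    """
    ndim = max((len(shape) for shape in shapes), default=0)
    out: list[int | None] = []
    for axis in range(-ndim, 0):
        # Extents of every shape along this axis; missing leading axes count as 1.
        extents = [shape[axis] if -axis <= len(shape) else 1 for shape in shapes]
        known = {e for e in extents if e is not None and e != 1}
        if len(known) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        if None in extents:
            out.append(None)          # unknown in -> unknown out
        elif known:
            out.append(known.pop())   # the single extent other than 1
        else:
            out.append(1)             # all extents are 1
    return tuple(out)


assert broadcast_shapes((8, 1, 3), (7, 1)) == (8, 7, 3)
assert broadcast_shapes((5,), (1, 5), (4, 1, 5)) == (4, 1, 5)
assert broadcast_shapes((None, 3), (1, 3)) == (None, 3)
```

Whether an unknown extent should propagate conservatively like this, or resolve to the known extent when the other operand is greater than 1, is part of the open question below.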
Questions
- How to handle shapes having unknown dimensions?
  - dask.array.core.broadcast_shapes sets the output size to nan if any of the input shapes are nan on the same axis (see the snippet after this list)
  - ndonnx.broadcast_arrays(a, b) returns arrays with materialized shapes. Note that shape materialization can be a very expensive operation, as it requires materializing the whole graph up to that point. In the case of Dask, which deliberately does not cache intermediate results as a memory management policy, this means computing everything at least twice.
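For concreteness, the Dask behaviour described above (assuming a recent Dask version; broadcast_shapes is a private helper, so this import may change):

```python
import math
from dask.array.core import broadcast_shapes  # private helper, see Prior Art

# An unknown (nan) extent on an axis yields an unknown extent in the result.
broadcast_shapes((math.nan, 3), (1, 3))  # -> (nan, 3)
```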
Notes
The top-level page on broadcasting mentions on the first line, using non-prescriptive language, that broadcasting allows creating views of the inputs:
Broadcasting refers to the automatic (implicit) expansion of array dimensions to be of equal sizes without copying array data
However, no mention of sharing memory is made in broadcast_to or broadcast_arrays.
For the sake of comparison, see the verbiage in asarray(copy=False).
The problem with this ambiguity is that one can work around the lack of broadcast_shapes by calling xp.broadcast_arrays(*args)[0].shape, but there is no strong guarantee that the backend won't deep-copy the inputs.
Note that numpy.broadcast_shapes doesn't work with shapes containing None (ndonnx and hopefully in the future JAX too) or NaN (Dask; non-standard).
I suggest either:
- Add prescriptive verbiage to broadcast_to and broadcast_arrays stating that the output must share memory with the input, or in other words that the operation must be O(1) (see the NumPy illustration after this list), or
- Add broadcast_shapes to the standard, and change the verbiage of the high-level broadcasting page to "typically without copying array data"
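For reference on the first suggestion: in NumPy today broadcast_to already returns a zero-copy view, which is the behaviour that would become normative (a quick way to verify it; other libraries may differ):

```python
import numpy as np

a = np.zeros((3, 1))
b = np.broadcast_to(a, (3, 4))

print(np.shares_memory(a, b))  # True: no data was copied
print(b.strides)               # (8, 0): the expanded axis has stride 0
```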
For the time being I am adding the function to array_api_extra: