Commit ae4c3bc

jhavukainen and malfet authored
MPS commits organized and cleaned up (#94)
* MPS commits organized and cleaned up
* Further condense added ops
* Addressing PR comments
* Replace unintended slash with a dot
* Update 2.9.0/done/result_mps.md

  Co-authored-by: Nikita Shulga <[email protected]>
* Address additional comments

---------

Co-authored-by: Nikita Shulga <[email protected]>
1 parent e5f8adb commit ae4c3bc

2 files changed: +113 -105 lines changed

2.9.0/done/result_mps.md

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
# Release Notes worksheet mps

The main goal of this process is to rephrase all the commit messages below to make them **clear and easy to read** by the end user. Follow these instructions to do so:

* **Please clean up and format commit titles to be readable by the general PyTorch user.** Make sure you're [following the guidance here](https://docs.google.com/document/d/14OmgGBr1w6gl1VO47GGGdwrIaUNr92DFhQbY_NEk8mQ/edit)\! Your resulting notes must be consistent and easy to read.
* Please sort commits into the following categories (you should not rename the categories\!). I tried to pre-sort these to ease your work; feel free to move commits around if the current categorization is not good.
* Anything that is not public facing needs to be removed.
* If anything is miscategorized/belongs to another domain, move it to `miscategorized.md`.
* Please scan through `miscategorized.md` and handle any commits that belong within your domain according to these instructions.
* We place a lot of emphasis on the “BC-breaking” and “deprecation” sections. That is where most of the effort should go. The “improvements” and “bug fixes” for Python API should be nice as well.
* Once you are finished, move this very file from `todo/` to `done/` and submit a pull request.

The categories below are as follows:

* BC breaking: All commits that are BC-breaking. These are the most important commits. If any pre-sorted commit is actually BC-breaking, do move it to this section. Each commit should contain a paragraph explaining the rationale behind the change as well as an example of how to update user code; see the [BC-Guidelines](https://docs.google.com/document/d/14OmgGBr1w6gl1VO47GGGdwrIaUNr92DFhQbY_NEk8mQ/edit#heading=h.a9htwgvvec1m).
* Deprecations: All commits introducing deprecation. Each commit should include a small example explaining what should be done to update user code.
* new\_features: All commits introducing a new feature (new functions, new submodule, new supported platform etc)
* improvements: All commits providing improvements to existing features should be here (new backend for a function, new argument, better numerical stability)
* bug fixes: All commits that fix bugs and behaviors that do not match the documentation
* performance: All commits that are added mainly for performance (we separate this from improvements above to make it easier for users to look for it)
* documentation: All commits that add/update documentation
* Developers: All commits that are not end-user facing but still impact people who compile from source, develop PyTorch, extend PyTorch, etc
* not user facing: All commits that are not public end-user facing and hence should be dropped from the release notes

## mps

### bc breaking

- Build Metal kernels for macOS 14+ and remove all pre-macOS 14 specific logic. macOS 14+ is required going forward. ([\#159733](https://github.com/pytorch/pytorch/pull/159733), [\#159912](https://github.com/pytorch/pytorch/pull/159912))
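
The worksheet asks each BC-breaking entry to show how user code might be updated. A minimal sketch, assuming the only user-visible change is the macOS 14 minimum stated above: the helper names `macos_major`, `mps_os_supported`, and `pick_device` are hypothetical, while `platform.mac_ver()` and `torch.backends.mps.is_available()` are existing calls.

```python
import platform
from typing import Optional

# Taken from the note above: Metal kernels now target macOS 14+.
MIN_MACOS_MAJOR = 14


def macos_major(version: str) -> Optional[int]:
    """Return the major component of a version string such as '14.5'."""
    head = version.split(".", 1)[0]
    return int(head) if head.isdigit() else None


def mps_os_supported(version: Optional[str] = None) -> bool:
    """True when the given (or detected) macOS version is 14 or newer."""
    if version is None:
        # platform.mac_ver()[0] is an empty string on non-macOS systems.
        version = platform.mac_ver()[0]
    major = macos_major(version)
    return major is not None and major >= MIN_MACOS_MAJOR


def pick_device() -> str:
    """Hypothetical helper: prefer 'mps', fall back to 'cpu'."""
    if not mps_os_supported():
        return "cpu"
    try:
        import torch  # optional dependency for this sketch
    except ImportError:
        return "cpu"
    return "mps" if torch.backends.mps.is_available() else "cpu"


print(pick_device())
```

Code that previously assumed MPS on older macOS releases could route through a check like this and fall back to the CPU instead of failing at kernel build time.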

### deprecation

### new features

- [Beta] Partial sparse support for MPS backend ([\#159729](https://github.com/pytorch/pytorch/pull/159729), [\#160254](https://github.com/pytorch/pytorch/pull/160254), [\#160223](https://github.com/pytorch/pytorch/pull/160223), [\#161846](https://github.com/pytorch/pytorch/pull/161846), [\#162007](https://github.com/pytorch/pytorch/pull/162007))

### improvements

- Add `shifted_chebyshev_polynomial_[tuvw]`, `igamma`/`igammac`, `grid_sampler_3d`, `native_dropout`/`native_dropout_backward` ([\#157488](https://github.com/pytorch/pytorch/pull/157488), [\#161927](https://github.com/pytorch/pytorch/pull/161927), [\#160541](https://github.com/pytorch/pytorch/pull/160541), [\#162108](https://github.com/pytorch/pytorch/pull/162108))
- Extend atomic operations to all int types ([\#158179](https://github.com/pytorch/pytorch/pull/158179))
- Extend `index_put` to complex types ([\#160159](https://github.com/pytorch/pytorch/pull/160159))
- Extend `addmm` to integral types ([\#160270](https://github.com/pytorch/pytorch/pull/160270))
- Add support for unsigned types ([\#159094](https://github.com/pytorch/pytorch/pull/159094))
- Add API to query GPU core count ([\#160414](https://github.com/pytorch/pytorch/pull/160414))
- Add `kthvalue` ([\#161817](https://github.com/pytorch/pytorch/pull/161817))
- Type-promote tensor-iterator common dtype ([\#160334](https://github.com/pytorch/pytorch/pull/160334))
- Implement `logcumsumexp` Metal kernel ([\#156858](https://github.com/pytorch/pytorch/pull/156858))
- Enable DLPack integration ([\#158888](https://github.com/pytorch/pytorch/pull/158888))
- Dynamic reductions ([\#159355](https://github.com/pytorch/pytorch/pull/159355))
- Update `avg_pool2d` to use Metal kernel when `ceil_mode=True` ([\#161011](https://github.com/pytorch/pytorch/pull/161011))
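
For readers unfamiliar with the `logcumsumexp` op listed above, here is a pure-Python reference of what it computes for a 1-D input of finite values. This is only a sketch of the op's semantics using the usual running-max trick; it is not the Metal kernel, and the function name is chosen here for illustration.

```python
import math


def logcumsumexp_reference(xs):
    """out[i] = log(exp(xs[0]) + ... + exp(xs[i])), computed stably."""
    out = []
    running = -math.inf  # log of an empty sum
    for x in xs:
        if running == -math.inf:
            running = x
        else:
            hi, lo = max(running, x), min(running, x)
            # log(e^hi + e^lo) = hi + log1p(e^(lo - hi)); the exponent is <= 0,
            # so exp() cannot overflow.
            running = hi + math.log1p(math.exp(lo - hi))
        out.append(running)
    return out


print(logcumsumexp_reference([0.0, 0.0, 0.0]))
```

In PyTorch the corresponding call is `torch.logcumsumexp(x, dim=0)`; comparing against a reference like this on small inputs is one way to sanity-check a backend.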

### bug fixes

- Fix batch norm incorrect gradient ([\#156867](https://github.com/pytorch/pytorch/pull/156867))
- Do not crash if tensor dim \> `INT_MAX` ([\#158824](https://github.com/pytorch/pytorch/pull/158824))
- Avoid outputting zeros from `exponential_` for MPS ([\#159386](https://github.com/pytorch/pytorch/pull/159386))
- Fix MPS autocast for `ConvTranspose3d` ([\#160345](https://github.com/pytorch/pytorch/pull/160345))
- Fix MPS `conv3d` autocast bias dtype mismatch ([\#160423](https://github.com/pytorch/pytorch/pull/160423))
- Fix error check for `torch.var` on scalar ([\#160889](https://github.com/pytorch/pytorch/pull/160889))
- Fix `index_add` for complex \+ int64 ([\#160926](https://github.com/pytorch/pytorch/pull/160926))
- Fix `constant_pad_nd_mps` bug when pad is empty ([\#161149](https://github.com/pytorch/pytorch/pull/161149))
- Fix `index_select` for scalar\_types ([\#161206](https://github.com/pytorch/pytorch/pull/161206))
- Fix `index_copy` for scalars ([\#161267](https://github.com/pytorch/pytorch/pull/161267))
- Fix `index_copy` for strided indices ([\#161333](https://github.com/pytorch/pytorch/pull/161333))
- Fix `index_add` for int64 input \+ zerodim index ([\#161511](https://github.com/pytorch/pytorch/pull/161511))
- Ensure that tensors are contiguous before using MPS linear kernel ([\#161641](https://github.com/pytorch/pytorch/pull/161641))
- Address NaNs if SDPA is called with all values masked from query ([\#157727](https://github.com/pytorch/pytorch/pull/157727))
- Fix invalid formatting ([\#158436](https://github.com/pytorch/pytorch/pull/158436))
- Fix empty input in posneg functions ([\#161824](https://github.com/pytorch/pytorch/pull/161824))
- Migrate round unary op to Metal ([\#161712](https://github.com/pytorch/pytorch/pull/161712))
- Type-promote tensor-iterator common dtype ([\#160334](https://github.com/pytorch/pytorch/pull/160334))
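
The SDPA entry above concerns attention rows in which every score is masked. As a rough pure-Python illustration of why NaNs can appear in that situation (this is not the PyTorch or Metal code, and returning zeros for a fully masked row is an assumption made here, not something the note states):

```python
import math

NEG_INF = float("-inf")  # masked positions are commonly filled with -inf


def naive_softmax(scores):
    """Textbook softmax with max subtraction; breaks on fully masked rows."""
    m = max(scores)
    # If every score is -inf, (-inf) - (-inf) is NaN and poisons the whole row.
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def guarded_softmax(scores):
    """Same computation, but a fully masked row yields zeros (assumed convention)."""
    m = max(scores)
    if m == NEG_INF:
        return [0.0] * len(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


print(naive_softmax([NEG_INF, NEG_INF]))
print(guarded_softmax([NEG_INF, NEG_INF]))
```

Partially masked rows behave identically in both versions; only the all-masked case differs.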

### performance

- Optimize `cummin`/`cummax` Metal kernels ([\#156794](https://github.com/pytorch/pytorch/pull/156794))
- Speed up `torch.full` for 1-byte types ([\#158874](https://github.com/pytorch/pytorch/pull/158874))
- Speed up `argmax`/`argmin` ([\#159524](https://github.com/pytorch/pytorch/pull/159524))
- Improve performance of `max_pool3d` ([\#157875](https://github.com/pytorch/pytorch/pull/157875))
- Avoid calling tensor ops in `max_pool3d` impl ([\#157874](https://github.com/pytorch/pytorch/pull/157874))
- Move `max_pool2d` to Metal for `stride != 1` ([\#157876](https://github.com/pytorch/pytorch/pull/157876))

### docs

### devs

### Untopiced

### not user facing

- Move sparsemps testing from `test_mps` to `test_sparse` ([\#161852](https://github.com/pytorch/pytorch/pull/161852))
- Fix unused vars in GridSampler ([\#160850](https://github.com/pytorch/pytorch/pull/160850))
- Delete `as_strided_tensorimpl_mps` ([\#157772](https://github.com/pytorch/pytorch/pull/157772))
- Add benchmark for scan with indices ([\#156860](https://github.com/pytorch/pytorch/pull/156860))
- Fix deduplication of kernels ([\#156843](https://github.com/pytorch/pytorch/pull/156843))
- Move array def to `c10/metal/common.h` ([\#157746](https://github.com/pytorch/pytorch/pull/157746))
- Use `simdgroup_size` constexpr ([\#157751](https://github.com/pytorch/pytorch/pull/157751))
- Delete redundant header ([\#157966](https://github.com/pytorch/pytorch/pull/157966))
- Move repeated code into helper functions ([\#158178](https://github.com/pytorch/pytorch/pull/158178))
- Enable `test_aot_inductor.py` tests ([\#155598](https://github.com/pytorch/pytorch/pull/155598))
- Enable `test_indexing` on MPS ([\#158582](https://github.com/pytorch/pytorch/pull/158582))
- Fix compilation warning in `Pooling.metal` ([\#158729](https://github.com/pytorch/pytorch/pull/158729))
- Remove unused `ndArrayFromTensor` ([\#158823](https://github.com/pytorch/pytorch/pull/158823))
- Fix cpu kernel generation ([\#158350](https://github.com/pytorch/pytorch/pull/158350))
- Improve tabbing in cpp generation ([\#158351](https://github.com/pytorch/pytorch/pull/158351))
- Enable more tests ([\#158703](https://github.com/pytorch/pytorch/pull/158703))
- Fix compile benchmark correctness ([\#159731](https://github.com/pytorch/pytorch/pull/159731))
- Remove unused `size12` variable ([\#159832](https://github.com/pytorch/pytorch/pull/159832))
- Combine all pre-macOS 14 xfail lists ([\#160228](https://github.com/pytorch/pytorch/pull/160228))
- Add `simd_[arg][max|min]` ([\#158990](https://github.com/pytorch/pytorch/pull/158990))
- Add `fused_rms` and `sdpa_mps` fallback ops for AOTInductor ([\#156844](https://github.com/pytorch/pytorch/pull/156844))
- Update `avg_pool3d` kernel to use `opmath_t` ([\#161071](https://github.com/pytorch/pytorch/pull/161071))

### security

2.9.0/todo/result_mps.md

Lines changed: 0 additions & 105 deletions
This file was deleted.
