113 commits
bd5857a
Revert "[inductor] Estimate peak memory allocfree and applying to reo…
pytorchmergebot Aug 21, 2025
acb00d3
Revert "Fix torchaudio build when TORCH_CUDA_ARCH_LIST is not set (#1…
pytorchmergebot Aug 21, 2025
a941d7f
[Quant][CPU] Avoid NaN in fp8 output of qlinear and qconv (#160957)
Xia-Weiwen Aug 21, 2025
1827114
[dist] expose unsafe_get_ptr for dist.ProcessGroupNCCL.NCCLConfig (#1…
youkaichao Aug 21, 2025
3caddd4
[ROCm] SDPA fix mem fault when dropout is enabled (#154864)
alugorey Aug 21, 2025
517d38d
[inductor] Estimate peak memory allocfree and applying to reordering …
IvanKobzarev Aug 18, 2025
7006fd0
Revert "[inductor] Estimate peak memory allocfree and applying to reo…
pytorchmergebot Aug 21, 2025
a6401cb
Revert "flip the list-as-tuple behavior for short lists (#160794)"
pytorchmergebot Aug 21, 2025
3dacaf0
[aoti-fx] Add meta["val"] metadata (#161019)
angelayi Aug 21, 2025
3f5a8e2
Fix torchaudio build when TORCH_CUDA_ARCH_LIST is not set (#161084)
huydhn Aug 21, 2025
9668210
Allow bypasses for Precompile when guards, etc. cannot be serialized …
jamesjwu Aug 20, 2025
958f9ca
[nativert] oss static kernel tests (#161087)
dolpm Aug 21, 2025
8018510
[pytorch] Invoke `vector.reserve()` consistently for non-inplace fore…
tsunghsienlee Aug 21, 2025
a445b41
[pytorch] Simplify PyTorch `foreach_*` API restrictions check (#161039)
tsunghsienlee Aug 21, 2025
1e3fe78
[inductor] disable min/max macro on Windows. (#161133)
xuhancn Aug 21, 2025
db38c44
[inductor] add libraries_dirs for level_zero (#161146)
xuhancn Aug 21, 2025
5805c42
[invoke_subgraph][inductor] Thread graphsafe rng input states for hop…
anijain2305 Aug 15, 2025
e25ee02
Fix constant_pad_nd_mps bug when pad is empty (#161149)
can-gaa-hou Aug 21, 2025
d2b8c0d
forward fix of #152198 (#161166)
jagadish-amd Aug 21, 2025
fb241d0
[dcp][hf] Fix multi-rank consolidation for no files to process case (…
ankitageorge Aug 21, 2025
67fc16c
Add profiler analysis flag to combine multiple profiles into one (#16…
exclamaforte Aug 21, 2025
cc2b65a
[VLLM]setup test cli logics (#160361)
yangw-dev Aug 21, 2025
d1faf2e
[DTensor] Make default RNG semantics match user-passed generator (#16…
wconstab Aug 21, 2025
f085f29
[Inductor] Update Outer Reduction Heuristic (#159093)
PaulZhang12 Aug 21, 2025
cb57953
[BE] Enable `test_index_put_accumulate_duplicate_indices` on MPS (#16…
malfet Aug 21, 2025
fc0683b
Revert "[ATen][CPU][Sparse] Use Third-Party Eigen for sparse add and …
pytorchmergebot Aug 21, 2025
f5bf514
Bump uv from 0.8.4 to 0.8.6 in /.ci/lumen_cli (#161212)
dependabot[bot] Aug 21, 2025
46429be
[DCP][HF] Add option to parallelize reads in HF Storage Reader (#160205)
ankitageorge Aug 21, 2025
ff4f5dd
[nativert] oss layout planner tests (#160942)
dolpm Aug 22, 2025
a85711d
Avoid making node a successor/predecessor of itself (#161205)
eellison Aug 21, 2025
be2e6b3
[export] Remove unused Model, tensor_paths, constant_paths (#161185)
yiming0416 Aug 22, 2025
cc791d5
Quick fix to headers in stable/tensor_inl.h (#161168)
janeyx99 Aug 21, 2025
31a41da
[ROCm][Windows] Include native_transformers srcs to fix link errors. …
ScottTodd Aug 22, 2025
2fdd4f9
Log exception_stack_trace to dynamo_compile (#161096)
jovianjaison Aug 22, 2025
bf8431b
[inductor][cpu] Fix double-offset issue in `GEMM_TEMPLATE` (#159233)
Phoslight Aug 22, 2025
c60dea5
[export] Allow tempfile._TemporaryFileWrapper in package_pt2 (#161203)
yiming0416 Aug 22, 2025
c7fb031
[audio hash update] update the pinned audio hash (#161226)
pytorchupdatebot Aug 22, 2025
8aad3a6
[dynamo] propagate tensor metadata on Tensor.__setitem__(tensor) (#16…
xmfan Aug 21, 2025
0dea191
[VLLM TEST]setup test workflow (#160583)
yangw-dev Aug 22, 2025
bc7eaa0
[BE] Remove the default TORCH_CUDA_ARCH_LIST in CI Docker image (#161…
huydhn Aug 22, 2025
f8bd858
Optimzie `zero_grad` description (#161239)
zeshengzong Aug 22, 2025
9b3ebd2
[inductor] Enable max compatible to msvc for oneAPI headers. (#161196)
xuhancn Aug 22, 2025
c4670e4
[inductor] remove Windows unsupported build options. (#161197)
xuhancn Aug 22, 2025
595987d
[bucketing] allow convert_element_type after fsdp reduce_scatter (#16…
IvanKobzarev Aug 21, 2025
373e25c
Disable background threads for XPU host allocator (#161242)
guangyey Aug 22, 2025
9e491f7
[dynamo] Remove extra if statement in builder _wrap (#161215)
azahed98 Aug 22, 2025
9b4adc4
[fr] [xpu] Add FlightRecorder support for ProcessGroupXCCL (#158568)
frost-intel Aug 22, 2025
2beffb3
Refactoring TensorImpl by using constexpr and std::is_same_v (#161043)
fffrog Aug 20, 2025
774b4be
[BE] [dynamo] Simplify two methods in ConstDictVariable (#159361)
rec Aug 19, 2025
a68f63e
Add Windows CUDA 13 build and magma script (#161073)
tinglvv Aug 22, 2025
49ff884
Add CUDA 13.0 x86 builds (#160956)
tinglvv Aug 22, 2025
639b8cc
Revert "cd: Add no-cache for test binaries (#149218)"
pytorchmergebot Aug 22, 2025
db44de4
[inductor] Estimate peak memory allocfree and applying to reordering …
IvanKobzarev Aug 22, 2025
ce467df
rm platform args xplat/langtech/mobile/BUCK (#161018)
rexzhang123 Aug 22, 2025
c7a7747
Revert "[DTensor] Make default RNG semantics match user-passed genera…
pytorchmergebot Aug 22, 2025
7fcdd8d
Use ROCm MI325 runners for trunk.yml (#161184)
jithunnair-amd Aug 22, 2025
f09458c
Enable `test/test_numpy_interop.py` config in mypy (#158556)
zeshengzong Aug 22, 2025
c239008
[MPS] Fix index_select for scalar_types (#161206)
malfet Aug 21, 2025
e20f6d7
Move non inductor workflows to Python 3.9 -> 3.10 (#161182)
atalman Aug 22, 2025
25df65a
[ROCm] revamp HIPCachingAllocatorMasqueradingAsCUDA (#161221)
jeffdaily Aug 22, 2025
266784e
remove old while_loop_schema_gen test (#161202)
ydwu4 Aug 21, 2025
1d458e2
Revert "[Inductor] Update Outer Reduction Heuristic (#159093)"
pytorchmergebot Aug 22, 2025
97200c9
[inductor] Add get page_size support for Windows. (#161273)
xuhancn Aug 22, 2025
17b0263
[inductor] fix march=native pass to Windows CC. (#161264)
xuhancn Aug 22, 2025
a43480d
[CD] Enable triton xpu Windows build for Python 3.14 (#161255)
chuanqi129 Aug 22, 2025
eba1ad0
Revert "[SymmMem] Support rendezvous on view of a tensor (#160925)"
pytorchmergebot Aug 22, 2025
2c0650a
Revert "[BE][inductor] tl.dot(..., allow_tf32=...) -> tl.dot(..., inp…
pytorchmergebot Aug 22, 2025
3ea6cc8
Fix conv exhaustive autotuning and expand Exhaustive test coverage (#…
exclamaforte Aug 22, 2025
981ac53
Revert "Close some sources of fake tensor leakages (#159923)"
pytorchmergebot Aug 22, 2025
3f1a97a
Revert "[dynamic shapes] unbacked-safe slicing (#157944)"
pytorchmergebot Aug 22, 2025
2835cc5
[cuDNN] head dim > 128 works on H100 again in cuDNN SDPA? (#161210)
eqy Aug 22, 2025
9d882fd
[benchmark] Add torchscript jit.trace to benchmark option (#161223)
yiming0416 Aug 22, 2025
4c36c8a
[dynamo] Support method calls on complex ConstantVariables (#161122)
rtimpe Aug 22, 2025
c8bb0e4
[MPS] Fix `index_copy` for scalars (#161267)
malfet Aug 22, 2025
3373b07
[Profiler] Add GC Events to Python Stack Tracer (#161209)
sraikund16 Aug 22, 2025
419a2db
[ONNX] Remove enable_fake_mode and exporter_legacy (#161222)
justinchuby Aug 22, 2025
bcfe1b2
Add initial bc-linter configuration (#161319)
izaitsevfb Aug 22, 2025
f521e82
Update pyrefly config for better codenav (#161200)
lolpack Aug 22, 2025
0d9da38
Bump onnxscript to 0.4.0 in CI (#161312)
justinchuby Aug 22, 2025
47d2673
Revert "[SymmMem] Support rendezvous on slice of a tensor (#160825)"
pytorchmergebot Aug 22, 2025
cee7211
[Test] Adding a testcase for constant_pad_nd (#161259)
can-gaa-hou Aug 23, 2025
d228a77
[Inductor-FX] Support Tensorbox outputs (#161245)
blaine-rister Aug 23, 2025
121afd6
[MPS] Update `avg_pool2d` to use Metal kernel when `ceil_mode=True` (…
kurtamohler Aug 20, 2025
394728b
[MPS] Update `avg_pool3d` kernel to use `opmath_t` (#161071)
kurtamohler Aug 20, 2025
38a492d
[ONNX] Remove unused _onnx_supported_ops (#161322)
justinchuby Aug 22, 2025
ac8d941
[audio hash update] update the pinned audio hash (#161331)
pytorchupdatebot Aug 23, 2025
7131bfa
[vllm hash update] update the pinned vllm hash (#161227)
pytorchupdatebot Aug 23, 2025
36ac916
[ONNX] Fix lower opset version support in dynamo=True (#161056)
justinchuby Aug 23, 2025
6443ea3
enable more tests (#161192)
yangw-dev Aug 23, 2025
3a4140b
[FlexAttention] fixing learnable bias assertion error in inductor (#1…
liangel-02 Aug 23, 2025
22df59e
[inductor] add MSVC language pack check. (#161298)
xuhancn Aug 23, 2025
710514a
Revert "Enable output padding when only outermost dim is dynamic (#15…
pytorchmergebot Aug 23, 2025
cd31be2
Reland D80238201: [Torch.Export] Add flat arg paths in error message …
malaybag Aug 23, 2025
431846a
[AMD] Fix AMD User Defined Kernel Autotune (#160671)
oniononion36 Aug 23, 2025
33346b5
Support NUMA Binding for Callable Entrypoints, Take 2 (#161183)
pdesupinski Aug 23, 2025
f912c93
Revert "Move non inductor workflows to Python 3.9 -> 3.10 (#161182)"
pytorchmergebot Aug 23, 2025
4acdbb8
[MPS] Fix index_copy for strided indices (#161333)
malfet Aug 23, 2025
3e5b021
[ATen][CPU][Sparse] Use Third-Party Eigen for sparse add and addmm (#…
Aidyn-A Aug 23, 2025
1de4540
Use -compress-mode=size for CUDA 13 build for binary size reduction (…
tinglvv Aug 24, 2025
74280d0
[muon] Introduce Muon optimizer to PyTorch (#160213)
chuanhaozhuge Aug 24, 2025
726dce3
[nccl symm mem] don't use arg for mempool, correctly use symmetric re…
ngimel Aug 25, 2025
e3d68df
[DTensor] Make default RNG semantics match user-passed generator (#16…
wconstab Aug 24, 2025
80df27a
port distributed pipeline test files for Intel GPU (#159033)
wincent8 Aug 25, 2025
cb61c1a
enable symm on xpu
zhangxiaoli73 Jun 30, 2025
27a2420
enable fused matmul+reducescatter
zhangxiaoli73 Jul 3, 2025
7bb8e78
refine fused matmul and reducescatter
zhangxiaoli73 Jul 10, 2025
0339c94
fp8 scaled support
zhangxiaoli73 Jul 17, 2025
1b514dc
format
Chao1Han Jul 31, 2025
97b55d1
Enable XPU built with Level Zero (#321)
guangyey Jul 10, 2025
f4b119e
stage < TP
Chao1Han Aug 5, 2025
9bd58a6
register op
Chao1Han Aug 13, 2025
c1d75fb
update
Chao1Han Aug 25, 2025
3cc755e
Revert "Enable XPU built with Level Zero (#321)"
Chao1Han Sep 10, 2025
15 changes: 15 additions & 0 deletions .bc-linter.yml
@@ -0,0 +1,15 @@
version: 1
paths:
  include:
    - "**/*.py"
  exclude:
    - ".*"
    - ".*/**"
    - "**/.*/**"
    - "**/.*"
    - "**/_*/**"
    - "**/_*.py"
    - "**/test/**"
    - "**/benchmarks/**"
    - "**/test_*.py"
    - "**/*_test.py"
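The diff does not say how the linter combines these patterns. The following is a rough sketch only, assuming a path is checked when it matches at least one `include` pattern and no `exclude` pattern, that `**/` matches zero or more directories, and that a single `*` stays within one path component. Those matching rules and the helper names `glob_to_regex` and `is_checked` are assumptions for illustration, not the bc-linter's documented behavior.

```python
import re

# Patterns copied from the .bc-linter.yml hunk above.
INCLUDE = ["**/*.py"]
EXCLUDE = [
    ".*", ".*/**", "**/.*/**", "**/.*", "**/_*/**",
    "**/_*.py", "**/test/**", "**/benchmarks/**",
    "**/test_*.py", "**/*_test.py",
]


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob into a regex under the assumed semantics."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # anything within one path component
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_checked(path: str) -> bool:
    """True if the path is included and not excluded (assumed rule)."""
    included = any(glob_to_regex(p).fullmatch(path) for p in INCLUDE)
    excluded = any(glob_to_regex(p).fullmatch(path) for p in EXCLUDE)
    return included and not excluded


if __name__ == "__main__":
    for candidate in [
        "torch/nn/functional.py",
        "torch/_inductor/config.py",
        "test/test_torch.py",
        ".ci/lumen_cli/cli/run.py",
    ]:
        print(candidate, is_checked(candidate))
```

Under these assumptions, public modules such as `torch/nn/functional.py` are checked, while private (`_`-prefixed), hidden (`.`-prefixed), test, and benchmark paths are skipped.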
2 changes: 1 addition & 1 deletion .ci/docker/ci_commit_pins/torchbench.txt
@@ -1 +1 @@
-22bc29b4d503fc895ff73bc720ff396e9723465f
+e03a63be43e33596f7f0a43b0f530353785e4a59
2 changes: 1 addition & 1 deletion .ci/docker/common/install_onnx.sh
@@ -20,7 +20,7 @@ pip_install \

pip_install coloredlogs packaging
pip_install onnxruntime==1.22.1
-pip_install onnxscript==0.3.1
+pip_install onnxscript==0.4.0

# Cache the transformers model to be used later by ONNX tests. We need to run the transformers
# package to download the model. By default, the model is cached at ~/.cache/huggingface/hub/
5 changes: 5 additions & 0 deletions .ci/docker/libtorch/Dockerfile
@@ -69,6 +69,11 @@ RUN bash ./install_cuda.sh 12.9
RUN bash ./install_magma.sh 12.9
RUN ln -sf /usr/local/cuda-12.9 /usr/local/cuda

+FROM cuda as cuda13.0
+RUN bash ./install_cuda.sh 13.0
+RUN bash ./install_magma.sh 13.0
+RUN ln -sf /usr/local/cuda-13.0 /usr/local/cuda
+
FROM cpu as rocm
ARG ROCM_VERSION
ARG PYTORCH_ROCM_ARCH
6 changes: 6 additions & 0 deletions .ci/docker/manywheel/build.sh
@@ -67,6 +67,12 @@ case ${image} in
         DOCKER_GPU_BUILD_ARG="--build-arg BASE_CUDA_VERSION=${GPU_ARCH_VERSION} --build-arg DEVTOOLSET_VERSION=13"
         MANY_LINUX_VERSION="2_28"
         ;;
+    manylinux2_28-builder:cuda13*)
+        TARGET=cuda_final
+        GPU_IMAGE=amd64/almalinux:8
+        DOCKER_GPU_BUILD_ARG="--build-arg BASE_CUDA_VERSION=${GPU_ARCH_VERSION} --build-arg DEVTOOLSET_VERSION=13"
+        MANY_LINUX_VERSION="2_28"
+        ;;
     manylinuxaarch64-builder:cuda*)
         TARGET=cuda_final
         GPU_IMAGE=amd64/almalinux:8
2 changes: 1 addition & 1 deletion .ci/docker/requirements-ci.txt
@@ -339,7 +339,7 @@ onnx==1.18.0
#Pinned versions:
#test that import:

-onnxscript==0.3.1
+onnxscript==0.4.0
#Description: Required by mypy and test_public_bindings.py when checking torch.onnx._internal
#Pinned versions:
#test that import:
1 change: 0 additions & 1 deletion .ci/docker/ubuntu/Dockerfile
@@ -181,7 +181,6 @@ COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm
RUN if [ -n "${SKIP_LLVM_SRC_BUILD_INSTALL}" ]; then set -eu; rm -rf /opt/llvm; fi

# AWS specific CUDA build guidance
-ENV TORCH_CUDA_ARCH_LIST Maxwell
ENV TORCH_NVCC_FLAGS "-Xfatbin -compress-all"
ENV CUDA_PATH /usr/local/cuda

2 changes: 1 addition & 1 deletion .ci/lumen_cli/cli/build_cli/register_build.py
@@ -2,7 +2,7 @@
import logging

from cli.lib.common.cli_helper import register_targets, RichHelp, TargetSpec
-from cli.lib.core.vllm import VllmBuildRunner
+from cli.lib.core.vllm.vllm_build import VllmBuildRunner


logger = logging.getLogger(__name__)
71 changes: 71 additions & 0 deletions .ci/lumen_cli/cli/lib/common/pip_helper.py
@@ -0,0 +1,71 @@
import glob
import logging
import shlex
import shutil
import sys
from collections.abc import Iterable
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Union

from cli.lib.common.utils import run_command


logger = logging.getLogger(__name__)


def pip_install_packages(
    packages: Iterable[str] = (),
    env=None,
    *,
    requirements: Optional[str] = None,
    constraints: Optional[str] = None,
    prefer_uv: bool = False,
) -> None:
    use_uv = prefer_uv and shutil.which("uv") is not None
    base = (
        [sys.executable, "-m", "uv", "pip", "install"]
        if use_uv
        else [sys.executable, "-m", "pip", "install"]
    )
    cmd = base[:]
    if requirements:
        cmd += ["-r", requirements]
    if constraints:
        cmd += ["-c", constraints]
    cmd += list(packages)
    logger.info("pip installing packages: %s", " ".join(map(shlex.quote, cmd)))
    run_command(" ".join(map(shlex.quote, cmd)), env=env)


def pip_install_first_match(pattern: str, extras: Optional[str] = None, pref_uv=False):
    wheel = first_matching_pkg(pattern)
    target = f"{wheel}[{extras}]" if extras else wheel
    logger.info("Installing %s...", target)
    pip_install_packages([target], prefer_uv=pref_uv)


def run_python(args: Union[str, list[str]], env=None):
    """
    Run the python in the current environment.
    """
    if isinstance(args, str):
        args = shlex.split(args)
    cmd = [sys.executable] + args
    run_command(" ".join(map(shlex.quote, cmd)), env=env)


def pkg_exists(name: str) -> bool:
    try:
        pkg_version = version(name)
        logger.info("%s already exist with version: %s", name, pkg_version)
        return True
    except PackageNotFoundError:
        logger.info("%s is not installed", name)
        return False


def first_matching_pkg(pattern: str) -> str:
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"No wheel matching: {pattern}")
    return matches[0]
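To see what `pip_install_packages` would execute without running pip, the command assembly can be mirrored in a dry-run helper. This is a minimal sketch: `build_pip_command` and `demo_first_match` are hypothetical names that are not part of the PR, and `use_uv` stands in for the `prefer_uv and shutil.which("uv")` check. `first_matching_pkg` is copied from the file above.

```python
import glob
import os
import sys
import tempfile


def build_pip_command(packages=(), *, requirements=None, constraints=None, use_uv=False):
    """Hypothetical dry-run twin of pip_install_packages: returns the argv list."""
    base = (
        [sys.executable, "-m", "uv", "pip", "install"]
        if use_uv
        else [sys.executable, "-m", "pip", "install"]
    )
    cmd = base[:]
    if requirements:
        cmd += ["-r", requirements]
    if constraints:
        cmd += ["-c", constraints]
    cmd += list(packages)
    return cmd


def first_matching_pkg(pattern):
    """Same logic as in pip_helper.py: first glob match in sorted order."""
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"No wheel matching: {pattern}")
    return matches[0]


def demo_first_match():
    """Create two empty wheel files (made-up names) and return the one selected."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("demo-0.9.0-py3-none-any.whl", "demo-0.10.0-py3-none-any.whl"):
            open(os.path.join(tmp, name), "w").close()
        return os.path.basename(first_matching_pkg(os.path.join(tmp, "demo-*.whl")))


if __name__ == "__main__":
    print(build_pip_command(["numpy"], requirements="req.txt", constraints="cons.txt"))
    print(demo_first_match())
```

Note that `sorted()` orders the matches as plain strings, not as versions, so when several wheels match the pattern the selected one is the lexicographically smallest path. Callers that need a specific build would likely want a pattern that matches a single file.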
38 changes: 38 additions & 0 deletions .ci/lumen_cli/cli/lib/common/utils.py
@@ -7,6 +7,7 @@
import shlex
import subprocess
import sys
+from contextlib import contextmanager
from typing import Optional


@@ -77,3 +78,40 @@ def str2bool(value: Optional[str]) -> bool:
     if value in false_value_set:
         return False
     raise ValueError(f"Invalid string value for boolean conversion: {value}")
+
+
+@contextmanager
+def temp_environ(updates: dict[str, str]):
+    """
+    Temporarily set environment variables and restore them after the block.
+    Args:
+        updates: Dict of environment variables to set.
+    """
+    missing = object()
+    old: dict[str, str | object] = {k: os.environ.get(k, missing) for k in updates}
+    try:
+        os.environ.update(updates)
+        yield
+    finally:
+        for k, v in old.items():
+            if v is missing:
+                os.environ.pop(k, None)
+            else:
+                os.environ[k] = v  # type: ignore[arg-type]
+
+
+@contextmanager
+def working_directory(path: str):
+    """
+    Temporarily change the working directory inside a context.
+    """
+    if not path:
+        # No-op context
+        yield
+        return
+    prev_cwd = os.getcwd()
+    try:
+        os.chdir(path)
+        yield
+    finally:
+        os.chdir(prev_cwd)
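A short usage sketch for the two context managers added above. The helper bodies are copied from the diff (type annotations dropped) so the snippet runs on its own. The `demo` function and the `LUMEN_DEMO_*` variable names are made up for illustration and do not appear in the PR.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_environ(updates):
    # Sentinel distinguishes "variable was unset" from "variable was empty".
    missing = object()
    old = {k: os.environ.get(k, missing) for k in updates}
    try:
        os.environ.update(updates)
        yield
    finally:
        for k, v in old.items():
            if v is missing:
                os.environ.pop(k, None)
            else:
                os.environ[k] = v


@contextmanager
def working_directory(path):
    if not path:
        # Empty path: behave as a no-op context.
        yield
        return
    prev_cwd = os.getcwd()
    try:
        os.chdir(path)
        yield
    finally:
        os.chdir(prev_cwd)


def demo():
    """Return (state inside the contexts, state after leaving them)."""
    os.environ.pop("LUMEN_DEMO_NEW", None)       # hypothetical, unset beforehand
    os.environ["LUMEN_DEMO_KEEP"] = "original"   # hypothetical, set beforehand
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        overrides = {"LUMEN_DEMO_NEW": "1", "LUMEN_DEMO_KEEP": "changed"}
        with temp_environ(overrides), working_directory(tmp):
            inside = (
                os.environ["LUMEN_DEMO_NEW"],
                os.environ["LUMEN_DEMO_KEEP"],
                os.path.realpath(os.getcwd()) == os.path.realpath(tmp),
            )
        after = (
            "LUMEN_DEMO_NEW" in os.environ,
            os.environ["LUMEN_DEMO_KEEP"],
            os.getcwd() == start,
        )
    os.environ.pop("LUMEN_DEMO_KEEP", None)
    return inside, after


if __name__ == "__main__":
    print(demo())
```

Because both helpers restore state in a `finally` clause, the previous environment and working directory come back even if the body raises.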