Commit 573f426

Fixed a bug in the build script. Removed the ubuntu-cuda folder; CUDA images now use the Dockerfile in the ubuntu folder.
1 parent 4fa2c49 commit 573f426

3 files changed: 4 additions and 49 deletions

.ci/docker/build.sh

Lines changed: 1 addition & 3 deletions
@@ -17,9 +17,7 @@ OS=ubuntu

 # set Dockerfile
 DOCKERFILE="${OS}/Dockerfile"
-if [[ "$IMAGE_NAME" == *cuda* ]]; then
-  DOCKERFILE="${OS}-cuda/Dockerfile"
-elif [[ "$IMAGE_NAME" == *rocm* ]]; then
+if [[ "$IMAGE_NAME" == *rocm* ]]; then
   DOCKERFILE="${OS}-rocm/Dockerfile"
 fi

.ci/docker/ubuntu-cuda/Dockerfile

Lines changed: 0 additions & 41 deletions
This file was deleted.
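
With ubuntu-cuda/Dockerfile deleted, the selection logic in build.sh appears to reduce to a single special case: image names containing "rocm" pick the ROCm Dockerfile, and everything else, CUDA images included, falls through to the default. The sketch below restates that rule as a standalone function so it can be checked in isolation. The function name select_dockerfile and the image name example-cuda-image are hypothetical and do not come from the commit; the POSIX `case` statement stands in for the `[[ ... == *rocm* ]]` test used in the script.

```shell
#!/bin/sh
# Hypothetical helper mirroring the post-commit Dockerfile selection.
# $1: OS directory name (build.sh uses OS=ubuntu)
# $2: image name
select_dockerfile() {
  case "$2" in
    # Only ROCm images get a dedicated Dockerfile directory.
    *rocm*) printf '%s\n' "$1-rocm/Dockerfile" ;;
    # Everything else, including CUDA images, uses the default.
    *)      printf '%s\n' "$1/Dockerfile" ;;
  esac
}

# "example-cuda-image" is an illustrative name, not one from the repository.
select_dockerfile ubuntu example-cuda-image
select_dockerfile ubuntu torchtitan-rocm-pytorch-nightly-ubuntu-22.04-clang19-py3
```

Under these assumptions the first call prints ubuntu/Dockerfile and the second prints ubuntu-rocm/Dockerfile.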

.github/workflows/integration_test_8gpu_h100_rocm.yaml renamed to .github/workflows/integration_test_8gpu_rocm.yaml

Lines changed: 3 additions & 5 deletions
@@ -1,4 +1,4 @@
-name: 8 GPU Integration Test at H100
+name: 8 GPU Integration Test

 on:
   push:
@@ -17,13 +17,11 @@ defaults:

 jobs:
   build-test:
-    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
+    uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
     with:
       runner: linux.rocm.gpu.mi300.8
       gpu-arch-type: rocm
       gpu-arch-version: "6.4"
-      # This image is faster to clone than the default, but it lacks CC needed by triton
-      # (1m25s vs 2m37s).
       docker-image: torchtitan-rocm-pytorch-nightly-ubuntu-22.04-clang19-py3
       repository: pytorch/torchtitan
       upload-artifact: outputs
@@ -33,5 +31,5 @@ jobs:
         USE_CPP=0 python -m pip install --pre torchao

         mkdir artifacts-to-be-uploaded
-        python ./tests/integration_tests_h100.py artifacts-to-be-uploaded --ngpu 8
+        python ./tests/integration_tests.py artifacts-to-be-uploaded --ngpu 8
