[CI] Add more ported distributed cases #2082
base: main
Conversation
Force-pushed: cb15ece → 1cbc6b9
Force-pushed: c5009f3 → 0d9b54f
Force-pushed: 0d9b54f → 85fa6f1
Please split the test scope into a CI scope and a nightly full scope.
inputs:
  ut_name:
    required: true
-    required: true
+    required: false
    ze = xpu_list[i+1];
  } else {
    ze = i;
if [ "${{ inputs.ut_name }}" == "xpu_distributed" ];then
Are there any assumptions here? Can we detect the topology directly and dynamically on the test node?
Please consider the scenarios below:
- No Xelink group: fail the job
- 1 Xelink group, launch 1 worker
- 2 Xelink group, launch 2 workers
- ...
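The scenario list above could be handled by deriving the worker count from the detected topology instead of hard-coding it. A minimal bash sketch, assuming the detected Xelink groups arrive as a space-separated list of device sets (how they are detected is deliberately left open); the function name `launch_workers` and the input format are hypothetical, not from this PR:

```shell
# Hypothetical sketch: one worker per detected Xelink group.
# Input: a space-separated list of comma-joined device ids,
# e.g. "0,1 2,3" means two groups of two devices each.
launch_workers() {
    local -a groups
    read -r -a groups <<< "${1:-}"
    local num_groups="${#groups[@]}"

    if [ "${num_groups}" -eq 0 ]; then
        # Scenario 1 above: no Xelink group detected -> fail.
        echo "No Xelink group detected, failing the job" >&2
        return 1
    fi

    # Scenarios 2..N: launch one worker per group, each pinned to its
    # group's devices via ZE_AFFINITY_MASK (Level Zero affinity env var).
    local i
    for i in "${!groups[@]}"; do
        echo "worker ${i}: ZE_AFFINITY_MASK=${groups[$i]}"
    done
}
```

With `"0,1 2,3"` this prints two worker lines; with an empty input it returns non-zero, matching the failure scenario.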
runner:
  runs-on: ${{ inputs.runner }}
  name: get-runner
Why do we have this change?
This PR intends to add more ported distributed cases to the torch-xpu-ops CI, and adds pytest-xdist for the distributed UT.
The distributed UT time will increase to 2h20min with 2 work groups
(reference: 3h3m for 1 work group, https://github.com/intel/torch-xpu-ops/actions/runs/17902859755/job/50907350984).
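For context, pytest-xdist shards tests across workers with `-n`, and `--dist loadscope` keeps all tests from one module or class on the same worker, which matters for distributed cases that share process-group setup. A sketch of how such an invocation could be assembled; the helper name and test path are illustrative, not the PR's actual script:

```shell
# Build the UT command from a worker count. build_ut_cmd and the
# test path are placeholders, not names from this PR.
build_ut_cmd() {
    local workers="$1" ut_path="$2"
    # -n <N>: run N xdist workers in parallel.
    # --dist loadscope: schedule whole modules/classes onto one worker
    # so shared fixtures and process groups stay together.
    echo "pytest -n ${workers} --dist loadscope ${ut_path}"
}

# Example: two work groups, as in the timing comparison.
build_ut_cmd 2 test/distributed
```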
disable_e2e
disable_ut