
Conversation

Chao1Han
Contributor

No description provided.

@Copilot Copilot AI review requested due to automatic review settings September 16, 2025 06:50

@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds a test for XCCL high priority stream functionality to the distributed communication test suite. It introduces a new test method that validates XCCL process group options, specifically testing the is_high_priority_stream configuration.

  • Adds _test_pass_xccl_options helper method for testing XCCL process group options
  • Implements test_pass_xccl_options_high_priority_stream to verify high priority stream functionality


# test the process group works as expected
t = torch.tensor([self.rank + 1] * 10).xpu(self.rank)
pg.allreduce(t).wait()
expected_tensor = torch.tensor([3] * 10).xpu(self.rank)

Copilot AI Sep 16, 2025


The magic number 3 in the expected tensor should be calculated explicitly. For a 2-GPU setup (world_size=2), the sum would be (0+1) + (1+1) = 3, but this calculation isn't obvious. Consider using sum(range(self.world_size)) + self.world_size or adding a comment explaining the calculation.

Suggested change
expected_tensor = torch.tensor([3] * 10).xpu(self.rank)
# The expected value is the sum of (rank + 1) for all ranks in the group
expected_value = sum(rank + 1 for rank in range(self.world_size))
expected_tensor = torch.tensor([expected_value] * 10).xpu(self.rank)


github-merge-queue bot pushed a commit that referenced this pull request Sep 17, 2025
Support high priority stream for XCCL; test case added in
#2049.
We need to merge this PR first, together with the upstream op
registration pytorch/pytorch#163049, before the test case
can pass.

---------

Co-authored-by: mengfei25 <[email protected]>
mengfei25 added a commit that referenced this pull request Sep 17, 2025
Support high priority stream for XCCL; test case added in
#2049.
We need to merge this PR first, together with the upstream op
registration pytorch/pytorch#163049, before the test case
can pass.

---------

Co-authored-by: mengfei25 <[email protected]>