Add functional test for 2D block lowering of tensor pointers. #3876

chengjunlu · 2025-04-09T08:03:54Z

Add functional test for 2D block lowering of tensor pointers.

Copilot

Copilot reviewed 1 out of 2 changed files in this pull request and generated no comments.

Files not reviewed (1)

third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp: Language not supported

Comments suppressed due to low confidence (2)

python/test/unit/intel/test_block_load.py:25

[nitpick] The string representation in the __str__ method uses inconsistent spacing around '='. Consider standardizing the formatting for clarity in logging and debugging.

return f"#triton_intel_gpu.dpas<{ {repeatCount={self.repeatCount}, systolicDepth={self.systolic_depth}, executionSize = {self.execution_size}, opsPerChan = {self.ops_per_chan}, threadsPerWarp = {self.threads_per_warp}, warpsPerCTA={self.warps_per_cta}, repCluster={self.rep_cluster}}}>"
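A minimal sketch of what the nitpick asks for, assuming a dataclass whose field names match the ones visible in the quoted line and in the layout list in this PR. The real `DpasLayout` class in test_block_load.py may differ, and `key = value` spacing is only one consistent choice:

```python
from dataclasses import dataclass


@dataclass
class DpasLayout:
    # Field names mirror those visible in the PR; the class body is an assumption.
    repeatCount: int
    systolic_depth: int
    execution_size: int
    ops_per_chan: int
    threads_per_warp: int
    warps_per_cta: list
    rep_cluster: list

    def __str__(self):
        # Build every "key = value" pair the same way so the spacing cannot drift.
        attrs = {
            "repeatCount": self.repeatCount,
            "systolicDepth": self.systolic_depth,
            "executionSize": self.execution_size,
            "opsPerChan": self.ops_per_chan,
            "threadsPerWarp": self.threads_per_warp,
            "warpsPerCTA": self.warps_per_cta,
            "repCluster": self.rep_cluster,
        }
        body = ", ".join(f"{key} = {value}" for key, value in attrs.items())
        # Doubled braces emit literal '{' and '}' around the attribute list.
        return f"#triton_intel_gpu.dpas<{{{body}}}>"
```

Generating the pairs from one dictionary keeps the separator in a single place, so a later field cannot reintroduce mixed spacing.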

python/test/unit/intel/test_block_load.py:199

There is a commented-out assertion intended to check for 2D block io support. Either uncomment and enable this assertion to validate the functionality or remove it to avoid confusion.

# assert '2d block io' in kernel.asm['llir']

whitneywhtsang · 2025-04-09T12:14:12Z

python/test/unit/intel/test_block_load.py

+               warps_per_cta=[8, 4], rep_cluster=[4, 2]),
+    DpasLayout(repeatCount=8, systolic_depth=8, execution_size=16, ops_per_chan=1, threads_per_warp=32,
+               warps_per_cta=[8, 4], rep_cluster=[1, 1]),
+    # Layout for Xe


Suggested change

# Layout for Xe

whitneywhtsang · 2025-04-09T12:19:09Z

python/test/unit/intel/test_block_load.py

+    if support_block_io:
+        # assert '2d block io' in kernel.asm['llir']
+        pass


Why do we early exit when block io is supported?

etiotto · 2025-04-09

Right. Also, the temp file will be left behind and should be removed.

chengjunlu · 2025-04-11

The check is meant to make sure that 2D block IO is used. It is based on the SPIRV extension. Right now there are too many unsupported 2D block IO variants of the OCL interface.

whitneywhtsang · 2025-04-11

I see. If we merge this PR before the SPIRV extension is ready, then we probably want to remove lines 199-201.
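One way to address the temp-file concern raised in this thread, sketched with the standard library only: write the IR into a `tempfile.TemporaryDirectory` so the file is removed when the block exits. The names `run_with_temp_ir` and `kernel.ttgir`, and the `consume` callback standing in for whatever compiles the file, are hypothetical and not taken from the PR:

```python
import tempfile
from pathlib import Path


def run_with_temp_ir(ir_text, consume):
    """Write ir_text to a temporary file, pass its path to consume, then clean up."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        ir_path = Path(tmp_dir) / "kernel.ttgir"  # hypothetical file name
        ir_path.write_text(ir_text)
        # Everything that needs the file on disk must happen inside the with-block.
        result = consume(ir_path)
    # The directory and the file inside it are deleted once the with-block exits.
    return result, ir_path
```

Inside a pytest test, the built-in `tmp_path` fixture is another option; pytest then manages the directory itself.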

whitneywhtsang · 2025-04-09T12:21:06Z

python/test/unit/intel/test_block_load.py

+
+@pytest.mark.parametrize("M, N", [[M, N] for M, N in itertools.product([32, 64, 128, 256], [32, 64, 128, 256])])
+@pytest.mark.parametrize("dtype_str", ["float32", "float16", "int8"])
+@pytest.mark.parametrize("layout", layouts)


should we check for the same dpas layouts for test_block_load_dpas_layout?
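For a sense of the test matrix, the shape parametrization in the hunk above can be expanded on its own. This sketch only reproduces the `itertools.product` call; `sizes`, `shapes`, and `cases_per_layout` are names introduced here, and since the number of layouts is not visible in this excerpt the count is given per layout:

```python
import itertools

sizes = [32, 64, 128, 256]
# Same comprehension as in the parametrize decorator: every (M, N) combination.
shapes = [[M, N] for M, N in itertools.product(sizes, sizes)]
dtypes = ["float32", "float16", "int8"]

# Stacked parametrize decorators multiply, so each layout runs this many cases.
cases_per_layout = len(shapes) * len(dtypes)
print(len(shapes), cases_per_layout)
```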

chengjunlu · 2025-04-11T02:42:05Z

Convert to draft.
The test case is based on #3896 and #3751.

etiotto · 2025-07-15T18:39:45Z

@chengjunlu can we close this PR?

chengjunlu requested review from Copilot, etiotto, whitneywhtsang and alexbaden April 9, 2025 08:04

Copilot AI reviewed Apr 9, 2025

whitneywhtsang reviewed Apr 9, 2025

chengjunlu force-pushed the chengjun/add_tensor_pointer_load_test_case branch 7 times, most recently from 28e9720 to c9022fc April 11, 2025 02:40

chengjunlu marked this pull request as draft April 11, 2025 02:41

chengjunlu force-pushed the chengjun/add_tensor_pointer_load_test_case branch from c9022fc to 25e3a8b April 11, 2025 05:01

chengjunlu and others added 8 commits April 11, 2025 12:53

Add unit test for lowering the tt.load of tensor of pointers with structured memory. Fix segfault in LoadOpToBlockIOConversion Signed-off-by: Whitney Tsang <[email protected]>

51983d2

[Intel] Rework load-store redundant data masking 1/?

3842183

[Intel] Rework load-store redundant data masking 2/?

028f868

[Intel] Rework load-store redundant data masking 3/? (atomic cas)

41cc61f

[Intel] Rework load-store redundant data masking 4/? (atomic rmw)

593343b

[Intel] Rework load-store redundant data masking 5/5 (remove debug code and old implementation)

921f6d0

fixup lit tests

25e3a8b

Add debug disable fast math

c7600c6

chengjunlu mentioned this pull request Apr 25, 2025

Fix regression issue in flex decoding. #3999

Merged

chengjunlu closed this Jul 16, 2025
