Cover DispatchRays index/dimensions + SBT miss/hit-group routing #1277

Merged: EmilioLaiso merged 1 commit into llvm:main from Traverse-Research:rt-pso-tests-dispatch on Jun 26, 2026

MarijnS95 commented Jun 3, 2026

Collaborator

Depends on #1281

Summary

Four small PSO raytracing tests stacked on top of #1275, each isolating one shader-observable surface from the 👍 list in #1268. Same shape as the inline-RT batch already in flight in #1271 / #1272 / #1274 / #1276 — one .test file per behavior, single-purpose shader, exact buffer comparison.

  • dispatch-rays-index.test — 4x1x1 dispatch, raygen writes DispatchRaysIndex().x into Output[index]. Confirms the dispatch grid plumbs through to the per-lane system value with no BLAS / TLAS / hit groups in play (RT-pipeline-only, no AS binding).
  • dispatch-rays-dimensions.test — 2x3x1 dispatch, raygen packs the constant DispatchRaysDimensions() into one uint per lane. Confirms every lane sees the host-side {W, H, D} even when only one dimension > 1.
  • miss-shader-index.test — two miss shaders writing distinct sentinels (0xAA / 0xBB). 2-lane dispatch picks MissShaderIndex 0 and 1 respectively; rays start far enough from the geometry that every ray misses. Verifies the SBT miss region's per-record routing.
  • ray-contribution-to-hit-group-index.test — two hit groups with distinct closest-hit shaders (0xA1 / 0xB2). 2-lane dispatch picks RayContributionToHitGroupIndex 0 and 1, every ray hits the same triangle. Verifies the SBT hit-group region's per-record routing.
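
For illustration, the expected buffers of the first two tests can be reproduced with a small host-side model. This is a sketch in Python, not the test suite's code: the row-major `linear_index` flattening and the `pack_dims` bit layout are assumptions made for the example, since the description above does not spell out how the shaders index `Output` or pack the dimensions.

```python
from itertools import product


def raygen_lanes(width, height, depth):
    """Yield the DispatchRaysIndex (x, y, z) of every lane in the grid."""
    for z, y, x in product(range(depth), range(height), range(width)):
        yield (x, y, z)


def linear_index(idx, dims):
    """Assumed row-major flattening used to pick an Output slot."""
    x, y, z = idx
    w, h, _ = dims
    return x + y * w + z * w * h


def pack_dims(dims):
    """Assumed packing: W in bits 0-7, H in bits 8-15, D in bits 16-23."""
    w, h, d = dims
    return w | (h << 8) | (d << 16)


def run_index_test(dims=(4, 1, 1)):
    """Each lane stores DispatchRaysIndex().x into its own slot."""
    out = [None] * (dims[0] * dims[1] * dims[2])
    for idx in raygen_lanes(*dims):
        out[linear_index(idx, dims)] = idx[0]
    return out


def run_dimensions_test(dims=(2, 3, 1)):
    """Each lane stores the same packed DispatchRaysDimensions()."""
    out = [None] * (dims[0] * dims[1] * dims[2])
    for idx in raygen_lanes(*dims):
        out[linear_index(idx, dims)] = pack_dims(dims)
    return out


index_output = run_index_test()
dimensions_output = run_dimensions_test()
```

Under these assumptions the 4x1x1 dispatch yields one slot per lane holding its own x coordinate, and the 2x3x1 dispatch yields six identical values, which is what makes an exact buffer comparison possible.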

The first two have no AS / Miss / HitGroup in their pipeline at all — just a raygen + a UAV — which doubles as a regression check for the minimum viable RT pipeline shape (one raygen group, zero-sized miss / hit / callable SBT regions). The latter two reuse the single-triangle BLAS / TLAS from raygen-roundtrip.test.
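
The routing the latter two tests check follows the DXR shader-table addressing rules: a miss record is selected directly by `MissShaderIndex`, and a hit-group record by `RayContributionToHitGroupIndex` plus the geometry and instance contributions. A minimal Python sketch follows; the `SbtRegion` type, the start offsets and the 64-byte stride are hypothetical values chosen for the example, and each record is reduced to the sentinel its shader writes.

```python
from dataclasses import dataclass, field


@dataclass
class SbtRegion:
    """Hypothetical model of one shader binding table region."""
    start: int = 0                 # byte offset of the first record
    stride: int = 0                # bytes between consecutive records
    sentinels: list = field(default_factory=list)  # value each record's shader writes

    @property
    def size(self):
        return self.stride * len(self.sentinels)

    def record_address(self, index):
        if not 0 <= index < len(self.sentinels):
            raise IndexError(f"record {index} is outside the region")
        return self.start + self.stride * index

    def sentinel(self, index):
        self.record_address(index)  # reuse the bounds check
        return self.sentinels[index]


def hit_group_record_index(ray_contribution, multiplier=0,
                           geometry_index=0, instance_contribution=0):
    """DXR hit-group record index for one ray / geometry / instance."""
    return ray_contribution + multiplier * geometry_index + instance_contribution


miss_region = SbtRegion(start=0x100, stride=64, sentinels=[0xAA, 0xBB])
hit_region = SbtRegion(start=0x200, stride=64, sentinels=[0xA1, 0xB2])
empty_region = SbtRegion()  # zero-sized region, as in the raygen-only pipelines

# Lane N traces with MissShaderIndex = N (every ray misses).
miss_output = [miss_region.sentinel(lane) for lane in range(2)]
# Lane N traces with RayContributionToHitGroupIndex = N (every ray hits).
hit_output = [hit_region.sentinel(hit_group_record_index(lane)) for lane in range(2)]
```

With a single geometry and a zero instance contribution, the hit-group index collapses to the per-ray contribution, so each lane's sentinel identifies exactly which record was selected.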

All four tests are # REQUIRES: raytracing-pipeline with # XFAIL: Clang, because clang-dxc doesn't yet lower [shader("…")] entry points to either DXIL libraries or SPIR-V. With the Metal RT bring-up in #1281 rebased underneath this branch, all four pass natively on Apple Silicon and Metal is dropped from the XFAIL list.

Test plan

Local on an NVIDIA RTX 3060:

  • Linux Vulkan (native offloader)
  • Linux D3D12 (Wine + vkd3d-proton + cross-compiled offloader.exe)
  • Windows Vulkan (native offloader.exe)
  • Windows D3D12 (native offloader.exe)

CI (RT-capable runners):

  • windows-nvidia D3D12 (RaytracingTier 1.2)
  • windows-intel VK (VK_KHR_ray_tracing_pipeline)
  • macOS Metal (supportsRaytracing)

MarijnS95 added a commit to Traverse-Research/offload-test-suite that referenced this pull request Jun 3, 2026
…ldRay

Three small PSO RT tests stacked on llvm#1275, each isolating one shader-
observable closest-hit system value from llvm#1268's 👍 list. Same shape as
the prior batch in llvm#1277 — one .test file per behavior, single-purpose
shader, exact buffer comparison.

  - `closest-hit-barycentrics.test` — 3-lane dispatch, each lane fires
    at a clearly interior point of the single triangle so the
    closest-hit shader reports a known
    `BuiltInTriangleIntersectionAttributes::barycentrics` (u, v).
    Points are picked from the inside of the triangle to avoid the
    watertight-traversal edge-rule lottery at edge midpoints and
    vertices (the first cut of this test used midpoint(v0, v1) and one
    lane silently missed on both backends).
  - `closest-hit-primitive-index.test` — three triangles tiled at x =
    -3, 0, +3 in a single BLAS. 3-lane dispatch fires straight down at
    each triangle's centroid; the closest-hit reports `PrimitiveIndex()`
    and must match the lane index 0..2.
  - `closest-hit-world-ray.test` — 2-lane dispatch with rays from
    different z heights (1.0 and 2.0). Closest-hit packs
    `WorldRayOrigin().z`, `WorldRayDirection().z`, and `RayTCurrent()`
    through the payload; raygen flattens the float3 into a 6-element
    Float32 buffer. Verifies the system values match the raygen-side
    `RayDesc` and that t is correctly computed by the traversal.
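
As a reference for the values these tests compare against, the barycentrics and hit distance can be computed on the host with a standard Möller-Trumbore intersection. The sketch below is Python and rests on assumptions: the triangle vertices, the ray origins' x/y and the straight-down direction are hypothetical, not the geometry used by the tests. In DXR, `barycentrics.x` weights v1 and `barycentrics.y` weights v2, which matches the (u, v) returned here.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def intersect(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore: return (t, u, v) or None when the ray misses."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return (t, u, v) if t > eps else None


# Hypothetical triangle in the z = 0 plane and rays fired along -z.
V0, V1, V2 = (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 1.0, 0.0)
DOWN = (0.0, 0.0, -1.0)

world_ray_output = []
for z in (1.0, 2.0):            # one lane per ray height
    origin = (0.0, 0.0, z)
    t, u, v = intersect(origin, DOWN, V0, V1, V2)
    # WorldRayOrigin().z, WorldRayDirection().z, RayTCurrent()
    world_ray_output += [origin[2], DOWN[2], t]

barycentrics = intersect((0.0, 0.0, 1.0), DOWN, V0, V1, V2)[1:]
```

For this assumed geometry the hit point (0, 0, 0) equals v0 + 0.25 (v1 - v0) + 0.5 (v2 - v0), well inside the triangle, so no edge rule is involved and t equals the ray's starting height.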

All three are `# REQUIRES: raytracing-pipeline` with `# XFAIL: Clang` —
`clang-dxc` doesn't yet lower `[shader(…)]` entry points. With the Metal
RT bring-up rebased on top, all three pass natively on Apple Silicon
and Metal is dropped from the XFAIL list.

Locally verified end-to-end on the user's Linux box: all three pass on
Vulkan via the native offloader, and on D3D12 via Wine + vkd3d-proton +
the cross-compiled `offloader.exe`, against an NVIDIA RTX 3060. And on
macOS 15 / metal-irconverter 3.1.1 via the native offloader: all three
PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MarijnS95 added a commit to Traverse-Research/offload-test-suite that referenced this pull request Jun 3, 2026
Three tests stacked on llvm#1275 covering features that inline RT (RayQuery
in compute) physically can't express — they're only reachable through
a DispatchRays-driven RT pipeline.

  - `trace-ray-recursion.test` — closest-hit fires a secondary TraceRay
    from an above-the-triangle origin. First-level CH sees payload=0 →
    bumps to 0x1 → calls TraceRay. Second-level CH sees payload!=0 →
    writes 0x10. On unwind, the first-level CH ORs in 0x100, giving a
    final payload of 0x110 (272 decimal).
    `RayTracingPipelineConfig.MaxTraceRecursionDepth: 2` keeps both
    TraceRay calls within budget.
  - `ray-flag-skip-closest-hit.test` — two lanes fire identical rays at
    the same triangle. Lane 0 uses RAY_FLAG_NONE so CH runs and writes
    0xBEEF. Lane 1 uses RAY_FLAG_SKIP_CLOSEST_HIT_SHADER (PSO-only —
    inline RT has no equivalent) so CH is skipped and payload keeps its
    initial 0xAAAA. Output [0xBEEF, 0xAAAA].
  - `callable-shader.test` — two callable shaders writing distinct
    sentinels (0xAAAA / 0xBBBB). Each lane calls `CallShader(Idx, ...)`
    so the SBT callable region's per-record routing is exercised
    independently of the hit-group / miss routing already covered in
    llvm#1277. Callable shaders themselves don't exist in inline RT.

    This test stays `# XFAIL: Clang, Vulkan` because DXC's `-spirv`
    backend lists every callable's IncomingCallableDataKHR variable in
    every callable entry point's interface, which violates
    VUID-StandaloneSpirv-IncomingCallableDataKHR-04706 and is rejected
    by vkCreateShaderModule. The framework's Vulkan SBT / callable path
    is correct: running `spirv-opt --remove-unused-interface-variables`
    on the DXC output cleans the SPIR-V and the test passes natively.
    Track upstream.
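
The payload arithmetic of the recursion and skip-closest-hit tests can be traced with a small model. This Python sketch assumes every ray hits, that the nested TraceRay reuses the same payload, and that raygen's TraceRay counts as depth 1; raising on an over-budget trace is a choice made for the example only. The two flag values are the HLSL `RAY_FLAG` constants.

```python
RAY_FLAG_NONE = 0x00
RAY_FLAG_SKIP_CLOSEST_HIT_SHADER = 0x08


def trace_ray(payload, flags, closest_hit, depth, max_depth):
    """Model a TraceRay that always hits; payload is a one-element list."""
    if depth > max_depth:
        raise RuntimeError("TraceRay exceeds MaxTraceRecursionDepth")
    if not (flags & RAY_FLAG_SKIP_CLOSEST_HIT_SHADER):
        closest_hit(payload, depth, max_depth)


def recursive_closest_hit(payload, depth, max_depth):
    if payload[0] == 0:
        payload[0] = 0x1                      # first level marks the payload
        trace_ray(payload, RAY_FLAG_NONE, recursive_closest_hit,
                  depth + 1, max_depth)       # secondary ray
        payload[0] |= 0x100                   # ORed in while unwinding
    else:
        payload[0] = 0x10                     # second level overwrites


def beef_closest_hit(payload, depth, max_depth):
    payload[0] = 0xBEEF


def run_recursion(max_depth=2):
    payload = [0]
    trace_ray(payload, RAY_FLAG_NONE, recursive_closest_hit, 1, max_depth)
    return payload[0]


def run_skip_closest_hit():
    out = []
    for flags in (RAY_FLAG_NONE, RAY_FLAG_SKIP_CLOSEST_HIT_SHADER):
        payload = [0xAAAA]                    # initial payload per lane
        trace_ray(payload, flags, beef_closest_hit, 1, 1)
        out.append(payload[0])
    return out


recursion_result = run_recursion()
skip_result = run_skip_closest_hit()
```

Stepping through `run_recursion` gives 0 -> 0x1 -> 0x10 -> 0x110, and `run_skip_closest_hit` leaves lane 1 at its initial value because the closest-hit shader is never invoked for it.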

All three pass on Metal once the bring-up PR ahead of this commit sets
the raygen pipeline's `setMaxCallStackDepth(MaxTraceRecursionDepth)` so
nested TraceRay actually unwinds (with the default of 1, the second
TraceRay was silently dropped and the recursion test produced 0x1
instead of 0x110).

Locally verified on the user's Linux box:
  - Vulkan via the native offloader: recursion + skip-CH PASS;
    callable PASSes after spirv-opt cleanup (XFAILs from raw DXC SPIR-V
    as documented above).
  - D3D12 via Wine + vkd3d-proton + cross-compiled offloader.exe: all
    three PASS.
And on macOS 15 / metal-irconverter 3.1.1 via the native offloader:
  - All three PASS (recursion + skip-CH + callable).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MarijnS95 force-pushed the rt-pso-tests-dispatch branch from 1c41940 to 29c73ee on June 3, 2026 13:24
MarijnS95 force-pushed the rt-pso-tests-dispatch branch from 29c73ee to 1b0010a on June 8, 2026 12:42
MarijnS95 added a commit to Traverse-Research/offload-test-suite that referenced this pull request Jun 8, 2026
…ldRay

MarijnS95 added a commit to Traverse-Research/offload-test-suite that referenced this pull request Jun 8, 2026
EmilioLaiso force-pushed the rt-pso-tests-dispatch branch from 1b0010a to c2f41d4 on June 24, 2026 07:47
EmilioLaiso marked this pull request as ready for review on June 24, 2026 07:47
EmilioLaiso pushed a commit to Traverse-Research/offload-test-suite that referenced this pull request Jun 24, 2026
…ldRay

EmilioLaiso pushed a commit to Traverse-Research/offload-test-suite that referenced this pull request Jun 24, 2026
Three tests stacked on llvm#1275 covering features that inline RT (RayQuery
in compute) physically can't express — they're only reachable through
a DispatchRays-driven RT pipeline.

  - `trace-ray-recursion.test` — closest-hit fires a secondary TraceRay
    from an above-the-triangle origin. First-level CH sees payload=0 →
    bumps to 0x1 → calls TraceRay. Second-level CH sees payload!=0 →
    writes 0x10. Unwinds: first-level OR's in 0x100. Final payload
    0x110 (272 decimal). `RayTracingPipelineConfig.MaxTraceRecursion-
    Depth: 2` so both TraceRay calls are within budget.
  - `ray-flag-skip-closest-hit.test` — two lanes fire identical rays at
    the same triangle. Lane 0 uses RAY_FLAG_NONE so CH runs and writes
    0xBEEF. Lane 1 uses RAY_FLAG_SKIP_CLOSEST_HIT_SHADER (PSO-only —
    inline RT has no equivalent) so CH is skipped and payload keeps its
    initial 0xAAAA. Output [0xBEEF, 0xAAAA].
  - `callable-shader.test` — two callable shaders writing distinct
    sentinels (0xAAAA / 0xBBBB). Each lane calls `CallShader(Idx, ...)`
    so the SBT callable region's per-record routing is exercised
    independently of the hit-group / miss routing already covered in
    llvm#1277. Callable shaders themselves don't exist in inline RT.

    This test stays `# XFAIL: Clang, Vulkan` because DXC's `-spirv`
    backend lists every callable's IncomingCallableDataKHR variable in
    every callable entry point's interface, violating VUID-Standalone-
    Spirv-IncomingCallableDataKHR-04706 and getting rejected by vk-
    CreateShaderModule. The framework's Vulkan SBT / callable path is
    correct — running `spirv-opt --remove-unused-interface-variables`
    on the DXC output cleans the SPIR-V and the test passes natively.
    Track upstream.

All three pass on Metal once the bring-up PR ahead of this commit sets
the raygen pipeline's `setMaxCallStackDepth(MaxTraceRecursionDepth)` so
nested TraceRay actually unwinds (with the default of 1, the second
TraceRay was silently dropped and the recursion test produced 0x1
instead of 0x110).

Locally verified on the user's Linux box:
  - Vulkan via the native offloader: recursion + skip-CH PASS;
    callable PASSes after spirv-opt cleanup (XFAILs from raw DXC SPIR-V
    as documented above).
  - D3D12 via Wine + vkd3d-proton + cross-compiled offloader.exe: all
    three PASS.
And on macOS 15 / metal-irconverter 3.1.1 via the native offloader:
  - All three PASS (recursion + skip-CH + callable).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four small tests stacked on top of llvm#1275, each isolating one
shader-observable PSO raytracing surface. They follow the same shape as
the inline-RT batch already in llvm#1271 / llvm#1272 / llvm#1274 / llvm#1276 — one .test
file per behavior, single-purpose shader, exact buffer comparison.

  - `dispatch-rays-index.test` — 4x1x1 dispatch, raygen writes
    `DispatchRaysIndex().x` into `Output[index]`. Confirms the
    dispatch grid plumbs through to the per-lane system value with no
    BLAS / TLAS / hit groups in play (RT-pipeline-only, no AS binding).
  - `dispatch-rays-dimensions.test` — 2x3x1 dispatch, raygen packs the
    constant `DispatchRaysDimensions()` into one uint per lane.
    Confirms every lane sees the host-side `{W, H, D}` even when only
    one dimension > 1.
  - `miss-shader-index.test` — two miss shaders writing distinct
    sentinels (0xAA / 0xBB). 2-lane dispatch picks `MissShaderIndex` 0
    and 1 respectively; rays start far enough from the geometry that
    every ray misses. Verifies the SBT miss region's per-record
    routing.
  - `ray-contribution-to-hit-group-index.test` — two hit groups with
    distinct closest-hit shaders (0xA1 / 0xB2). 2-lane dispatch picks
    `RayContributionToHitGroupIndex` 0 and 1, every ray hits the same
    triangle. Verifies the SBT hit-group region's per-record routing.

The first two have no AS / Miss / HitGroup in their pipeline at all —
just a raygen + a UAV — which exercises the minimum viable RT pipeline
shape (one raygen group, zero-sized miss / hit / callable SBT regions).
The latter two reuse the single-triangle BLAS/TLAS from
`raygen-roundtrip.test`.

All four tests are `# REQUIRES: raytracing-pipeline` with `# XFAIL: Clang`
— Clang (`clang-dxc`) doesn't yet lower `[shader("…")]` entry points to
either DXIL libraries or SPIR-V. With the Metal RT bring-up rebased on
top, all four pass natively on Apple Silicon and Metal is dropped from
the XFAIL list.

Locally verified end-to-end on the user's Linux box:
  - Vulkan via the native offloader against an NVIDIA RTX 3060:
    all four tests PASS.
  - D3D12 via Wine + vkd3d-proton + the cross-compiled offloader.exe
    on the same GPU: all four tests PASS.
And on macOS 15 / metal-irconverter 3.1.1:
  - Metal via the native offloader: all four tests PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EmilioLaiso force-pushed the rt-pso-tests-dispatch branch from c2f41d4 to 365c67e on June 25, 2026 07:18
EmilioLaiso merged commit 1f3b1e2 into llvm:main on Jun 26, 2026
21 of 27 checks passed
EmilioLaiso pushed a commit to Traverse-Research/offload-test-suite that referenced this pull request Jun 26, 2026
…ldRay

3 participants