
Conversation

@BoyuanFeng (Contributor) commented on Aug 20, 2025

This PR updates the input shapes for the rope benchmark. In LLMs, the head dimension D is usually small and fixed (e.g., 128), while the batch size B and sequence length S vary across workloads.

Shape: [B, H, S, D]

q: [B, 32, S, 128]
kv: [B, 32, S, 128]

Command: python3 run.py --op rope --metrics cuda_time
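
For illustration, here is a minimal sketch of how inputs with these shapes could be generated for such a sweep. The `make_qkv` helper, dtype, and device choices are assumptions for the example, not the benchmark's actual code:

```python
import torch

# Fixed dimensions matching the shapes above: H (num heads) and D (head dim)
# stay constant, while batch size B and sequence length S vary per workload.
H, D = 32, 128

def make_qkv(B: int, S: int, dtype=torch.bfloat16, device="cuda"):
    """Hypothetical input generator: returns q and kv with shape [B, H, S, D]."""
    q = torch.randn(B, H, S, D, dtype=dtype, device=device)
    kv = torch.randn(B, H, S, D, dtype=dtype, device=device)
    return q, kv

# Example: sweep over varying (B, S) pairs while H and D stay fixed.
for B, S in [(1, 2048), (4, 4096), (8, 8192)]:
    q, kv = make_qkv(B, S)
```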
[screenshot: cuda_time benchmark results]

@BoyuanFeng marked this pull request as draft on August 20, 2025, 22:57
@meta-cla bot added the cla signed label on Aug 20, 2025
@xuzhao9 (Contributor) commented on Aug 21, 2025

The fp8_gemm CI failure should be unrelated; it looks like an Inductor issue.
