
Conversation


@bkryu bkryu commented Oct 24, 2025

📌 Description

The current microbenchmark code does not provide instantiated block_tables to all backends. The omission has had no impact on correctness or performance, because block tables are instantiated linearly when not provided, but it would manifest as output mismatches if the block table were shuffled.

This PR simply calls the FlashInfer APIs in their intended way.

No changes to library code
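To make the failure mode concrete, here is a small self-contained NumPy sketch of a paged KV-cache lookup through a block table. The names (`kv_cache`, `gather_kv`) and shapes are illustrative only, not FlashInfer code: a linearly instantiated default block table happens to read pages in the right order, but a shuffled allocation does not.

```python
import numpy as np

# Paged KV cache: 4 pages of 2 tokens each, filled with distinct values.
num_pages, page_size, head_dim = 4, 2, 8
kv_cache = np.arange(num_pages * page_size * head_dim, dtype=np.float32).reshape(
    num_pages, page_size, head_dim
)

def gather_kv(block_table):
    # Gather one sequence's KV entries in logical order via its block table.
    return kv_cache[block_table].reshape(-1, head_dim)

linear = np.arange(num_pages)      # implicit default when no block table is passed
shuffled = np.array([2, 0, 3, 1])  # pages allocated out of order

# With linearly allocated pages, the implicit default happens to be correct...
assert np.array_equal(gather_kv(linear), kv_cache.reshape(-1, head_dim))
# ...but once pages are shuffled, the two gathers disagree.
assert not np.array_equal(gather_kv(shuffled), gather_kv(linear))
```

This is why the omission was invisible in the benchmark: the results only diverge once pages are allocated out of order, at which point the actual block table must be passed explicitly.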

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • Refactor
    • Enhanced consistency in attention computation by aligning page-table parameter handling across different inference backend implementations for improved paged key-value cache operations.

@bkryu bkryu marked this pull request as ready for review October 24, 2025 17:26

coderabbitai bot commented Oct 24, 2025

Walkthrough

The pull request extends paged attention wrappers to accept and propagate block table parameters throughout the execution pipeline. BatchDecodeWithPagedKVCacheWrapper and BatchPrefillWithPagedKVCacheWrapper now receive block_tables in their planning stages, while BatchMLAPagedAttentionWrapper passes page_table during kernel execution, unifying page table handling across attention backends.

Changes

  • Paged Attention Wrapper Parameter Propagation — benchmarks/routines/attention.py: Added the block_tables parameter to the BatchDecodeWithPagedKVCacheWrapper.plan() and BatchPrefillWithPagedKVCacheWrapper.plan() calls; extended the BatchMLAPagedAttentionWrapper.run() invocation to pass page_table=block_tables for kernel execution.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested reviewers

  • yzh119
  • nvmbreughe
  • PerkzZheng
  • aleozlx
  • joker-eph

Poem

🐰 Hop, hop, through the pages we go,
Block tables flow where the attention winds blow,
Paged kernels aligned, from decode to prefill,
A rabbit's delight—unified thrills!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 75.00%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
  • Title Check — ❓ Inconclusive: The title "fix: Make attention microbenchmark correctly use page table in" is clearly related to the main changeset, which adds block_tables parameters to multiple wrapper methods to fix the attention microbenchmark's page table usage. However, the title is incomplete — it ends awkwardly with "in", making it grammatically incorrect and ambiguous as written. Resolution: complete the title, e.g. "fix: Make attention microbenchmark correctly use page table in all backends" or "fix: Pass block_tables to attention microbenchmark wrappers".
✅ Passed checks (1 passed)
  • Description Check — ✅ Passed: The pull request description follows the provided template with all major sections present and appropriately completed. The 📌 Description section clearly explains what the PR does and why it was needed, noting that the microbenchmark was not providing instantiated block_tables and that this PR corrects the API usage without modifying library code. The 🔍 Related Issues section is present (though empty, which is acceptable when no related issues exist). The 🚀 Pull Request Checklist is complete, and the optional Reviewer Notes section is included.

@bkryu bkryu changed the title fix: Make attention microbenchmark correctly use page table in fix: Make attention microbenchmark correctly use page table Oct 24, 2025
@yzh119 yzh119 merged commit a565294 into flashinfer-ai:main Oct 24, 2025
4 checks passed