
Conversation

@gautschimi
Contributor

This commit adds the option to increase the FIFO depth in the xbar to values > 1 in order to support multiple outstanding transactions.

Why this is beneficial:
The Ibex instruction cache issues two 32-bit requests to the flash controller. Inside xbar_main, a pipeline register is inserted to break the critical path to the flash. This pipeline register uses a FIFO of depth 1 for req and rsp data, which effectively inserts a bubble after each request and response because the FIFO is immediately full. Ibex and flash_ctrl can both handle up to two outstanding transactions.

The performance impact of this bubble is low because the instruction cache reads the critical word first and thereby hides most of the additional latency introduced by the depth-1 FIFO. Nonetheless, in phases with many cache misses, performance can be improved at the cost of one additional FIFO entry.
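As a rough illustration, here is a toy cycle model in Python (my own sketch, not the actual prim_fifo RTL) of a FIFO without pass-through whose ready signal is derived from the registered occupancy: with depth 1, every granted request is followed by a bubble; with depth 2, requests are granted back to back once the pipeline is primed.

```python
# Toy model: push/pop decisions use the occupancy registered at the start of
# the cycle, so a simultaneous pop does not free space for a push in the same
# cycle (assumed simplification of the real FIFO behaviour).
def granted_requests(depth, cycles=8):
    count = 0
    granted = []
    for _ in range(cycles):
        can_push = count < depth   # upstream request is granted only if not full
        can_pop = count > 0        # downstream consumes one entry per cycle
        granted.append(can_push)
        count += (1 if can_push else 0) - (1 if can_pop else 0)
    return granted

print(granted_requests(1))  # [True, False, True, False, ...] -> a bubble after every request
print(granted_requests(2))  # [True, True, True, True, ...]   -> back-to-back requests
```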

Signed-off-by: Michael Gautschi <[email protected]>
gautschimi force-pushed the xbar_fifo_depth_support branch from 2972f23 to 40202d2 on November 28, 2025 10:57
@gautschimi
Contributor Author

master:
[waveform screenshot]

Requests are granted every second cycle, with one bubble after each request and response.

fifo_depth=2:
[waveform screenshot]

Requests are granted in two consecutive cycles to fetch a full cache line; there are no more bubbles between cache-line requests and responses.

gautschimi marked this pull request as ready for review on November 28, 2025 11:19
@vogelpi
Contributor


Thanks, this looks good and it seems like a nice improvement!

Two questions:

  • Could you observe any performance improvement e.g. for CoreMark?
  • Shall we also enable this for Darjeeling (executes from a big SRAM but also has the I-Cache present)?

rsp_fifo_pass = True

# FIFO depth option. default is 1
# If pipeline is false or req/rsp_fifo_pass are true, this field has no meaning
Contributor


Why does the depth have no meaning if either of the pass options is true? I guess it's just a limitation of the current implementation, right?

Contributor Author


  • If req/rsp_fifo_pass is set, requests can pass through immediately. I guess in this case we could also make the depth variable; on the other hand, the requests and responses can be buffered by the receiving side if pass-through is set.
  • If pipeline is false, the depth is set to 0 (see the sketch below).
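As a rough sketch of how the effective depth could be resolved from these options (assumed logic that mirrors the explanation above, not the actual tlgen code):

```python
def effective_fifo_depth(pipeline: bool, fifo_pass: bool, fifo_depth: int) -> int:
    if not pipeline:
        return 0          # no pipeline register at all: depth is forced to 0
    if fifo_pass:
        return 1          # pass-through: requests forward combinationally and a
                          # single entry only buffers on back-pressure, so the
                          # configured depth is ignored (assumed default of 1)
    return fifo_depth     # registered FIFO: the configured depth takes effect
```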

@gautschimi
Contributor Author

Thanks, this looks good and it seems like a nice improvement!

Two questions:

* Could you observe any performance improvement e.g. for CoreMark?

* Shall we also enable this for Darjeeling (executes from a big SRAM but also has the I-Cache present)?
1. I checked CoreMark, but the performance benefit is very small (<1%), which surprised me a bit. Looking into the cache in more detail explains it: the cache fetches the critical instruction first and immediately passes it to the core, so the miss penalty itself does not improve. With this change, the second instruction arrives immediately after the first one; before, it arrived two cycles later, which can cause one additional stall. CoreMark is a rather small benchmark, though, and there are not enough cache misses for this to make a noticeable difference.

2. As far as I can see, there is no pipeline in the Darjeeling xbar. However, there is a pipeline in the core wrapper:
   https://github.com/lowRISC/opentitan/blob/master/hw/ip_templates/rv_core_ibex/rtl/rv_core_ibex.sv.tpl#L158-L163

Earlgrey sets this parameter to 0, while Darjeeling sets it to 1 (which also sets the FIFO depth to 2, similar to this PR).

I think these are the main differences:

  • Darjeeling: the pipeline register in the core wrapper is enabled for instruction and data requests
  • Darjeeling: all requests (flash, SRAM, ROM, peripherals) are pipelined
  • Earlgrey: the pipeline is only added for requests to the flash (instruction and data)

I'm not sure which is better; I think both can be justified.
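To summarize the placement differences listed above, here is an illustrative Python structure (the key names are my own; only the stated facts are taken from the list, not from the actual top-level configuration files):

```python
# Where the extra pipeline stage sits in each top, per the comparison above.
PIPELINE_PLACEMENT = {
    "darjeeling": {
        "core_wrapper_pipeline": True,   # pipeline register in the rv_core_ibex wrapper,
                                         # covering instruction and data requests
        "pipelined_targets": ["flash", "sram", "rom", "peripherals"],
        "xbar_pipeline": None,           # no extra pipeline stage in the xbar
    },
    "earlgrey": {
        "core_wrapper_pipeline": False,  # wrapper pipeline parameter set to 0
        "pipelined_targets": ["flash"],  # instruction and data requests to the flash
        "xbar_pipeline": "flash device port",
    },
}
```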
