Include new phases for extended mode. #93
base: main
Conversation
Basic support for GPUDirect.
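For readers unfamiliar with GPUDirect Storage, here is a minimal sketch of the kind of read path it enables, using NVIDIA's cuFile API. The mount path and transfer size are placeholders, error handling is trimmed, and this illustrates the concept rather than the code this PR adds:

```c
/*
 * A minimal sketch of a GPUDirect Storage read using NVIDIA's cuFile API:
 * data moves from the file directly into GPU memory without a host bounce
 * buffer. The path and size are placeholders; error handling is trimmed.
 * Build (roughly): gcc gds_read.c -I$CUDA_HOME/include -L$CUDA_HOME/lib64 \
 *                      -lcufile -lcudart
 */
#define _GNU_SOURCE            /* for O_DIRECT */
#include <cuda_runtime.h>
#include <cufile.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const size_t size = 1 << 20;                 /* 1 MiB, arbitrary */
    int fd = open("/mnt/lustre/testfile", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    cuFileDriverOpen();                          /* bring up the GDS driver */

    CUfileDescr_t descr = { 0 };
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
    CUfileHandle_t handle;
    cuFileHandleRegister(&handle, &descr);

    void *devbuf;
    cudaMalloc(&devbuf, size);                   /* destination in GPU memory */
    cuFileBufRegister(devbuf, size, 0);          /* register buffer for DMA */

    /* Read straight into the GPU buffer: no host staging copy. */
    ssize_t got = cuFileRead(handle, devbuf, size, 0 /* file offset */,
                             0 /* buffer offset */);
    printf("read %zd bytes into GPU memory\n", got);

    cuFileBufDeregister(devbuf);
    cuFileHandleDeregister(handle);
    cudaFree(devbuf);
    cuFileDriverClose();
    close(fd);
    return 0;
}
```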
What is the ISC discussion? Please elaborate in the PR.
seattleplus left a comment
I think we should be cutting extended mode down to just those phases that are in line to be adopted by the full benchmark...which means cutting most of the phases except for random read 4KB at the moment.
Trying to turn io500 into a procurement tool for customers doesn't fit our mission. Procurements operate on a different timeline than io500: io500 operates over long-term trends, choosing phases that have long-term and lasting implications for storage systems, while procurements come up with a variety of short-term requirements for each and every procurement, requiring changes and updates for whatever the latest app might be...and hence potentially a more flexible and fast-changing benchmark.
Dean, also, this PR includes the start of the find-hard phase. On the one hand I wonder if we may still need to make it "harder" in some way, but on the other hand actually listing all the files in the directory into output files may be a challenge as well, and we need to see how this plays out on real systems. I'm OK with the "extended mode" being more experimental, with some of the experiments graduating to become scored IO500 phases and others needing rework before that happens (as was done with ior-rnd4k-read-easy). If we don't have anything in the extended mode for users to test, then we won't make much progress in advancing new phases.
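To make the find-hard idea concrete, here is a minimal sketch of the pattern described above (walking the namespace and writing every pathname into an output file). It is illustrative only, not the actual find-hard implementation:

```c
/*
 * An illustrative sketch of the pattern described above -- walk the
 * benchmark namespace and write every regular file's pathname into an
 * output file -- so that the listing itself generates metadata and
 * write load. Not the actual find-hard implementation.
 */
#define _XOPEN_SOURCE 500      /* for nftw(3) */
#include <ftw.h>
#include <stdio.h>

static FILE *out;

static int visit(const char *path, const struct stat *sb,
                 int type, struct FTW *ftwbuf)
{
    (void) ftwbuf;
    if (type == FTW_F)         /* regular files only */
        fprintf(out, "%s %lld\n", path, (long long) sb->st_size);
    return 0;                  /* 0 = keep walking */
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <dir> <outfile>\n", argv[0]);
        return 1;
    }
    out = fopen(argv[2], "w");
    if (!out) { perror("fopen"); return 1; }
    /* FTW_PHYS: do not follow symlinks; 64 = fd budget for the walk */
    if (nftw(argv[1], visit, 64, FTW_PHYS) != 0)
        perror("nftw");
    fclose(out);
    return 0;
}
```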
I didn't see this as RFP-specific so much as extending IO500 to be a better general benchmarking tool beyond just the competition. We often need more realistic workload-simulation capabilities for research support, and this, at least on the surface, seems to be a step in that direction. I'd have to look at some research scenarios and see how it helps. One example would be bandwidth reservations or QoS contracts; both of these have had weak testing because the benchmarks don't support these kinds of usage patterns. If it proves useful for research papers to represent particular kinds of interference patterns, then it may be a good candidate for future inclusion in the scoring. I would not make this functionality a requirement for any list release, and I would require that it is isolated sufficiently that it does not interfere. I believe both of those conditions are currently met.
Let's at least align on what we see as the vision for the 'extended mode' and how users can use it. Right now it is a collection of more or less random phases that have not been reviewed by the steering committee (at least not the current one), and it lacks an easy-to-use method for a) understanding what each extended phase is and b) knowing how/when the phases should be used and what type of information they provide. As far as I can tell, they are only useful to the people who wrote them, since they are the only ones who understand their usefulness. This patch just continues the current state of adding a more or less random set of I/O workloads whose purpose and origin no one really understands. So as it stands, I'm against using extended mode as a 'dumping ground' for people's I/O workload whims without any review: it is not in line with the overall mission of the io500, due to the lack of consensus on the usefulness of the patterns, and amounts to using io500 as a convenient platform on which to externalize narrow-use workloads without having to write another framework. As a committee we should review a proposal on what find-hard entails...do we have a proposal document?
I am not part of the committee, but maybe the following will help you in some way. Even if it is not the aim of io500 to be used for RFPs, I am often very happy when there are RFPs that make use of io500 (including extended-mode tests). Unfortunately, it is often the case that ior, mdtest, or mdworkbench are requested in RFPs with "random" self-selected parameters that sometimes don't make much sense and may not even measure what the RFP creator thinks. Many sites also like to rely on existing, well-defined tests and are hesitant to come up with their own; especially smaller sites without dedicated storage experts are very happy that io500 (including extended mode) exists. However, the standard-mode tests are not enough for some sites, so they occasionally also use some extended tests. I therefore think it's pretty good that these experimental tests are available in extended mode. The concurrent test itself is also a pretty good stability test for a storage system.

However, I'm not quite sure how useful it is to calculate the SCOREX value in extended mode. Since the tests can change considerably and further tests can be added, this value should be compared with caution. It should be made clear in the output that this is an experimental score whose definition can change between io500 releases; this is currently not really visible to the normal user.

The descriptions of the individual phases at https://io500.org/about could also be a little more detailed (the new 4k random read phase is still missing there, for example). I think every phase (write as well as read, stat, etc.) should be described in detail there.
Markus, we wrote up a more thorough description of the existing phases as part of the ior-rnd4k-read-easy addition; it would make sense to add those descriptions to the "about" page, thanks for the suggestion. Dean, I agree it would make sense for Julian to write up a description of what the new concurrent phase adds. Julian, while I did ask to get the bug fixes squashed into the original patches that introduced them, I think it is too much to squash everything into a single patch. IMHO, it would make sense to keep find-hard in one patch, concurrent in a second patch, the GPU I/O in another patch, and any unrelated cleanups in a separate patch. That makes it much easier to see what is going on in each patch.
Thanks Markus for your detailed thoughts; really insightful. I also like the fact that io500 is now sometimes used for RFPs. You call out two key aspects of the official IO500 phases that make them useful for such situations.

As you rightly call out, neither of these is true for the extended phases, so I really don't see how they are useful to anyone today other than the few users who helped create the phases themselves. Even if we did open up the extended phases to being something more than just benchmarks on track for io500 inclusion, we would need a lot more rigor around the process before organizations would rely on it for something as important as an RFP. So let's document and align on the find patch before it is included (it sounds like that is in process). And let's align on a process (documented similarly to the official phases) for how we would add/remove extended phases, and hence on the purpose of the extended phases, before any patches are considered for acceptance.
I am also not part of the committee, so I'm only adding my thoughts here; if there's a mailing list where this is being discussed, let me know and I will join. Note that I submitted a result that scored very high on the ISC25 10-node challenge list, so I have been working a lot with io500 lately, and this topic has been on my mind for a while. I'm very much in agreement with Markus and a few others on most of what's already been said, so I won't rehash it too much, but I'd like to add the following.

I think having a mixed-use / generalized test as part of the extended scope, with committee (or maybe community?) consensus, would help demonstrate the general strengths and weaknesses of storage platforms beyond the current set of tests. A lot of the current tests exercise individual characteristics of a storage system somewhat in a vacuum, so I don't know if they always highlight steady-state production use. I don't have the numbers on this, but I suspect a large majority of production systems (that aren't doing DOE checkpoint-restart and/or defensive I/O) simultaneously run majority sequential reads, a smattering of random-read IOPS (open, read, close on small files), stat()/getattr() calls, and some small amount of writes. Maybe a system dedicated to AI/ML would be more dominated by smaller random I/O driven via mmap/pytorch/etc.

The reason I think a more generalized "a little of everything" extended test (minus very intense sequential-write throughput or random-write IOPS) would be useful is that the architecture of a storage system can determine how well mixed-use I/O is handled. For example, it is possible to co-locate Lustre OSTs and MDTs on the exact same server, which means that server shares its resources between the data and metadata services. If you're doing (for example) 1M getattrs in such a setup from your clients, and a very heavy read-throughput workload starts running against the system, your metadata IOPS drop substantially. I had an early Lustre system from a few years ago with this architecture, and the effect was very reproducible.

I won't get into how a prototypical concurrent phase is constructed, or what to do long term, since that's all above my pay grade, but I'm +1 for at least adding it to the extended tests. -Steve
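As a rough illustration of the interference effect Steve describes, the following sketch streams sequential reads from a large file while timing stat() calls on the same filesystem. File paths and iteration counts are arbitrary placeholders, and a real experiment would stat many distinct files from many clients to defeat client-side caching:

```c
/*
 * A rough sketch of the interference effect described above: one thread
 * streams sequential reads from a large file while the main thread times
 * stat() calls on the same filesystem. Paths and counts are placeholders;
 * a real experiment would stat many distinct files from many clients to
 * defeat client-side caching. Build with: gcc mix.c -pthread -o mix
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

#define READ_BUF   (4UL * 1024 * 1024)   /* 4 MiB sequential read buffer */
#define STAT_ITERS 100000L

static volatile int reads_done = 0;

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Stream the big file start to finish to keep the data path busy. */
static void *reader(void *arg)
{
    char *buf = malloc(READ_BUF);
    int fd = open((char *) arg, O_RDONLY);
    if (fd < 0 || !buf) { perror("reader"); exit(1); }
    while (read(fd, buf, READ_BUF) > 0)
        ;
    close(fd);
    free(buf);
    reads_done = 1;
    return NULL;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <big-file> <file-to-stat>\n", argv[0]);
        return 1;
    }
    pthread_t t;
    pthread_create(&t, NULL, reader, argv[1]);

    /* Time getattr-style calls while the reader competes for the servers. */
    struct stat sb;
    double t0 = now();
    long i;
    for (i = 0; i < STAT_ITERS && !reads_done; i++)
        stat(argv[2], &sb);
    double elapsed = now() - t0;
    printf("%ld stats in %.2f s -> %.0f getattr/s under read load\n",
           i, elapsed, elapsed > 0 ? i / elapsed : 0.0);

    pthread_join(t, NULL);
    return 0;
}
```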