block I/O tracing #196
I/O handles in Otf2xx: In OTF2 there is the concept of an I/O handle, which you need to create before you can write I/O operations and destroy afterwards, corresponding to the classical concept of opening and closing a file or a network connection. Block I/O, however, is "stateless", so there is no direct equivalent. Do we assign one handle to every block device, or one handle to every block?
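Below is a minimal sketch of the "one handle per block device" option. The `block_device` and `io_handle_ref` types are hypothetical stand-ins for whatever otf2xx/OTF2 uses for handle references; none of these names are the actual otf2xx API, they only illustrate where a per-device handle would be created and reused.

```cpp
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical stand-in for an OTF2/otf2xx I/O handle reference.
using io_handle_ref = std::uint64_t;

// Hypothetical block device key (e.g. derived from major/minor numbers).
struct block_device
{
    std::uint32_t major_num;
    std::uint32_t minor_num;

    bool operator<(const block_device& other) const
    {
        return std::tie(major_num, minor_num) < std::tie(other.major_num, other.minor_num);
    }
};

// One I/O handle per block device: created lazily the first time an event for
// that device is seen, kept for the whole measurement, and destroyed once at
// the end. This mirrors the open/close semantics OTF2 expects even though
// block I/O itself is stateless.
class block_io_handle_registry
{
public:
    io_handle_ref handle_for(const block_device& dev)
    {
        auto it = handles_.find(dev);
        if (it == handles_.end())
        {
            // In real code, this is where the handle creation would be
            // written to the trace via the writer API.
            it = handles_.emplace(dev, next_ref_++).first;
        }
        return it->second;
    }

private:
    std::map<block_device, io_handle_ref> handles_;
    io_handle_ref next_ref_ = 0;
};
```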
For the record:
To measure latency you have to match up the queue insert with the queue remove. However, the events do not have a simple id that allows us to match them up. I've now tested matching the events up in different ways (a sketch of the first two approaches follows this list):

1. Replay the kernel FIFO based on the events we have. This assumes that the queue for every block device is a FIFO. To match inserts to completes, simply replay the behaviour of the kernel FIFO in lo2s, based on the events and timestamps we have.
   Pro
   Con
2. Match based on sector number. This basically assumes that the sector that is written or read is unique and can thus be used as a key to match inserts with completes.
   Pro
   Con
3. Match based on the address of the
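Here is a rough sketch of the first two matching strategies, assuming hypothetical `insert_event` and `complete_event` structs (device id, sector, timestamp) decoded from the tracepoint data. This is not the lo2s implementation, just an illustration of the matching logic each option implies:

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <optional>
#include <utility>

// Hypothetical decoded event records; field names are illustrative only.
struct insert_event
{
    std::uint32_t dev;    // block device id
    std::uint64_t sector; // first sector of the request
    std::uint64_t time;   // timestamp of the queue insert
};

struct complete_event
{
    std::uint32_t dev;
    std::uint64_t sector;
    std::uint64_t time;   // timestamp of the completion
};

// Strategy 1: replay the kernel FIFO. Assumes requests on a given device
// complete in the order they were inserted, so the oldest pending insert on
// that device is matched with the next completion.
class fifo_matcher
{
public:
    void on_insert(const insert_event& e)
    {
        pending_[e.dev].push_back(e);
    }

    // Returns the latency (complete time - insert time) if a match was found.
    std::optional<std::uint64_t> on_complete(const complete_event& e)
    {
        auto& q = pending_[e.dev];
        if (q.empty())
            return std::nullopt; // completion without a recorded insert
        auto insert = q.front();
        q.pop_front();
        return e.time - insert.time;
    }

private:
    std::map<std::uint32_t, std::deque<insert_event>> pending_;
};

// Strategy 2: match on (device, sector). Assumes the sector of an in-flight
// request is unique per device at any point in time.
class sector_matcher
{
public:
    void on_insert(const insert_event& e)
    {
        pending_[{ e.dev, e.sector }] = e.time;
    }

    std::optional<std::uint64_t> on_complete(const complete_event& e)
    {
        auto it = pending_.find({ e.dev, e.sector });
        if (it == pending_.end())
            return std::nullopt;
        auto latency = e.time - it->second;
        pending_.erase(it);
        return latency;
    }

private:
    std::map<std::pair<std::uint32_t, std::uint64_t>, std::uint64_t> pending_;
};
```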
I'm separating this from #194 so we can have a nice high-level discussion there and get into the nitty-gritty details of the implementation here.
Event reading
By design, there is one perf event stream per CPU, which we read separately. This is problematic in this case, because one thread on one CPU can issue a block I/O request, while a completely different thread (usually a kernel thread), possibly on a completely different CPU, receives the completion event.
A small BPF Python hack shows that separate issue/complete CPUs aren't an edge case: the majority of events are issued and completed on different CPUs, so simply discarding those events isn't an option.
So instead, we probably need to cache the issue and complete events at measurement time and construct a coherent view from the local per-CPU observations later; a rough sketch of that follows.
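A minimal sketch of that two-phase approach, reusing the hypothetical `insert_event`/`complete_event` structs and the `fifo_matcher` from the sketch above: during measurement each per-CPU reader only appends to its local buffer, and the matching runs over the merged, time-sorted view afterwards. This is an assumption about how it could be structured, not the actual lo2s code.

```cpp
#include <algorithm>
#include <vector>

// Reuses the hypothetical insert_event / complete_event / fifo_matcher
// types from the matching sketch above.

// Phase 1: during measurement, each per-CPU reader only appends what it sees
// locally; no cross-CPU matching is attempted yet.
struct per_cpu_buffer
{
    std::vector<insert_event> inserts;
    std::vector<complete_event> completes;
};

// Phase 2: after measurement, merge all per-CPU buffers into global,
// time-sorted streams and run the matching over the merged view.
inline void merge_and_match(const std::vector<per_cpu_buffer>& cpus, fifo_matcher& matcher)
{
    std::vector<insert_event> all_inserts;
    std::vector<complete_event> all_completes;

    for (const auto& cpu : cpus)
    {
        all_inserts.insert(all_inserts.end(), cpu.inserts.begin(), cpu.inserts.end());
        all_completes.insert(all_completes.end(), cpu.completes.begin(), cpu.completes.end());
    }

    auto by_time = [](const auto& a, const auto& b) { return a.time < b.time; };
    std::sort(all_inserts.begin(), all_inserts.end(), by_time);
    std::sort(all_completes.begin(), all_completes.end(), by_time);

    // Feed the merged streams into the matcher in timestamp order. Feeding all
    // inserts before all completes is equivalent to a full interleaved replay
    // for the per-device FIFO matcher, since every insert precedes its own
    // completion in time.
    for (const auto& e : all_inserts)
        matcher.on_insert(e);
    for (const auto& e : all_completes)
        matcher.on_complete(e);
}
```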