Asynchronous I/O support & a new Data rearranger #685
Merged
jayeshkrishna merged 194 commits into master on May 7, 2026
Adding a multithreaded queue and a thread pool. Also adding util
functions to use the new thread pool for asynchronous tasks.
These new capabilities are not used by the library right now
(although some interfaces have been modified/added to support
future use).
Adding a thread pool for asynchronously processing async_op tasks.
* Allows queuing async tasks of pio_async_op_t type
* Leverages the Multi-threaded queue, PIO_mtq, to keep track
of async ops
* The instances of the thread pool are managed via a thread pool
manager.
* Currently the number of async threads is set to 1
* Also adding a simple unit test for the thread pool.
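The thread pool described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the class name `AsyncThreadPool` is hypothetical, `std::function<void()>` stands in for `pio_async_op_t`, and a plain `std::queue` guarded by a mutex stands in for `PIO_mtq`. The default of one worker thread mirrors the current setting mentioned in the commit message.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: std::function<void()> stands in for pio_async_op_t.
class AsyncThreadPool {
public:
  explicit AsyncThreadPool(std::size_t nthreads = 1) {
    for (std::size_t i = 0; i < nthreads; ++i)
      workers_.emplace_back([this] { run(); });
  }

  // Lets the workers drain all queued tasks, then joins them.
  ~AsyncThreadPool() {
    {
      std::lock_guard<std::mutex> lk(mtx_);
      done_ = true;
    }
    cv_.notify_all();
    for (auto &t : workers_) t.join();
  }

  AsyncThreadPool(const AsyncThreadPool &) = delete;
  AsyncThreadPool &operator=(const AsyncThreadPool &) = delete;

  // Queue a task for asynchronous execution.
  void enqueue(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lk(mtx_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lk(mtx_);
        cv_.wait(lk, [this] { return done_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // done_ is set and nothing is left
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the lock so other threads can enqueue
    }
  }

  std::mutex mtx_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool done_ = false;
  std::vector<std::thread> workers_;  // declared last: started after the state above exists
};
```

In this sketch the destructor waits for all queued tasks to finish, so results written by a task are safe to read once the pool goes out of scope.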
Adding a multi-threaded queue that will be eventually used for
queueing asynchronous I/O tasks.
* Using C++11 threading framework and synchronization primitives
to implement the multi-threaded queue.
- Multiple threads can access the queue without any explicit
synchronization (all required synchronization is handled by
the queue)
- Functions to enqueue/dequeue data in queue
- Function to signal threads waiting on dequeuing data from
the queue is also available (SIG_STOP : stop waiting on queue,
SIG_COMPLETE : complete dequeuing existing items and exit
when the queue is empty)
- Function to provide size (instantaneous) of the queue
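A queue with the behavior listed above could look roughly like the sketch below. It is an illustration under assumptions, not the actual `PIO_mtq` code: the class name `MtQueue`, the `Signal` enum, and the method names are hypothetical, and only the STOP/COMPLETE semantics are taken from the commit message.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>

// Hypothetical sketch of a multi-threaded queue; not the library's PIO_mtq.
template <typename T>
class MtQueue {
public:
  enum class Signal { NONE, STOP, COMPLETE };

  void enqueue(const T &val) {
    {
      std::lock_guard<std::mutex> lk(mtx_);
      q_.push(val);
    }
    cv_.notify_one();
  }

  // Blocks until an item is available or a signal ends the wait.
  // Returns true and sets `val` if an item was dequeued, false otherwise.
  bool dequeue(T &val) {
    std::unique_lock<std::mutex> lk(mtx_);
    cv_.wait(lk, [this] { return !q_.empty() || sig_ != Signal::NONE; });
    // STOP: give up immediately. COMPLETE: give up only once the queue is empty.
    if (sig_ == Signal::STOP || q_.empty()) return false;
    val = q_.front();
    q_.pop();
    return true;
  }

  // Wake up all waiting threads with the given signal.
  void signal(Signal sig) {
    {
      std::lock_guard<std::mutex> lk(mtx_);
      sig_ = sig;
    }
    cv_.notify_all();
  }

  // Instantaneous size; may be stale as soon as it is returned.
  std::size_t size() const {
    std::lock_guard<std::mutex> lk(mtx_);
    return q_.size();
  }

private:
  mutable std::mutex mtx_;
  std::condition_variable cv_;
  std::queue<T> q_;
  Signal sig_ = Signal::NONE;
};
```

Callers never take a lock themselves; all synchronization lives inside the queue, which matches the "no explicit synchronization" point in the list.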
Adding configure option PIO_USE_ASYNC_WR_THREAD to enable/disable
asynchronous write threads
Implementing the necessary framework to add asynchronous
operations to a file or I/O system.
Also adding unit tests for multithreaded queue & thread pool
Moving all C unit test sources from C to CXX (foo.c -> foo.cpp). Typecasting returns from malloc()/calloc() appropriately (required for C++) & fixing the fillvalue arg to PIOc_write_darray_multi() in the tests
Moving internal types from pio.h, which is exposed to the user, to pio_types.hpp, that is internal to the library
Replacing custom implementation of singly linked lists with STL maps for storing global collection of files/iosystems/iodescs. Moving to maps should allow for fast searches in the collection without resorting to custom hacks (e.g. caching the pointer to the last node).
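As a rough illustration of the idea (not the library's code), a map-backed registry keyed by id might look like this. `FileEntry`, `FileRegistry`, and the starting id are hypothetical names and values chosen for the sketch.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the library's file descriptor struct.
struct FileEntry {
  int id;
  std::string name;
};

// Hypothetical registry: a std::map keyed by id replaces a hand-rolled
// singly linked list, giving O(log n) lookup without pointer caching.
class FileRegistry {
public:
  int add(const std::string &name) {
    const int id = next_id_++;
    files_[id] = std::unique_ptr<FileEntry>(new FileEntry{id, name});
    return id;
  }

  // Returns nullptr if the id is unknown.
  FileEntry *find(int id) const {
    auto it = files_.find(id);
    return (it == files_.end()) ? nullptr : it->second.get();
  }

  // Returns true if an entry was removed.
  bool remove(int id) { return files_.erase(id) == 1; }

  std::size_t size() const { return files_.size(); }

private:
  std::map<int, std::unique_ptr<FileEntry>> files_;
  int next_id_ = 16;  // arbitrary starting id for the sketch
};
```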
Adding a missing free for file struct allocated in the unit test
Explicitly deleting state files in the global file list after all I/O systems are finalized.
Adding a GPTL wrapper class to automatically stop the timer during func return.
Adding util functions to perform custom gather/scatter/alltoall. All of the util functions include handshake support. The alltoall functions also include flow control support. The alltoall funcs were refactored from the old implementation of alltoall (pio_swapm). The gather functions support receiving different types from the sending procs and the scatter functions support sending different types to the receiving procs.
Adding some debug util functions to print 1d vectors.
Adding a new rearranger, PIO_REARR_CONTIG, that performs data rearrangement in two steps:
1) Data aggregation : Data aggregation from compute processes to I/O (aggregating) processes
2) Data rearrangement : Data rearrangement between I/O (aggregating) processes such that each I/O process has a contiguous data chunk
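The two steps can be illustrated with a serial simulation that uses plain vectors instead of MPI. Everything below is an assumption made for the sketch: the function names, the round-robin assignment of compute processes to aggregators, and the even block split of the global range are illustrative choices, not details taken from PIO_REARR_CONTIG.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Elem = std::pair<std::size_t, double>;  // (global offset, value)

// Step 1 (aggregation): each compute process hands all its elements to one
// aggregator. Assumption: process p is assigned to aggregator p % naggs.
std::vector<std::vector<Elem>> aggregate(
    const std::vector<std::vector<Elem>> &compute_data, std::size_t naggs) {
  assert(naggs > 0);
  std::vector<std::vector<Elem>> agg(naggs);
  for (std::size_t p = 0; p < compute_data.size(); ++p) {
    auto &dst = agg[p % naggs];
    dst.insert(dst.end(), compute_data[p].begin(), compute_data[p].end());
  }
  return agg;
}

// Step 2 (rearrangement): aggregators exchange elements so that aggregator k
// ends up owning the contiguous range [k*chunk, (k+1)*chunk).
// Assumption: the global range is split into equal blocks of ceil(gsize/naggs).
std::vector<std::vector<double>> rearrange(
    const std::vector<std::vector<Elem>> &agg, std::size_t gsize) {
  const std::size_t naggs = agg.size();
  assert(naggs > 0 && gsize > 0);
  const std::size_t chunk = (gsize + naggs - 1) / naggs;
  std::vector<std::vector<double>> out(naggs);
  for (std::size_t k = 0; k < naggs; ++k) {
    const std::size_t start = k * chunk;
    const std::size_t len = (start < gsize) ? std::min(chunk, gsize - start) : 0;
    out[k].assign(len, 0.0);
  }
  for (const auto &buf : agg) {
    for (const auto &e : buf) {
      assert(e.first < gsize);
      out[e.first / chunk][e.first % chunk] = e.second;  // owner, local index
    }
  }
  return out;
}
```

With four compute processes holding interleaved offsets and two aggregators, step 1 leaves each aggregator with a non-contiguous mix of offsets, and step 2 leaves aggregator 0 with offsets 0..3 and aggregator 1 with offsets 4..7.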
Including PIO_REARR_CONTIG rearranger in the C and Fortran libraries
Adding unit tests for the new rearranger, PIO_REARR_CONTIG, and the new utils added for gather/scatter/alltoall
Adding a simple wrapper class for SPIO timers to start and stop timers
Using GPTL and SPIO timer wrapper classes to simplify stopping of timers (on function exits). This change simplifies the code by removing a lot of explicit stops of timers
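The pattern behind these wrapper classes is RAII: the timer starts in the constructor and stops in the destructor, so every return path stops it. The sketch below is illustrative only. `ScopedTimer`, `timer_start()`, and `timer_stop()` are hypothetical stand-ins that log events so the behavior can be inspected; they are not the GPTL or SPIO timer APIs.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for a timer library's start/stop calls.
// They only log events here so the order of calls can be checked.
static std::vector<std::string> timer_log;
inline void timer_start(const std::string &name) { timer_log.push_back("start:" + name); }
inline void timer_stop(const std::string &name) { timer_log.push_back("stop:" + name); }

// Starts the timer on construction and stops it on destruction.
class ScopedTimer {
public:
  explicit ScopedTimer(const std::string &name) : name_(name) { timer_start(name_); }
  ~ScopedTimer() { timer_stop(name_); }
  ScopedTimer(const ScopedTimer &) = delete;
  ScopedTimer &operator=(const ScopedTimer &) = delete;

private:
  std::string name_;
};

// Example function with several exits and no explicit stop calls.
int write_data(int nelems) {
  ScopedTimer timer("write_data");
  if (nelems < 0) return -1;  // error path: timer is still stopped
  if (nelems == 0) return 0;  // nothing to do: timer is still stopped
  return nelems;
}
```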
Reformatting the PIO_initdecomp_impl() function (starting braces on same line, no spaces between keyword and starting brace, 2-space indentation)
Reformatting the malloc_iodesc() function (starting braces on same line, no spaces between keyword and starting brace, 2-space indentation)
Moving memory allocation of internal arrays (to store map/dimlens) in the I/O descriptor to the malloc_iodesc() function
Adding the logic to the initdecomp function to support 0/1 based decomposition maps. Separating the user interfaces (and simplifying them) into thin wrapper functions that eventually call the initdecomp() function to create I/O decompositions for 0 & 1 based decomps. Also using C++ algos to copy the map (and transform it as needed to support 0/1 based maps for the different rearrangers) and dimension lengths.
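A copy-and-transform step of this kind can be sketched with `std::transform`. The function names below are hypothetical and the sketch only shifts offsets between bases; it does not model how the library treats special values in the map.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy a decomposition map while shifting every offset
// from one base (0 or 1) to another.
std::vector<std::int64_t> rebase_map(const std::vector<std::int64_t> &map,
                                     int from_base, int to_base) {
  const std::int64_t shift = to_base - from_base;
  std::vector<std::int64_t> out(map.size());
  std::transform(map.begin(), map.end(), out.begin(),
                 [shift](std::int64_t off) { return off + shift; });
  return out;
}

// Hypothetical thin wrappers, one per user-facing base, that funnel into the
// same routine (here producing a 0-based internal map).
std::vector<std::int64_t> internal_map_from_bc0(const std::vector<std::int64_t> &map) {
  return rebase_map(map, 0, 0);
}
std::vector<std::int64_t> internal_map_from_bc1(const std::vector<std::int64_t> &map) {
  return rebase_map(map, 1, 0);
}
```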
Using the GPTL and SPIO ltimer wrapper classes to simplify timers (stop on return from function) in the write darray functions
Reformatting the write darray functions (starting braces on same line, no spaces between keyword and starting brace, 2-space indentation)
Adding functions to expose I/O decomposition and rearrangement information in the contig rearranger.
* Check if rearranger is initialized
* Get I/O decomposition map info
* Get rearranger info (buffer size, range)
Using GPTL and SPIO ltimer wrappers, to simplify stopping the timer on function return, in the read darray function
Reformatting the read darray function (starting braces on same line, no spaces between keyword and starting brace, 2-space indentation)
Adding support for writing data and reading it back using the PIO_REARR_CONTIG rearranger.
Adding support for fillvalues in the new contig rearranger. This commit includes debug statements and needs more fixes to support all fillvalue tests.
Fixing issue with sending/receiving local buffer in gatherw
Changing the order of the I/O rearranger types in the testing framework so that PIO_REARR_CONTIG is tested first
Adding some PnetCDF tests to write 1d, 2d, 3d vars
Making sure that we chunk/partition the contiguous offset range to contiguous dim ranges before writing out multi dim vars
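One simple way to partition a contiguous offset range into per-dimension (start, count) ranges is to cut it at the boundaries of the fastest-varying dimension. The sketch below does exactly that for row-major data. It is a simplification for illustration (it never merges full rows into larger blocks), and the names `Slab` and `split_range` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A rectangular region of a multi-dimensional variable.
struct Slab {
  std::vector<std::size_t> start;
  std::vector<std::size_t> count;
};

// Split the linear (row-major) range [off, off + len) of a variable with the
// given dimension lengths into slabs that each stay within one row of the
// fastest-varying (last) dimension.
std::vector<Slab> split_range(const std::vector<std::size_t> &dims,
                              std::size_t off, std::size_t len) {
  const std::size_t ndims = dims.size();
  assert(ndims > 0);
  std::size_t total = 1;
  for (std::size_t d = 0; d < ndims; ++d) total *= dims[d];
  assert(off + len <= total);

  const std::size_t rowlen = dims[ndims - 1];
  std::vector<Slab> slabs;
  while (len > 0) {
    Slab s;
    s.start.assign(ndims, 0);
    s.count.assign(ndims, 1);
    // Convert the linear offset into a multi-dimensional index.
    std::size_t rem = off;
    for (std::size_t d = ndims; d-- > 0;) {
      s.start[d] = rem % dims[d];
      rem /= dims[d];
    }
    // Take as many elements as fit in the remainder of this row.
    const std::size_t n = std::min(len, rowlen - s.start[ndims - 1]);
    s.count[ndims - 1] = n;
    slabs.push_back(s);
    off += n;
    len -= n;
  }
  return slabs;
}
```

For a 3 x 4 variable, the offset range 2..8 comes out as three slabs: the tail of row 0, all of row 1, and the first element of row 2.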
Since the aggregator buffer size is in bytes (not elements), fix the stride for MPI hvector accordingly while dispersing data read for multiple variables (nvars > 1, io2comp)
Remove obsolete function (empty function) that was used to close multiple soft closed files. (spio_close_soft_closed_file() is now used to close these files)
Fixing warning in bget where "char *" was used to point to const strings. (Warning : "ISO C++ forbids converting a string constant to 'char*'") The code was refactored to avoid these assignments.
Define bufsize in bget as size_t. This avoids warnings due to comparison of bufsize (which was defined as long) with the size of objects (which would be size_t). Also refactoring the buffer report function to pass in args with the correct type (instead of assuming longs, pass in args of type bufsize as needed) and use the appropriate casts (and inits).
Add missing free() for pointer to mutex when file create/open fails
Ensure that we always check the error globally (across all procs in the comm) with create/open calls, instead of using the error handler set by the user. This prevents dangling internal structs related to files that fail to open (or create)
Removing memset() of the dummy file_desc_t struct in mvcache test since it now contains C++ data structures.
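The reason is that zeroing a struct byte by byte is only valid for trivially copyable types. Once the struct holds members such as `std::string` or `std::map`, `memset()` overwrites their internal state and the behavior is undefined. A minimal sketch of the safer alternative, with a hypothetical `DummyFile` struct standing in for `file_desc_t`:

```cpp
#include <map>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a file struct that gained C++ members.
struct DummyFile {
  int fh = -1;
  std::string fname;
  std::map<int, std::string> vars;
};

// The struct is no longer trivially copyable, so memset() must not be used on it.
static_assert(!std::is_trivially_copyable<DummyFile>::value,
              "DummyFile holds C++ containers");

// Let constructors and default member initializers set up the object,
// then assign only the fields the test needs.
DummyFile make_dummy_file(int fh) {
  DummyFile f;  // fname and vars are properly constructed (empty)
  f.fh = fh;
  return f;
}
```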
Remove invalid file from the global file list when the check for unlimited dimensions fails.
Adding support for writing out fillvalues with the HDF5 iotypes
With a new data rearranger and new I/O types, tests can potentially take much longer to run. Increasing the default timeout for tests to 1800s
The spio_change_def() function handled both enddef and redef calls and was getting harder to read/maintain. Splitting spio_change_def() into separate functions for handling enddef and redef calls. Also moving this code back to the PIOc_<redef/enddef>_impl() functions and getting rid of the common spio_change_def() function. Also refactoring the code for handling enddef() calls to make it easier to read/maintain. The code for handling redef() calls was also refactored as needed.
Adding an explicit check for "define mode" in redef/enddef calls for NetCDF, HDF5 and ADIOS I/O types. Redef calls in "define mode" only fail for PnetCDF lib calls
Handle cases where close() is called without an enddef() for HDF5 I/O types (by explicitly calling enddef() before a close())
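The define-mode handling in the last two commits can be pictured as a small state machine. The sketch below is an assumption-laden illustration: the `FileState` struct, the function names, and the error codes are hypothetical and do not correspond to the library's definitions or to any I/O library's error values.

```cpp
// Hypothetical status codes for the sketch.
enum Status { STATUS_OK = 0, ERR_IN_DEFINE_MODE = -1, ERR_NOT_IN_DEFINE_MODE = -2 };

// Hypothetical per-file state.
struct FileState {
  bool define_mode = true;   // a newly created file starts in define mode
  bool defs_synced = false;  // true once definitions have been flushed
  bool is_open = true;
};

// Leave define mode; explicitly rejected when the file is already in data mode.
int file_enddef(FileState &f) {
  if (!f.define_mode) return ERR_NOT_IN_DEFINE_MODE;
  f.defs_synced = true;
  f.define_mode = false;
  return STATUS_OK;
}

// Re-enter define mode; explicitly rejected when already in define mode.
int file_redef(FileState &f) {
  if (f.define_mode) return ERR_IN_DEFINE_MODE;
  f.define_mode = true;
  f.defs_synced = false;
  return STATUS_OK;
}

// Close the file; if the caller never ended define mode, do it first so the
// pending definitions are not lost.
int file_close(FileState &f) {
  if (f.define_mode) {
    const int ret = file_enddef(f);
    if (ret != STATUS_OK) return ret;
  }
  f.is_open = false;
  return STATUS_OK;
}
```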
In I/O system tests avoid opening the file in write mode when checking the contents
Adding support for PIO_copy_att() for HDF5 I/O types.
* The spio_hdf5_enddef() function was refactored to move the sync of definitions to a separate function
* A new function, spio_hdf5_copy_att(), was added to support copying HDF5 attributes between files
* A minor fix in default case for PIOc_copy_att() impl : Using the user supplied file ids instead of using file->fh
Increasing timeout for HDF5 test from 1m to 3m
Create an AUTHORS.md file and include list of major contributors
Adding a simple AGENTS.md for the library
Add ifdefs (HAVE_PNETCDF) around pnetcdf async util functions
Add pnetcdf tests only if pnetcdf lib is found/available
amametjanov approved these changes May 6, 2026
jayeshkrishna added a commit that referenced this pull request May 6, 2026
This PR includes the following major changes,
* Asynchronous I/O support (for HDF5 iotypes)
* New Data rearranger (PIO_REARR_CONTIG)
* New I/O decomposition logger
* Moving to C++ data structures (lists etc)
* Major code refactoring (source/headers are refactored and moved)
* Bug fixes from maint/v1.9 branch
* Misc enhancements - datatype converter, capture stack trace, GPTL timers, more tests, C++ unit testing framework
* Misc bug fixes

Fixes #682
(Also includes fixes in PR #676, PR #674, PR #672)

* jayeshkrishna/async_thread_support: (194 commits)
  Disable pnetcdf tests if its not available
  ifdef pnetcdf calls in async utils
  Adding a simple AGENTS.md
  Add an AUTHORS.md file
  Increasing timelimit for hdf5 test
  Adding support for copying HDF5 atts
  Avoid wr mode when checking file in iosystem tests
  Handle hdf5 close without enddef
  Explicit check for define mode for non-PnetCDF
  Refactor spio_change_def() util function
  Increasing default timeout for tests to 30m
  Add fillval support for HDF5
  Rm invalid file if unlim dim chk fails
  Avoid memset of file_desc since it has C++ ds
  Global check on fail in create/open
  Missing free for mtx when open/create fails
  Fix warning in bget : set bufsize to size_t
  Fix warning in bget : missing const in char*
  Rm obsolete func to close soft closed files
  Skip modifying string at idx for ADIOS
  ...