Fix leiden hanging when insufficient shared memory is available. #989

Merged: 2 commits, Mar 28, 2025
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -50,6 +50,8 @@

 # BUG FIXES

+* `cluster/leiden`: Fix an issue where insufficient shared memory (size of `/dev/shm`) causes the processing to hang.
+
 * `utils/subset_vars`: Convert .var column used for subsetting of dtype "boolean" to dtype "bool" when it doesn't contain NaN values (PR #959).

 * `resources_test_scripts/annotation_test_data.sh`: Add a layer to the annotation reference dataset with log normalized counts (PR #960).
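
Note on the changelog entry above: when diagnosing this kind of hang, it can help to see how large `/dev/shm` is and how much of it is free. The following is a minimal sketch using the standard library's `shutil.disk_usage`. The helper name `describe_shared_memory` is hypothetical and not part of the component, and the path is a parameter so the same call can be tried on any mounted filesystem.

```python
import shutil


def describe_shared_memory(path: str = "/dev/shm") -> str:
    """Summarise total, used and free bytes of the filesystem mounted at `path`.

    Hypothetical helper for illustration only; it is not part of the PR.
    """
    usage = shutil.disk_usage(path)  # named tuple: (total, used, free), in bytes
    return (
        f"{path}: total={usage.total} B, "
        f"used={usage.used} B, free={usage.free} B"
    )
```

Calling `describe_shared_memory()` before running the component shows whether the free space is likely to cover the matrix that will be copied into shared memory.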
16 changes: 15 additions & 1 deletion src/cluster/leiden/script.py
@@ -45,7 +45,7 @@
     "uns_name": "leiden",
     "output_compression": "gzip",
 }
-meta = {"cpus": 8, "resources_dir": "."}
+meta = {"cpus": 8, "resources_dir": "src/utils"}
 ## VIASH END

 sys.path.append(meta["resources_dir"])
@@ -55,6 +55,13 @@
 _shared_logger_name = "leiden"


+# Function to check available space in /dev/shm
+def get_available_shared_memory():
+    shm_path = "/dev/shm"
+    shm_stats = os.statvfs(shm_path)
+    return shm_stats.f_bsize * shm_stats.f_bavail
+
+
 class SharedNumpyMatrix:
     def __init__(
         self,
@@ -70,6 +77,13 @@ def __init__(
     def from_numpy(
         cls, memory_manager: managers.SharedMemoryManager, array: npt.ArrayLike
     ):
+        available_shared_memory = get_available_shared_memory()
+        n_bytes_required = array.nbytes
+        if available_shared_memory < n_bytes_required:
+            raise ValueError(
+                "Not enough shared memory (/dev/shm) is available to load the data. "
+                f"Required amount: {n_bytes_required}, available: {available_shared_memory}."
+            )
         shm = memory_manager.SharedMemory(size=array.nbytes)
         array_in_shared_memory = np.ndarray(
             array.shape, dtype=array.dtype, buffer=shm.buf
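
The check-before-allocate pattern in the last hunk can be tried outside the component. The sketch below is standard library only and rests on assumptions that differ from the PR: it uses `shared_memory.SharedMemory` directly rather than a `SharedMemoryManager`, it takes a plain byte count instead of a NumPy array's `nbytes`, and `allocate_checked` with its `shm_path` parameter is a hypothetical name introduced for illustration. The free-space arithmetic is the same as in the diff.

```python
import os
from multiprocessing import shared_memory


def allocate_checked(
    n_bytes: int, shm_path: str = "/dev/shm"
) -> shared_memory.SharedMemory:
    """Create a shared memory block of `n_bytes`, failing fast if it cannot fit.

    Hypothetical helper for illustration; the component performs this check
    inside SharedNumpyMatrix.from_numpy instead.
    """
    stats = os.statvfs(shm_path)  # Unix-only
    available = stats.f_bsize * stats.f_bavail  # same arithmetic as the diff
    if available < n_bytes:
        # Raise before allocating, so the caller gets an error instead of a hang.
        raise ValueError(
            f"Not enough shared memory ({shm_path}) is available to load the data. "
            f"Required amount: {n_bytes}, available: {available}."
        )
    return shared_memory.SharedMemory(create=True, size=n_bytes)
```

The caller owns the returned block and should call `close()` and `unlink()` on it when done. Without a manager, nothing else cleans it up automatically in normal operation.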