-
Notifications
You must be signed in to change notification settings - Fork 8
Description
I'm getting the following error when I run kmtricks with 276 samples:
$ kmtricks pipeline --kmer-size 111 --hard-min 0 --share-min 1 --soft-min 2 --recurrence-min 3 --file data/group/xjin/r.proc.kmtricks_input.txt --run-dir <DIR> --mode kmer:count:text --threads 24
[2024-03-22 09:26:07.016] [info] Run with Kmer<128> - uint64_t[4] implementation
[2024-03-22 09:26:07.069] [info] Compute configuration...
[2024-03-22 09:26:07.069] [info] 276 samples found (552 read files).
[2024-03-22 09:26:47.828] [info] Use 156 partitions.
[2024-03-22 09:26:48.117] [info] Compute minimizer repartition...
Compute SuperK [> ] [00m:00s]
Count partitions [> ] [00:00s]
terminate called after throwing an instance of 'std::runtime_error'
what(): Unable to open <DIR>/superkmers/xjin_AB_P0R2c/skp.90
terminate called recursively
[2024-03-22 09:30:47.682] [error] Killed after receive Aborted:SIGABRT(6) signal. Demangled backtrace dumped at ./kmtricks_backtrace.log. If the problem persists, please open an issue with the return of 'kmtricks infos' and the content of ./kmtricks_backtrace.log
./kmtricks_backtrace.log:
Backtrace:
1 0x00007f8d88e41090 (null) + 140245863764112
2 0x00007f8d88e4100b gsignal + 203
3 0x00007f8d88e20859 abort + 299
4 0x00007f8d89213f00 __gnu_cxx::__verbose_terminate_handler() + 192
5 0x00007f8d8921243c (null) + 140245867766844
6 0x00007f8d8921248e (null) + 140245867766926
7 0x000055a817e6c705 (null) + 94180443866885
8 0x00007f8d88e448a7 (null) + 140245863778471
9 0x00007f8d88e44a60 on_exit + 0
10 0x000055a817f2e98a (null) + 94180444662154
11 0x00007f8d88e41090 (null) + 140245863764112
12 0x00007f8d88edb23f clock_nanosleep + 223
13 0x00007f8d88ee0ec7 nanosleep + 23
14 0x000055a817f3f95b (null) + 94180444731739
15 0x000055a8180b14d0 (null) + 94180446246096
16 0x000055a8180b2165 (null) + 94180446249317
17 0x000055a8180b9443 (null) + 94180446278723
18 0x000055a817e52afb main + 1419
19 0x00007f8d88e22083 __libc_start_main + 243
20 0x000055a817e555e5 (null) + 94180443772389
infos:
$ kmtricks infos
kmtricks v1.4.0
- HOST -
build host: Linux-6.1.3
run host: Linux 4.18.0-513.11.1.el8_9.x86_64
- BUILD -
c compiler: GNU 11.2.0
cxx compiler: GNU 11.2.0
conda: ON
static: OFF
native: OFF
modules: ON
socks: ON
howde: ON
dev: OFF
kmer: 32,64,96,128,160,192,224,256
max_c: 4294967295
- GIT SHA1 / VERSION -
kmtricks: 7dc4d18
sdsl: c32874c
bcli: 3e4f493
fmt: 0544a227
kff: 97d135e
lz4: 4de56b3
spdlog: v1.2.1-1811-g5b4c4f3f
xxhash: 6853ddc
gtest: release-1.8.0-2774-g96f4ce02
croaring: v0.3.3-17-g2d5c927
robin-hood-hasing: 24b3f50
turbop: 4ab9f5b
cfrcat: 2f9da97
indicators: v1.9-36-gcdcff01
Contact: [email protected]
I don't get the same error when I use a subset of 8 samples (instead of 276), nor when I use just 12 threads (instead of 24).
This seems pretty clearly related to Issue #15 , where kmtricks is opening too many files.
Indeed, lsof
confirms this, with the crash occuring just as the number of open files ramps up. When I use 12 threads, <1000 files are opened and it doesn't crash.
While using fewer threads works, I'd love a solution that maintains the high parallelization during other steps. Can you suggest a way to run the kmtricks pipeline so that the superkmers computation step doesn't open too many files, but I get as much parallelization as possible?
Thanks for your help, and for building a valuable tool!