Issue with HyperQueue #1193

Open
edan-bainglass opened this issue Mar 1, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@edan-bainglass
Member

edan-bainglass commented Mar 1, 2025

System: Windows 11, running in WSL2 Ubuntu 24.04.2 LTS environment

Tried to pin an aiidalab-qe container and got stuck in a loop outputting the following:

HyperQueue version: v0.19.0

You can also re-run HyperQueue server (and its workers) with the `RUST_LOG=hq=debug,tako=debug`
environment variable, and attach the logs to the issue, to provide us more information.

thread 'main' panicked at crates/tako/src/internal/common/resources/descriptor.rs:112:9:
assertion failed: size > 0
stack backtrace:
   0:     0x55e51f35fbf9 - std::backtrace_rs::backtrace::libunwind::trace::hbee8a7973eeb6c93
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
   1:     0x55e51f35fbf9 - std::backtrace_rs::backtrace::trace_unsynchronized::hc8ac75eea3aa6899
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x55e51f35fbf9 - std::sys_common::backtrace::_print_fmt::hc7f3e3b5298b1083
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:68:5
   3:     0x55e51f35fbf9 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hbb235daedd7c6190
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x55e51f0aab60 - core::fmt::rt::Argument::fmt::h76c38a80d925a410
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/rt.rs:142:9
   5:     0x55e51f0aab60 - core::fmt::write::h3ed6aeaa977c8e45
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/mod.rs:1120:17
   6:     0x55e51f32887e - std::io::Write::write_fmt::h78b18af5775fedb5
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/io/mod.rs:1810:15
   7:     0x55e51f361c2e - std::sys_common::backtrace::_print::h5d645a07e0fcfdbb
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x55e51f361c2e - std::sys_common::backtrace::print::h85035a511aafe7a8
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x55e51f3614d7 - std::panicking::default_hook::{{closure}}::hcce8cea212785a25
  10:     0x55e51f3610bf - std::panicking::default_hook::hf5fcb0f213fe709a
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:292:9
  11:     0x55e51f00eeeb - call<(&core::panic::panic_info::PanicInfo), (dyn core::ops::function::Fn<(&core::panic::panic_info::PanicInfo), Output=()> + core::marker::Send + core::marker::Sync), alloc::alloc::Global>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2029:9
  12:     0x55e51f00eeeb - {closure#0}
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:360:9
  13:     0x55e51f36221a - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::hbc5ccf4eb663e1e5
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2029:9
  14:     0x55e51f36221a - std::panicking::rust_panic_with_hook::h095fccf1dc9379ee
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:783:13
  15:     0x55e51f361f68 - std::panicking::begin_panic_handler::{{closure}}::h032ba12139b353db
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:649:13
  16:     0x55e51f361ef6 - std::sys_common::backtrace::__rust_end_short_backtrace::h9259bc2ff8fd0f76
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:171:18
  17:     0x55e51f361eef - rust_begin_unwind
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
  18:     0x55e51eee9074 - core::panicking::panic_fmt::h784f20a50eaab275
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
  19:     0x55e51eee9242 - core::panicking::panic::hb837a5ebbbe5b188
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:144:5
  20:     0x55e51f1e87d6 - simple_indices
                               at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/common/resources/descriptor.rs:112:9
  21:     0x55e51f1e87d6 - parse_cpu_definition
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/worker/parser.rs:15:19
  22:     0x55e51f2273a9 - call<fn(&str) -> core::result::Result<tako::internal::common::resources::descriptor::ResourceDescriptorKind, anyhow::Error>, (&str)>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:79:5
  23:     0x55e51f2273a9 - parse_ref<fn(&str) -> core::result::Result<tako::internal::common::resources::descriptor::ResourceDescriptorKind, anyhow::Error>, tako::internal::common::resources::descriptor::ResourceDescriptorKind, anyhow::Error>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/value_parser.rs:928:25
  24:     0x55e51f2273a9 - parse_ref<tako::internal::common::resources::descriptor::ResourceDescriptorKind>
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/client/utils.rs:56:9
  25:     0x55e51f226d75 - parse_ref_<hyperqueue::client::utils::PassthroughParser<tako::internal::common::resources::descriptor::ResourceDescriptorKind>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/value_parser.rs:773:9
  26:     0x55e51f226d75 - parse_ref_<hyperqueue::client::utils::PassThroughArgument<tako::internal::common::resources::descriptor::ResourceDescriptorKind>, hyperqueue::client::utils::PassthroughParser<tako::internal::common::resources::descriptor::ResourceDescriptorKind>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/value_parser.rs:658:25
  27:     0x55e51f08a91e - parse_ref
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/value_parser.rs:242:9
  28:     0x55e51f08a91e - push_arg_values
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:1083:27
  29:     0x55e51f070da7 - react
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:1192:21
  30:     0x55e51f06fdad - parse_opt_value
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:1037:36
  31:     0x55e51f067962 - parse_long_arg
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:801:17
  32:     0x55e51f067962 - get_matches_with
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:115:44
  33:     0x55e51f06d61e - parse_subcommand
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:720:37
  34:     0x55e51f06d61e - get_matches_with
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:474:17
  35:     0x55e51f06d61e - parse_subcommand
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:720:37
  36:     0x55e51f06d61e - get_matches_with
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:474:17
  37:     0x55e51f062f51 - _do_parse
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/command.rs:4000:29
  38:     0x55e51f0069c2 - try_get_matches_from_mut<std::env::ArgsOs, std::ffi::os_str::OsString>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/command.rs:830:9
  39:     0x55e51f0069c2 - get_matches_from<std::env::ArgsOs, std::ffi::os_str::OsString>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/command.rs:701:9
  40:     0x55e51f0069c2 - get_matches
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/command.rs:610:9
  41:     0x55e51f0069c2 - {async_block#0}
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:375:19
  42:     0x55e51eff483d - poll<&mut hq::main::{async_block_env#0}>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/future/future.rs:124:9
  43:     0x55e51eff483d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:659:57
  44:     0x55e51eff483d - with_budget<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/coop.rs:107:5
  45:     0x55e51eff483d - budget<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/coop.rs:73:5
  46:     0x55e51eff483d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:659:25
  47:     0x55e51eff483d - enter<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:404:19
  48:     0x55e51eff483d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:658:36
  49:     0x55e51eff483d - {closure#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:737:68
  50:     0x55e51eff483d - set<tokio::runtime::scheduler::Context, tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context/scoped.rs:40:9
  51:     0x55e51eff483d - {closure#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context.rs:176:26
  52:     0x55e51eff483d - try_with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:270:16
  53:     0x55e51eff483d - with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:246:9
  54:     0x55e51eff483d - set_scheduler<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context.rs:176:17
  55:     0x55e51eff483d - enter<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:737:27
  56:     0x55e51eff483d - block_on<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:646:19
  57:     0x55e51eff483d - {closure#0}<hq::main::{async_block_env#0}>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:175:28
  58:     0x55e51eff483d - enter_runtime<tokio::runtime::scheduler::current_thread::{impl#0}::block_on::{closure_env#0}<hq::main::{async_block_env#0}>, core::result::Result<(), hyperqueue::common::error::HqError>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context/runtime.rs:65:16
  59:     0x55e51eff483d - block_on<hq::main::{async_block_env#0}>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:167:9
  60:     0x55e51eff483d - block_on<hq::main::{async_block_env#0}>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/runtime.rs:348:47
  61:     0x55e51eff483d - main
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:456:5
  62:     0x55e51ef80203 - call_once<fn() -> core::result::Result<(), hyperqueue::common::error::HqError>, ()>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:250:5
  63:     0x55e51ef80203 - __rust_begin_short_backtrace<fn() -> core::result::Result<(), hyperqueue::common::error::HqError>, core::result::Result<(), hyperqueue::common::error::HqError>>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:155:18
  64:     0x55e51f00f500 - main
  65:     0x7f6009377d90 - <unknown>
  66:     0x7f6009377e40 - __libc_start_main
  67:     0x55e51ef29049 - <unknown>
  68:                0x0 - <unknown>
Oops, HyperQueue has crashed. This is a bug, sorry for that.
If you would be so kind, please report this issue at the HQ issue tracker: https://github.com/It4innovations/hyperqueue/issues/new?title=HQ%20crashes
Please include the above error (starting from "thread ... panicked ...") and the stack backtrace in the issue contents, along with the following information:
@edan-bainglass edan-bainglass added the bug Something isn't working label Mar 1, 2025
@edan-bainglass
Member Author

@superstar54 as you suggested, let's debug this on Monday 👍

@edan-bainglass
Member Author

Okay. The issue seems to be that some or all of the following paths are system-dependent:

CPU_QUOTA_PATH="/sys/fs/cgroup/cpu/cpu.cfs_quota_us"
CPU_PERIOD_PATH="/sys/fs/cgroup/cpu/cpu.cfs_period_us"
MEMORY_LIMIT_PATH="/sys/fs/cgroup/memory/memory.limit_in_bytes"

For example, @superstar54's Linux system does not have /sys/fs/cgroup/cpu/cpu.cfs_quota_us. My WSL (Linux on Windows) does have /sys/fs/cgroup/cpu/cpu.cfs_quota_us, but it contains -1 (no quota set) rather than a positive value from which the number of CPUs can be computed. As @superstar54 suggested, we can catch negative values and fall back to the default.

But that may not be general enough?
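
A more general check could look something like the sketch below. It is only an illustration, not the actual startup script: the function names `cpus_from_quota` and `detect_cpus` are hypothetical, and the cgroup v2 path `/sys/fs/cgroup/cpu.max` is an assumption that was not mentioned above. The idea is to treat a missing file, `-1`, `max`, or any other non-numeric value as "no limit" and fall back to `nproc`, and to never return less than 1 CPU.

```shell
#!/bin/sh
# cgroup v1 paths quoted earlier in this thread.
CPU_QUOTA_PATH="/sys/fs/cgroup/cpu/cpu.cfs_quota_us"
CPU_PERIOD_PATH="/sys/fs/cgroup/cpu/cpu.cfs_period_us"
# cgroup v2 equivalent (assumption: not mentioned in the thread).
CPU_MAX_PATH="/sys/fs/cgroup/cpu.max"

# Hypothetical helper: turn a quota/period pair into a CPU count.
# $1 = quota, $2 = period, $3 = fallback count used when no limit is set.
cpus_from_quota() {
    quota="$1"
    period="$2"
    fallback="$3"
    # Empty, negative ("-1"), "max", or otherwise non-numeric quota: no limit.
    case "$quota" in
        ''|*[!0-9]*) echo "$fallback"; return 0 ;;
    esac
    # Same check for the period, which must also be non-zero.
    case "$period" in
        ''|*[!0-9]*) echo "$fallback"; return 0 ;;
    esac
    if [ "$period" -eq 0 ]; then
        echo "$fallback"
        return 0
    fi
    cpus=$(( quota / period ))
    # A fractional limit (quota < period) rounds down to 0; clamp to 1.
    if [ "$cpus" -lt 1 ]; then
        cpus=1
    fi
    echo "$cpus"
}

# Hypothetical wrapper: try cgroup v1, then cgroup v2, then fall back to nproc.
detect_cpus() {
    fallback=$(nproc 2>/dev/null || echo 1)
    quota=""
    period=""
    if [ -r "$CPU_QUOTA_PATH" ] && [ -r "$CPU_PERIOD_PATH" ]; then
        quota=$(cat "$CPU_QUOTA_PATH")
        period=$(cat "$CPU_PERIOD_PATH")
    elif [ -r "$CPU_MAX_PATH" ]; then
        # cgroup v2 stores both values on one line: "<quota|max> <period>".
        read -r quota period < "$CPU_MAX_PATH" || true
    fi
    cpus_from_quota "$quota" "$period" "$fallback"
}
```

Calling `detect_cpus` would then print a positive integer on all three kinds of system described above (file missing, file containing -1, file containing a real quota). The memory limit path would need a similar guard, which is not covered here.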
