I'm experiencing an issue with the load distribution of parallel jobs executed with future_map. In particular, I observe that the first time I call future_map the workload is happening on 2-3 workers only, while on consecutive runs of the same call the workload is shared evenly across the workers.
I tried to narrow it down into a reprex:
library(future)
library(tictoc)
plan(multisession, workers=10)
tic()
res<-purrr::map(
.x=1:1e6, .f=~.x+1
)
toc()
# 1.92 sec, not in parallel
tic()
furrr::future_map(
.x=1:1e6, .f=~.x+1
)
toc()
# 3.462 sec on first run
microbenchmark::microbenchmark(
{
furrr::future_map(
.x=1:1e6, .f=~.x+1
)
},
times=20
)
# 1.2 secs on average on consecutive runs
In my "real-world" applications, where there is also a considerable amount of data to be passed to the workers, this tends to be more extreme.
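One way to make the distribution visible, as a minimal sketch: record the process ID of the worker that evaluates each element and tabulate the counts. The worker count of 4 and the input size of 1,000 are arbitrary values picked for illustration, not taken from my real workload.

```r
library(future)
library(furrr)

# Assumption: 4 workers, chosen only for illustration
plan(multisession, workers = 4)

# Each element returns the PID of the R process that evaluated it
pids <- future_map_int(1:1000, ~ Sys.getpid())

# Number of elements handled by each worker process
print(table(pids))

# Shut the workers down again
plan(sequential)
```

Comparing the table from the first call with the table from a repeated call should show whether elements are really assigned unevenly, or whether the assignment is even and only the timing differs.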
This might be related to this previous issue:
#3
I'm working on the following system:
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Any thoughts appreciated.
Best,
Maximilian