Description
I've been able to manually launch a Dask GPU cluster using Slurm successfully. The setup for that looks like:
<get allocation>
dask-scheduler --scheduler-file scheduler.json &
srun -G <number-of-gpus-in-allocation> dask-cuda-worker --scheduler-file scheduler.json
This works fine for, say, 2 nodes with 4 GPUs each, but when I request more workers, say 80 GPUs, some of them never come up, or they take an inordinately long time to come up. I had the same problem with manual startup in a CPU-only context with Dask Distributed, and the fix for on-demand startup at that scale was dask-mpi. Combined with containers, we've found this to be a very good solution for reliable Dask startup on HPC. I'm hoping for the same kind of solution with GPUs.
I've tried out dask-mpi support for dask_cuda.CUDAWorker but I can't seem to find a working invocation. I tried just adapting the above, or using something similar to what I've done before:
srun -G 40 -u python -u $(which dask-mpi) --scheduler-file scheduler.json \
--nthreads 1 --worker-class dask_cuda.CUDAWorker --dashboard-address 0
The first thing I noticed is that I have to specify --nthreads 1 because otherwise you get an error, which might be easy to fix:
TypeError: '<' not supported between instances of 'NoneType' and 'int'
return self.main(*args, **kwargs)
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/dask_mpi/cli.py", line 147, in main
asyncio.get_event_loop().run_until_complete(run_worker())
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/dask_mpi/cli.py", line 144, in run_worker
async with WorkerType(**opts) as worker:
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/dask_cuda/cuda_worker.py", line 95, in __init__
if nthreads < 1:
TypeError: '<' not supported between instances of 'NoneType' and 'int'
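The failing comparison can be reproduced without Dask. This is only a minimal sketch, assuming that dask-mpi hands nthreads=None to the worker class when --nthreads is omitted (the traceback suggests this, but I haven't confirmed it in the dask-mpi source). The names check_nthreads and check_nthreads_guarded are hypothetical and are not part of dask_cuda.

```python
def check_nthreads(nthreads):
    # Same shape as the comparison shown in the traceback (`if nthreads < 1:`).
    # With nthreads=None, Python 3 raises TypeError before the branch is taken.
    if nthreads < 1:
        raise ValueError("nthreads must be at least 1")
    return nthreads


def check_nthreads_guarded(nthreads, default=1):
    # Hypothetical guard: substitute a default when the caller passes None.
    # The default of 1 is an assumption made for this sketch only.
    if nthreads is None:
        nthreads = default
    return check_nthreads(nthreads)


try:
    check_nthreads(None)
except TypeError as exc:
    message = str(exc)
    print(message)

guarded = check_nthreads_guarded(None)
print(guarded)
```

A guard like this, either in the worker class or in the code that builds its options, would make the explicit --nthreads 1 unnecessary. Where such a guard belongs is a design question for the maintainers.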
Once you have --nthreads 1 in place though, you hit this:
Traceback (most recent call last):
File "/pscratch/sd/r/rthomas/dask/env/bin/dask-mpi", line 8, in <module>
sys.exit(go())
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/dask_mpi/cli.py", line 152, in go
main()
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/dask_mpi/cli.py", line 147, in main
asyncio.get_event_loop().run_until_complete(run_worker())
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/dask_mpi/cli.py", line 144, in run_worker
async with WorkerType(**opts) as worker:
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/dask_cuda/cuda_worker.py", line 216, in __init__
self.nannies = [
File "/pscratch/sd/r/rthomas/dask/env/lib/python3.8/site-packages/dask_cuda/cuda_worker.py", line 217, in <listcomp>
Nanny(
TypeError: __init__() got multiple values for argument 'scheduler_ip'
It looks as if a keyword argument is trying to overwrite a positional one. I started wondering how people were trying this out themselves and thought maybe they weren't using a scheduler file in the use cases tested so far, but I had trouble properly specifying options and arguments for dask_cuda.CUDAWorker via --worker-options (they seem to have no effect) or specifying SCHEDULER_ADDRESS on the command line.
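Python raises the "multiple values" error above when a value is passed positionally and the same parameter name also appears among the keyword arguments. The sketch below reproduces that pattern with stand-in classes. FakeNanny, start_nanny and start_nanny_deduplicated are hypothetical names for illustration, and the addresses are made up. It is a guess at the kind of collision involved, not a description of what dask_cuda or dask-mpi actually do internally.

```python
class FakeNanny:
    # Stand-in for a worker nanny. The first positional parameter is named
    # scheduler_ip only because that is the name in the error message.
    def __init__(self, scheduler_ip=None, scheduler_file=None, **kwargs):
        self.scheduler_ip = scheduler_ip
        self.scheduler_file = scheduler_file
        self.extra = kwargs


def start_nanny(scheduler, **kwargs):
    # Passes the scheduler positionally and forwards kwargs unchanged.
    # If kwargs also contains "scheduler_ip", the call below fails.
    return FakeNanny(scheduler, **kwargs)


def start_nanny_deduplicated(scheduler, **kwargs):
    # Hypothetical workaround: drop the duplicate key before forwarding.
    kwargs.pop("scheduler_ip", None)
    return FakeNanny(scheduler, **kwargs)


try:
    start_nanny("tcp://10.0.0.1:8786", scheduler_ip="tcp://10.0.0.2:8786")
except TypeError as exc:
    message = str(exc)
    print(message)

nanny = start_nanny_deduplicated(
    "tcp://10.0.0.1:8786", scheduler_ip="tcp://10.0.0.2:8786", nthreads=1
)
print(nanny.scheduler_ip, nanny.extra)
```

If something similar is happening here, the options assembled by dask-mpi would already carry a scheduler entry under a keyword name, while CUDAWorker passes its own scheduler argument positionally.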
My conda env, based off RAPIDS stable:
# packages in environment at /pscratch/sd/r/rthomas/dask/env:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main defaults
_openmp_mutex 4.5 1_gnu defaults
abseil-cpp 20210324.2 h9c3ff4c_0 conda-forge
aiohttp 3.7.4.post0 py38h497a2fe_0 conda-forge
anyio 3.3.0 py38h578d9bd_0 conda-forge
appdirs 1.4.4 pyh9f0ad1d_0 conda-forge
argon2-cffi 20.1.0 py38h497a2fe_2 conda-forge
arrow-cpp 4.0.1 py38h9a0cccc_7_cuda conda-forge
arrow-cpp-proc 3.0.0 cuda conda-forge
async-timeout 3.0.1 py_1000 conda-forge
async_generator 1.10 py_0 conda-forge
attrs 21.2.0 pyhd8ed1ab_0 conda-forge
aws-c-cal 0.5.11 h95a6274_0 conda-forge
aws-c-common 0.6.2 h7f98852_0 conda-forge
aws-c-event-stream 0.2.7 h3541f99_13 conda-forge
aws-c-io 0.10.5 hfb6a706_0 conda-forge
aws-checksums 0.1.11 ha31a3da_7 conda-forge
aws-sdk-cpp 1.8.186 hb4091e7_3 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
bleach 4.1.0 pyhd8ed1ab_0 conda-forge
blosc 1.21.0 h9c3ff4c_0 conda-forge
bokeh 2.3.3 py38h578d9bd_0 conda-forge
boost 1.74.0 py38hc10631b_3 conda-forge
boost-cpp 1.74.0 h312852a_4 conda-forge
brotli 1.0.9 h7f98852_5 conda-forge
brotli-bin 1.0.9 h7f98852_5 conda-forge
brotlipy 0.7.0 py38h497a2fe_1001 conda-forge
brunsli 0.1 h9c3ff4c_0 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h7f98852_1 conda-forge
ca-certificates 2021.5.30 ha878542_0 conda-forge
cachetools 4.2.2 pyhd8ed1ab_0 conda-forge
cairo 1.16.0 h6cf1ce9_1008 conda-forge
certifi 2021.5.30 py38h578d9bd_0 conda-forge
cffi 1.14.6 py38ha65f79e_0 conda-forge
cfitsio 3.470 hb418390_7 conda-forge
chardet 4.0.0 py38h578d9bd_1 conda-forge
charls 2.2.0 h9c3ff4c_0 conda-forge
charset-normalizer 2.0.4 pyhd3eb1b0_0 defaults
click 7.1.2 pyh9f0ad1d_0 conda-forge
click-plugins 1.1.1 py_0 conda-forge
cligj 0.7.2 pyhd8ed1ab_0 conda-forge
cloudpickle 1.6.0 py_0 conda-forge
colorcet 2.0.6 pyhd8ed1ab_0 conda-forge
conda 4.10.3 py38h578d9bd_0 conda-forge
conda-package-handling 1.7.3 py38h497a2fe_0 conda-forge
cryptography 3.4.7 py38ha5dfef3_0 conda-forge
cucim 21.08.01 cuda_11.0_py38_ga89f250_0 rapidsai
cudatoolkit 11.0.221 h6bb024c_0 nvidia
cudf 21.08.02 cuda_11.0_py38_gf6d31fa95d_0 rapidsai
cudf_kafka 21.08.02 py38_gf6d31fa95d_0 rapidsai
cugraph 21.08.03 cuda11.0_py38_g9e9f1570_0 rapidsai
cuml 21.08.01 cuda11.0_py38_g5c0e99300_0 rapidsai
cupy 9.0.0 py38hc350bd8_0 conda-forge
curl 7.78.0 hea6ffbf_0 conda-forge
cusignal 21.08.00 py37_g33f663e_0 rapidsai
cuspatial 21.08.01 py38_g7c0151b_0 rapidsai
custreamz 21.08.02 py38_gf6d31fa95d_0 rapidsai
cuxfilter 21.08.00 py38_g274c584_0 rapidsai
cycler 0.10.0 py_2 conda-forge
cyrus-sasl 2.1.27 h230043b_2 conda-forge
cytoolz 0.11.0 py38h497a2fe_3 conda-forge
dask 2021.7.1 pyhd8ed1ab_0 conda-forge
dask-core 2021.7.1 pyhd8ed1ab_0 conda-forge
dask-cuda 21.08.00 py38_0 rapidsai
dask-cudf 21.08.02 py38_gf6d31fa95d_0 rapidsai
dask-mpi 2.21.0+49.gccacb62 pypi_0 pypi
datashader 0.11.1 pyh9f0ad1d_0 conda-forge
datashape 0.5.4 py_1 conda-forge
debugpy 1.4.1 py38h709712a_0 conda-forge
decorator 4.4.2 py_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
distributed 2021.7.1 py38h578d9bd_0 conda-forge
dlpack 0.5 h9c3ff4c_0 conda-forge
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
expat 2.4.1 h9c3ff4c_0 conda-forge
faiss-proc 1.0.0 cuda rapidsai
fastavro 1.4.4 py38h497a2fe_0 conda-forge
fastrlock 0.6 py38h709712a_1 conda-forge
fiona 1.8.20 py38hdb5a769_0 conda-forge
fontconfig 2.13.1 hba837de_1005 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
freexl 1.0.6 h7f98852_0 conda-forge
fsspec 2021.8.1 pyhd8ed1ab_0 conda-forge
gdal 3.2.2 py38h507a4fd_7 conda-forge
geopandas 0.9.0 pyhd8ed1ab_1 conda-forge
geopandas-base 0.9.0 pyhd8ed1ab_1 conda-forge
geos 3.9.1 h9c3ff4c_2 conda-forge
geotiff 1.6.0 h4f31c25_6 conda-forge
gettext 0.19.8.1 h0b5b191_1005 conda-forge
gflags 2.2.2 he1b5a44_1004 conda-forge
giflib 5.2.1 h36c2ea0_2 conda-forge
glog 0.5.0 h48cff8f_0 conda-forge
grpc-cpp 1.39.0 hf1f433d_2 conda-forge
hdf4 4.2.15 h10796ff_3 conda-forge
hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge
heapdict 1.0.1 py_0 conda-forge
icu 68.1 h58526e2_0 conda-forge
idna 3.2 pyhd3eb1b0_0 defaults
imagecodecs 2021.7.30 py38hb5ce8f7_0 conda-forge
imageio 2.9.0 py_0 conda-forge
importlib-metadata 4.8.1 py38h578d9bd_0 conda-forge
ipykernel 6.3.1 py38he5a9106_0 conda-forge
ipython 7.27.0 py38he5a9106_0 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 7.6.4 pyhd8ed1ab_0 conda-forge
jbig 2.1 h7f98852_2003 conda-forge
jedi 0.18.0 py38h578d9bd_2 conda-forge
jinja2 3.0.1 pyhd8ed1ab_0 conda-forge
joblib 1.0.1 pyhd8ed1ab_0 conda-forge
jpeg 9d h36c2ea0_0 conda-forge
json-c 0.15 h98cffda_0 conda-forge
jsonschema 3.2.0 pyhd8ed1ab_3 conda-forge
jupyter-server-proxy 3.1.0 pyhd8ed1ab_0 conda-forge
jupyter_client 7.0.2 pyhd8ed1ab_0 conda-forge
jupyter_core 4.7.1 py38h578d9bd_0 conda-forge
jupyter_server 1.10.2 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_widgets 1.0.1 pyhd8ed1ab_0 conda-forge
jxrlib 1.1 h7f98852_2 conda-forge
kealib 1.4.14 hcc255d8_2 conda-forge
kiwisolver 1.3.1 py38h1fd1430_1 conda-forge
krb5 1.19.2 hcc1bbae_0 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.35.1 h7274673_9 defaults
lerc 2.2.1 h9c3ff4c_0 conda-forge
libaec 1.0.5 h9c3ff4c_0 conda-forge
libblas 3.9.0 11_linux64_openblas conda-forge
libbrotlicommon 1.0.9 h7f98852_5 conda-forge
libbrotlidec 1.0.9 h7f98852_5 conda-forge
libbrotlienc 1.0.9 h7f98852_5 conda-forge
libcblas 3.9.0 11_linux64_openblas conda-forge
libcucim 21.08.01 cuda11.0_ga89f250_0 rapidsai
libcudf 21.08.02 cuda11.0_gf6d31fa95d_0 rapidsai
libcudf_kafka 21.08.02 gf6d31fa95d_0 rapidsai
libcugraph 21.08.03 cuda11.0_g9e9f1570_0 rapidsai
libcuml 21.08.01 cuda11.0_g5c0e99300_0 rapidsai
libcumlprims 21.08.00 cuda11.0_g9c188ef_0 nvidia
libcurl 7.78.0 h2574ce0_0 conda-forge
libcuspatial 21.08.01 cuda11.0_g7c0151b_0 rapidsai
libdap4 3.20.6 hd7c4107_2 conda-forge
libdeflate 1.8 h7f98852_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.10 hcdb4288_3 conda-forge
libfaiss 1.7.0 cuda110h8045045_8_cuda conda-forge
libffi 3.3 he6710b0_2 defaults
libgcc-ng 9.3.0 h5101ec6_17 defaults
libgcrypt 1.9.3 h7f98852_1 conda-forge
libgdal 3.2.2 h8f005ca_7 conda-forge
libgfortran-ng 11.1.0 h69a702a_8 conda-forge
libgfortran5 11.1.0 h6c583b3_8 conda-forge
libglib 2.68.3 h3e27bee_0 conda-forge
libgomp 9.3.0 h5101ec6_17 defaults
libgpg-error 1.42 h9c3ff4c_0 conda-forge
libgsasl 1.8.0 2 conda-forge
libhwloc 2.3.0 h5e5b7d1_1 conda-forge
libiconv 1.16 h516909a_0 conda-forge
libkml 1.3.0 h238a007_1014 conda-forge
liblapack 3.9.0 11_linux64_openblas conda-forge
libllvm10 10.0.1 he513fc3_3 conda-forge
libnetcdf 4.8.0 nompi_hcd642e3_103 conda-forge
libnghttp2 1.43.0 h812cca2_0 conda-forge
libntlm 1.4 h7f98852_1002 conda-forge
libopenblas 0.3.17 pthreads_h8fe5266_1 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libpq 13.3 hd57d9b9_0 conda-forge
libprotobuf 3.16.0 h780b84a_0 conda-forge
librdkafka 1.6.1 hc49e61c_1 conda-forge
librmm 21.08.01 cuda11.0_g66cf439_0 rapidsai
librttopo 1.1.0 h1185371_6 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libspatialindex 1.9.3 h9c3ff4c_4 conda-forge
libspatialite 5.0.1 h8694cbe_5 conda-forge
libssh2 1.9.0 ha56f1ee_6 conda-forge
libstdcxx-ng 9.3.0 hd4cf53a_17 defaults
libthrift 0.14.2 he6d91bd_1 conda-forge
libtiff 4.3.0 hf544144_0 conda-forge
libutf8proc 2.6.1 h7f98852_0 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libuv 1.42.0 h7f98852_0 conda-forge
libwebp 1.2.0 h3452ae3_0 conda-forge
libwebp-base 1.2.0 h7f98852_2 conda-forge
libxcb 1.13 h7f98852_1003 conda-forge
libxgboost 1.4.2dev.rapidsai21.08 cuda11.0_0 rapidsai
libxml2 2.9.12 h72842e0_0 conda-forge
libzip 1.8.0 h4de3113_0 conda-forge
libzopfli 1.0.3 h9c3ff4c_0 conda-forge
llvmlite 0.36.0 py38h4630a5e_0 conda-forge
locket 0.2.0 py_2 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
mapclassify 2.4.3 pyhd8ed1ab_0 conda-forge
markdown 3.3.4 pyhd8ed1ab_0 conda-forge
markupsafe 2.0.1 py38h497a2fe_0 conda-forge
matplotlib-base 3.4.2 py38hcc49a3a_0 conda-forge
matplotlib-inline 0.1.2 pyhd8ed1ab_2 conda-forge
mistune 0.8.4 py38h497a2fe_1004 conda-forge
mpi4py 3.1.1 pypi_0 pypi
msgpack-python 1.0.2 py38h1fd1430_1 conda-forge
multidict 5.1.0 py38h497a2fe_1 conda-forge
multipledispatch 0.6.0 py_0 conda-forge
munch 2.5.0 py_0 conda-forge
nbclient 0.5.4 pyhd8ed1ab_0 conda-forge
nbconvert 6.1.0 py38h578d9bd_0 conda-forge
nbformat 5.1.3 pyhd8ed1ab_0 conda-forge
nccl 2.10.3.1 h96e36e3_0 conda-forge
ncurses 6.2 he6710b0_1 defaults
nest-asyncio 1.5.1 pyhd8ed1ab_0 conda-forge
networkx 2.6.2 pyhd8ed1ab_0 conda-forge
nodejs 14.17.4 h92b4a50_0 conda-forge
notebook 6.4.3 pyha770c72_0 conda-forge
numba 0.53.1 py38h8b71fd7_1 conda-forge
numpy 1.21.1 py38h9894fe3_0 conda-forge
nvtx 0.2.3 py38h497a2fe_0 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1k h7f98852_0 conda-forge
orc 1.6.9 h58a87f1_0 conda-forge
packaging 21.0 pyhd8ed1ab_0 conda-forge
pandas 1.2.5 py38h1abd341_0 conda-forge
pandoc 2.14.2 h7f98852_0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
panel 0.12.1 pyhd8ed1ab_0 conda-forge
param 1.11.1 pyh6c4a22f_0 conda-forge
parquet-cpp 1.5.1 2 conda-forge
parso 0.8.2 pyhd8ed1ab_0 conda-forge
partd 1.2.0 pyhd8ed1ab_0 conda-forge
pcre 8.45 h9c3ff4c_0 conda-forge
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 8.3.1 py38h8e6f84c_0 conda-forge
pip 21.2.4 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
pooch 1.5.1 pyhd8ed1ab_0 conda-forge
poppler 21.03.0 h93df280_0 conda-forge
poppler-data 0.4.10 0 conda-forge
postgresql 13.3 h2510834_0 conda-forge
proj 8.0.1 h277dcde_0 conda-forge
prometheus_client 0.11.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.20 pyha770c72_0 conda-forge
protobuf 3.16.0 py38h709712a_0 conda-forge
psutil 5.8.0 py38h497a2fe_1 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
py-xgboost 1.4.2dev.rapidsai21.08 cuda11.0py38_0 rapidsai
pyarrow 4.0.1 py38hdd2221d_7_cuda conda-forge
pycosat 0.6.3 py38h497a2fe_1006 conda-forge
pycparser 2.20 py_2 defaults
pyct 0.4.6 py_0 conda-forge
pyct-core 0.4.6 py_0 conda-forge
pydeck 0.5.0 pyh9f0ad1d_0 conda-forge
pygments 2.10.0 pyhd8ed1ab_0 conda-forge
pynvml 11.0.0 pyhd8ed1ab_0 conda-forge
pyopenssl 20.0.1 pyhd3eb1b0_1 defaults
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyproj 3.1.0 py38h03a1999_3 conda-forge
pyrsistent 0.17.3 py38h497a2fe_2 conda-forge
pysocks 1.7.1 py38h578d9bd_3 conda-forge
python 3.8.10 h49503c6_1_cpython conda-forge
python-confluent-kafka 1.6.0 py38h497a2fe_1 conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.8 2_cp38 conda-forge
pytz 2021.1 pyhd8ed1ab_0 conda-forge
pyviz_comms 2.1.0 pyhd8ed1ab_0 conda-forge
pywavelets 1.1.1 py38h5c078b8_3 conda-forge
pyyaml 5.4.1 py38h497a2fe_0 conda-forge
pyzmq 22.1.0 py38h2035c66_0 conda-forge
rapids 21.08.00 cuda11.0_py38_ga776927_74 rapidsai
rapids-xgboost 21.08.00 cuda11.0_py38_ga776927_74 rapidsai
re2 2021.08.01 h9c3ff4c_0 conda-forge
readline 8.1 h27cfd23_0 defaults
requests 2.26.0 pyhd3eb1b0_0 defaults
requests-unixsocket 0.2.0 py_0 conda-forge
rmm 21.08.01 cuda_11.0_py38_g66cf439_0 rapidsai
rtree 0.9.7 py38h02d302b_2 conda-forge
ruamel_yaml 0.15.80 py38h497a2fe_1004 conda-forge
s2n 1.0.10 h9b69904_0 conda-forge
scikit-image 0.18.1 py38h51da96c_0 conda-forge
scikit-learn 0.24.2 py38hdc147b9_0 conda-forge
scipy 1.7.0 py38h7b17777_1 conda-forge
send2trash 1.8.0 pyhd8ed1ab_0 conda-forge
setuptools 49.6.0 py38h578d9bd_3 conda-forge
shapely 1.7.1 py38haeee4fe_5 conda-forge
simpervisor 0.4 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyhd3eb1b0_0 defaults
snappy 1.1.8 he1b5a44_3 conda-forge
sniffio 1.2.0 py38h578d9bd_1 conda-forge
sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge
spdlog 1.8.5 h4bd325d_0 conda-forge
sqlite 3.36.0 hc218d9a_0 defaults
streamz 0.6.2 pyh44b312d_0 conda-forge
tblib 1.7.0 pyhd8ed1ab_0 conda-forge
terminado 0.11.1 py38h578d9bd_0 conda-forge
testpath 0.5.0 pyhd8ed1ab_0 conda-forge
threadpoolctl 2.2.0 pyh8a188c0_0 conda-forge
tifffile 2021.8.30 pyhd8ed1ab_0 conda-forge
tiledb 2.3.2 he87e0bf_0 conda-forge
tk 8.6.10 hbc83047_0 defaults
toolz 0.11.1 py_0 conda-forge
tornado 6.1 py38h497a2fe_1 conda-forge
tqdm 4.62.1 pyhd3eb1b0_1 defaults
traitlets 5.1.0 pyhd8ed1ab_0 conda-forge
treelite 2.0.0 py38hc9ad5e7_0 conda-forge
treelite-runtime 2.0.0 pypi_0 pypi
typing-extensions 3.10.0.0 hd8ed1ab_0 conda-forge
typing_extensions 3.10.0.0 pyha770c72_0 conda-forge
tzcode 2021a h7f98852_2 conda-forge
tzdata 2021a h5d7bf9c_0 defaults
ucx 1.9.0+gcd9efd3 cuda11.0_0 rapidsai
ucx-proc 1.0.0 gpu rapidsai
ucx-py 0.21.0 py38_gcd9efd3_0 rapidsai
urllib3 1.26.6 pyhd3eb1b0_1 defaults
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
websocket-client 0.57.0 py38h578d9bd_4 conda-forge
wheel 0.37.0 pyhd8ed1ab_1 conda-forge
widgetsnbextension 3.5.1 py38h578d9bd_4 conda-forge
xarray 0.19.0 pyhd8ed1ab_1 conda-forge
xerces-c 3.2.3 h9d8b166_2 conda-forge
xgboost 1.4.2dev.rapidsai21.08 cuda11.0py38_0 rapidsai
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.7.2 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h7b6447c_0 defaults
yaml 0.2.5 h7b6447c_0 defaults
yarl 1.6.3 py38h497a2fe_2 conda-forge
zeromq 4.3.4 h9c3ff4c_0 conda-forge
zfp 0.5.5 h9c3ff4c_5 conda-forge
zict 2.0.0 py_0 conda-forge
zipp 3.5.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.11 h7b6447c_3 defaults
zstd 1.5.0 ha95c52a_0 conda-forge
For completeness, I did this to set up my Dask cluster env and Jupyter kernel env:
- Install miniconda
- Install RAPIDS
- Build mpi4py to link against pre-existing MPI (don't think this is part of the problem)
- pip install --no-cache-dir --force git+https://github.com/dask/dask-mpi (to keep from overwriting my mpi4py)