[scenario] Anticipating port conflicts with a local active server queue on a shared computer #26
My colleagues and I work with a traditional SGE cluster which has a small handful of login nodes and several compute nodes. Everyone shares the nodes, particularly the login nodes, but even the compute nodes (even when specific cores are reserved).

Let's say Alice and Bob each log into the login node with IP address 192.168.0.2 on the cluster's local network. Each uses a local active server queue to run a large pipeline. Alice listens for servers on client ports 50000 through 50999 and launches 1000 servers to dial into them:

    daemons("tcp://192.168.0.2:50000", nodes = 1000)

Similarly, Bob listens for servers on client ports 50050 through 51049 and launches 1000 servers to dial into them:

    daemons("tcp://192.168.0.2:50050", nodes = 1000)

Alice and Bob both launch tasks which go to servers that dial into port 50100.

    mirai_alice <- mirai("Alice's task")
What will |
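For reference, the overlap between the two port ranges in this scenario can be worked out directly. This is a small sketch in base R, assuming one port per server counted up from the base port in each URL, as described above. The variable names are only illustrative and are not part of mirai.

```r
# Port ranges from the scenario: one port per server, 1000 servers each.
# (Illustrative names; the one-port-per-server layout is taken from the post.)
alice_ports <- 50000:50999
bob_ports   <- 50050:51049

# Ports that both Alice and Bob would try to listen on
overlap <- intersect(alice_ports, bob_ports)

length(overlap)
#> [1] 950
range(overlap)
#> [1] 50050 50999

# Port 50100, used in the example tasks, falls inside the overlap
50100L %in% overlap
#> [1] TRUE
```

Under these assumptions, 950 of the 1000 ports on each side are contested, so the conflict is not limited to port 50100.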
Replies: 2 comments 19 replies
I anticipate similar situations when a |
There is no way for both Bob and Alice to listen on the same port.

If they run nodes directly they will get:

Bob:

    daemons("tcp://192.168.0.2:50000")
    #> [1] "remote"

Alice:

    daemons("tcp://192.168.0.2:50000")
    #> Error in socket(protocol = "req", listen = value) : 10 | Address in use

If they add the nodes argument, they don't see the above error, as it is the background process that listens.
However, for Alice, her background queue process errors and exits. The easiest way of detecting this is simply querying daemons(), which will show 0 connections instead of 1.
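As a complement to querying daemons() afterwards, a port could also be probed before starting the queue. The following is a minimal sketch rather than anything mirai provides: `port_is_free()` is a hypothetical helper, and it relies on base R's `serverSocket()`, which I believe requires R 4.0.0 or later. It also assumes that a failed bind surfaces as an R error, and it binds on all interfaces rather than on 192.168.0.2 specifically.

```r
# Hypothetical helper (not part of mirai): try to open a listening socket
# on `port` and report whether that succeeded.
port_is_free <- function(port) {
  tryCatch({
    srv <- serverSocket(port)  # errors if the port cannot be opened
    close(srv)                 # release the port again straight away
    TRUE
  }, error = function(e) FALSE)
}

# Usage sketch: only start listening if the probe succeeds.
# The daemons() call is left commented out, as it is taken from the thread above.
if (port_is_free(50000L)) {
  # daemons("tcp://192.168.0.2:50000", nodes = 1)
  message("port 50000 looks free")
} else {
  message("port 50000 is already in use; pick another base port")
}
```

Note that this check is inherently racy: another user can claim the port between the probe and the real listen, so the daemons() query described above is still worth doing.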