
Random ports and Dockerized Ignition #166

Open
joxoby opened this issue Aug 27, 2020 · 10 comments
Labels
question Further information is requested

Comments

@joxoby
Contributor

joxoby commented Aug 27, 2020

There's a use-case in which we want to have an Ignition Gazebo instance running inside a Docker container and communicate with the host machine through the ignition-transport layer.

One of the problems is that the ports used by NodeShared are chosen randomly during construction. This makes it difficult to expose them to the host machine at container startup, since they aren't known a priori.

Questions

  • Was this use-case considered? If so, how would someone work around it?
  • If it wasn't considered, is there any interest in pursuing it?

Note: docker run has the option --network=host, which mounts the host's network stack in the container. While this approach solves the problem above, we would prefer to avoid it (it also causes other problems when running the GUI).

@diegoferigo

The network drivers used by Docker have unfortunately never played well with any robotics middleware (ROS, YARP, etc.), and I'm not surprised that Ignition Gazebo shares the same limitations. Without removing the network isolation, the only workaround I'm aware of is opening a port range large enough and hoping that no ports are allocated outside it.

With that being said, my team has been running Ignition Gazebo in containers based on Ubuntu 18.04 for a while now (and, since a few days ago, also on 20.04), and with host networking we have not experienced any GUI problems. If host networking is not a strict blocker for you, I'd suggest going with it and perhaps solving the GUI problems you faced. I know that a few groups use a VPN as a workaround, which also scales to bigger clusters, e.g. in Kubernetes. If you want to maintain the network isolation, I fear that is the only option, though it centralizes all the traffic between the nodes in a single point that could become a bottleneck.

@joxoby
Contributor Author

joxoby commented Aug 27, 2020

The network drivers used by Docker have unfortunately never played well with any robotics middleware (ROS, YARP, etc.), and I'm not surprised that Ignition Gazebo shares the same limitations.

Since ignition-transport is based on ZMQ over TCP, I don't think there should be any fundamental problems with Docker's network drivers.

Without removing the network isolation, the only workaround I'm aware of is opening a port range large enough and hoping that no ports are allocated outside it.

We could also restrict the range of permitted ports to, say, 100. That's a more manageable number and would make exposing them easier.

With that being said, my team has been running Ignition Gazebo in containers based on Ubuntu 18.04 for a while now (and, since a few days ago, also on 20.04), and with host networking we have not experienced any GUI problems.

That's interesting to hear. Adding --network=host here causes this error when trying to run the GUI:

dbus[10]: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Most likely, the application was supposed to call dbus_connection_close(), since this is a private connection.
  D-Bus not built with -rdynamic so unable to print a backtrace

If you want to maintain the network isolation, I fear that is the only option, though it centralizes all the traffic between the nodes in a single point that could become a bottleneck.

For each ignition-transport process, there's a NodeShared singleton that opens a total of 4 ports. Every Node instance is essentially a wrapper around that singleton and uses the same 4 ports as the others. In other words, the traffic is already somewhat centralized.
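
To illustrate, here's a minimal sketch (the topic names are arbitrary): both publishers below are multiplexed over the same NodeShared sockets, so the process as a whole still only holds those 4 ports.

#include <ignition/msgs/stringmsg.pb.h>
#include <ignition/transport/Node.hh>

int main()
{
  // Two independent Node objects in the same process...
  ignition::transport::Node node1;
  ignition::transport::Node node2;

  // ...are thin wrappers over the single NodeShared instance, so both
  // publishers reuse the same ZMQ sockets and therefore the same ports.
  auto pub1 = node1.Advertise<ignition::msgs::StringMsg>("/topic_a");
  auto pub2 = node2.Advertise<ignition::msgs::StringMsg>("/topic_b");
  return 0;
}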

I'm curious what your Docker configuration is; if it's no different from the official one, our problem might be specific to our system.

@joxoby
Contributor Author

joxoby commented Aug 28, 2020

An implementation to restrict the range would look something like this:

#include <stdexcept>
#include <string>

#include <zmq.hpp>

// Bind _socket to the first free port in [_minPort, _maxPort].
// _ip must include the transport prefix, e.g. "tcp://127.0.0.1".
void bindSocketToPortInRange(zmq::socket_t &_socket, const std::string &_ip,
                             int _minPort, int _maxPort)
{
  for (int port = _minPort; port <= _maxPort; ++port)
  {
    try
    {
      const auto fullAddress = _ip + ":" + std::to_string(port);
      _socket.bind(fullAddress.c_str());
      return;
    }
    catch (const zmq::error_t &)
    {
      // Port unavailable (most likely EADDRINUSE); try the next one.
    }
  }
  throw std::runtime_error("No available ports in specified range.");
}

We could add this feature via an environment variable, IGN_PORT_RANGE: when set, the transport would restrict its ports to the provided range, e.g. IGN_PORT_RANGE=40000:40100.
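
A minimal sketch of how the transport could parse such a variable (parseIgnPortRange is a hypothetical helper, not existing ign-transport code, and the fallback range is an assumption):

#include <cstdlib>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: parse IGN_PORT_RANGE ("40000:40100") into {min, max}.
// Falls back to the whole non-privileged range when the variable is unset.
std::pair<int, int> parseIgnPortRange()
{
  const char *env = std::getenv("IGN_PORT_RANGE");
  if (!env)
    return {1025, 65535};  // Assumed default when IGN_PORT_RANGE is unset.

  const std::string value(env);
  const auto sep = value.find(':');
  if (sep == std::string::npos)
    throw std::runtime_error("IGN_PORT_RANGE must have the form MIN:MAX");

  const int minPort = std::stoi(value.substr(0, sep));
  const int maxPort = std::stoi(value.substr(sep + 1));
  if (minPort < 1 || maxPort > 65535 || minPort > maxPort)
    throw std::runtime_error("Invalid IGN_PORT_RANGE bounds");

  return {minPort, maxPort};
}

NodeShared could then call bindSocketToPortInRange with the returned bounds instead of letting ZMQ pick an ephemeral port.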

@diegoferigo

Since ignition-transport is based on ZMQ over TCP, I don't think there should be any fundamental problems with Docker's network drivers.

We could also restrict the range of permitted ports to, say, 100. That's a more manageable number and would make exposing them easier.

Yes, I'm not saying they do not work. The reality is that all of them allocate ports dynamically, and that fights with Docker's default network isolation. Opening a wide port range is a workaround with limitations (moby/moby#14288).

Adding --network=host here causes this error when trying to run the GUI

I'm not using the official Docker images, and the error you posted rings a bell, even though I'm not sure where I stumbled upon it in the past. Could you try using the --init flag? My configuration is a bit complicated because we use a big Docker image as a portable, team-wide development environment with dependencies and IDEs (so it's quite heavy); think of it as a Docker-based VM :)

We could add this feature via an environment variable, IGN_PORT_RANGE: when set, the transport would restrict its ports to the provided range, e.g. IGN_PORT_RANGE=40000:40100.

I'll let the developers chime in here; I'm not an expert on the ign-transport code.

@caguero
Collaborator

caguero commented Aug 31, 2020

Before considering further changes to the code, I'd like to verify that there's an actual issue:

  1. Did you verify that the issue is not related to having different partition names in the guest and in the host? The default partition name is created from a combination of the machine name and username, so unless you specify IGN_PARTITION manually on both sides, it's almost guaranteed that the default partition names will differ (and communication will not work). See the partition sketch at the end of this comment.

  2. Did you see this tutorial?

https://ignitionrobotics.org/api/transport/9.0/relay.html

It looks like a simplified case of what you're trying to achieve, and I remember being able to communicate with a Docker container using Ignition Transport.

Do you mind verifying these two aspects?
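
For point 1, here's a minimal sketch of pinning the partition in code instead of via the environment (the name "docker_demo" is just an example; the same value must be used on both sides):

#include <ignition/transport/Node.hh>
#include <ignition/transport/NodeOptions.hh>

int main()
{
  // Use the same explicit partition on the host and inside the container;
  // the defaults (derived from hostname and username) will differ.
  ignition::transport::NodeOptions opts;
  opts.SetPartition("docker_demo");  // Example name; any shared string works.
  ignition::transport::Node node(opts);
  return 0;
}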

@joxoby
Contributor Author

joxoby commented Aug 31, 2020

Did you verify that the issue is not related to having different partition names in the guest and in the host? The default partition name is created from a combination of the machine name and username, so unless you specify IGN_PARTITION manually on both sides, it's almost guaranteed that the default partition names will differ (and communication will not work).

Yes, I'm taking care of this. Just to be clear: I'm able to communicate with the container when using --network=host.

Did you see this tutorial?
https://ignitionrobotics.org/api/transport/9.0/relay.html

I tried that tutorial without success, and I'm somewhat skeptical that it can work, based on:

By default, when you create a container, it does not publish any of its ports to the outside world. To make a port available to services outside of Docker, or to Docker containers which are not connected to the container’s network, use the --publish or -p flag. This creates a firewall rule which maps a container port to a port on the Docker host.

https://docs.docker.com/config/containers/container-networking/#published-ports

Are you positive that it is working for you?

@joxoby
Contributor Author

joxoby commented Sep 1, 2020

Furthermore, the Docker bridge network (the default configuration) does not support multicast, so discovery won't work either. It seems to me that the only way to connect to the inside of the container while preserving network isolation (i.e., not using --network=host) is to create a macvlan network: https://docs.docker.com/network/macvlan/.

@diegoferigo

That's interesting to hear. Adding --network=host here causes this error when trying to run the GUI:

dbus[10]: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Most likely, the application was supposed to call dbus_connection_close(), since this is a private connection.
 D-Bus not built with -rdynamic so unable to print a backtrace

I'm not using the official Docker images, and the error you posted rings a bell, even though I'm not sure where I stumbled upon it in the past.

FYI, I found the bell: moby/moby#38442 :) I'm still using the workaround; I'm not sure it's still necessary (from your error, I suppose it is).

@joxoby
Contributor Author

joxoby commented Sep 2, 2020

Thanks, Diego. I ended up arriving at the same solution. Nevertheless, I'll keep this issue open so we can clarify some of the Docker networking issues.

@caguero
Collaborator

caguero commented Sep 7, 2020

Furthermore, the Docker bridge network (the default configuration) does not support multicast, so discovery won't work either. It seems to me that the only way to connect to the inside of the container while preserving network isolation (i.e., not using --network=host) is to create a macvlan network: https://docs.docker.com/network/macvlan/.

When IGN_RELAY is set, the discovery layer forwards all discovery information to the relays via unicast. The communication is also bidirectional: when a relay receives a unicast message, it saves the sender's endpoint, which is then used for sending future discovery updates.
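
As a minimal sketch of the container side, assuming the host is reachable from the container at 172.17.0.1 (a placeholder address), and assuming IGN_RELAY is read when discovery starts, so it must be set before the first Node is created:

#include <cstdlib>

#include <ignition/transport/Node.hh>

int main()
{
  // Placeholder: the host's address as seen from the container.
  // With IGN_RELAY set, discovery sends its messages to this endpoint
  // via unicast instead of relying on multicast.
  setenv("IGN_RELAY", "172.17.0.1", 1);

  ignition::transport::Node node;  // Discovery now also contacts the relay.
  return 0;
}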

@chapulina added the question label on Sep 15, 2020