cleanup duplicate ipvlan and macvlan network IDs during createNetwork #2055

eyz · 2018-01-16T21:06:28Z

cleanup duplicate ipvlan and macvlan network IDs during createNetwork to prevent collisions

check if the new network uses the same interface as an existing network, as different networks of this type cannot share the same interface
if the new network ID does not match an existing network ID then we must return with an error, as the new network is not the same and cannot co-exist with the existing
attempt to delete the old duplicate network, and return an error if unsuccessful, or log info if successful

This resolves the issue shown below, where an existing copy of a network has not been removed on Docker Engine shutdown, and still persists when the config-only network is created once again on a subsequent createNetwork call

existing network config = (*ipvlan.configuration)(0xc420322e80)({\n
  ID: (string) (len=25) \"hwqzu5i4kmrd2uzi34zpnpy1m\",\n
  Mtu: (int) 0,\n
  dbIndex: (uint64) 6,\n
  dbExists: (bool) true,\n
  Internal: (bool) false,\n
  Parent: (string) (len=4) \"eth0\",\n
  IpvlanMode: (string) (len=2) \"l2\",\n
  CreatedSlaveLink: (bool) false,\n
  Ipv4Subnets: ([]*ipvlan.ipv4Subnet) (len=1 cap=4) {\n
    (*ipvlan.ipv4Subnet)(0xc4206413a0)({\n
      SubnetIP: (string) (len=13) \"10.100.0.0/16\",\n
      GwIP: (string) (len=13) \"10.100.0.1/16\"\n
    })\n
  },\n
  Ipv6Subnets: ([]*ipvlan.ipv6Subnet) <nil>\n})\n

new network config = (*ipvlan.configuration)(0xc4203c5b80)({\n
  ID: (string) (len=25) \"hwqzu5i4kmrd2uzi34zpnpy1m\",\n
  Mtu: (int) 0,\n
  dbIndex: (uint64) 0,\n
  dbExists: (bool) false,\n
  Internal: (bool) false,\n
  Parent: (string) (len=4) \"eth0\",\n
  IpvlanMode: (string) (len=2) \"l2\",\n
  CreatedSlaveLink: (bool) false,\n
  Ipv4Subnets: ([]*ipvlan.ipv4Subnet) (len=1 cap=1) {\n
    (*ipvlan.ipv4Subnet)(0xc42149af00)({\n
      SubnetIP: (string) (len=13) \"10.100.0.0/16\",\n
      GwIP: (string) (len=13) \"10.100.0.1/16\"\n
    })\n
  },\n
  Ipv6Subnets: ([]*ipvlan.ipv6Subnet) <nil>\n})\n

The symptom of an existing and now-conflicting network is usually as follows, including showing up as a rejected task's error -

docker.err.log:time="2018-01-15T23:32:21.993891743Z" level=error msg="fatal task error" error="network di-hwqzu5i4kmrd is already using parent interface eth0

… to prevent collisions - check if the new network uses the same interface as an existing network, as different networks of this type cannot share the same interface - if the new network ID does not match an existing network ID then we must return with an error, as the new network is not the same and cannot co-exist with the existing - attempt to delete the old duplicate network, and return an error if unsuccessful, or log info if successful Signed-off-by: Isaac Rodman <[email protected]>

eyz · 2018-01-16T21:14:51Z

Mentioned this to @mavenugo quite a while ago. Not sure if cleanup can be done on the shutdown side, but I think this may be safer, to cleanup if needed on creation instead.

codecov-io · 2018-01-16T21:24:50Z

Codecov Report

❗ No coverage uploaded for pull request base (master@fcf1c3b). Click here to learn what that means.
The diff coverage is 0%.

@@           Coverage Diff            @@
##             master   #2055   +/-   ##
========================================
  Coverage          ?   40.5%           
========================================
  Files             ?     138           
  Lines             ?   22185           
  Branches          ?       0           
========================================
  Hits              ?    8985           
  Misses            ?   11877           
  Partials          ?    1323

Impacted Files	Coverage Δ
drivers/macvlan/macvlan_network.go	`0% <0%> (ø)`
drivers/ipvlan/ipvlan_network.go	`0% <0%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fcf1c3b...78df1c0. Read the comment docs.

fcrisciani · 2018-01-23T22:12:47Z

@eyz just a clarification, what do you mean with the existing copy not being deleted on daemon shutdown? I guess the state is persisted in the local store.

eyz · 2018-01-23T22:36:57Z

Hi, @fcrisciani , and thanks for looking at this issue! I can attempt to clarify further if needed, but the typical behavior is that - intermittently, when I stop Docker Engine with Swarm nodes that have instantiated a config-only ipvlan network in global/swarm mode, there is often a collision when the service which uses that network comes back up again -- when the container assigned to that service (and thus using that network from before) comes up, the network occasionally is apparently being created once again -- it appears there is a collision in createNetwork when this occurs. The existing behavior, prior to my PR, will simply throw an error if the network is being re-created -- the same network ID is already presumably in the store, and now the executor is trying to create the network once again, and so a fatal error occurs in createNetwork, which results in a task-scoped error to be presented, and the container cannot be started from the task.

With this PR, if the existing network ID is found, and is being created a second time, then the "old" network ID -- the duplicate from before -- is removed first, and then createNetwork can proceed with creation of the same network, which is also using the same network ID, as confirmed during my testing.

I fully expect that there is another side of this solution, where perhaps there can be proper cleanup at shutdown. I spoke with Madhu (@mavenugo) quite a while ago, and we found that there was a workaround for the issue by deleting the network store, and re-creating, but that approach isn't sustainable. I found that the linked PR will allow createNetwork to proceed without a fatal error, and thus the containers can use the existing network once again. A more creative solution may be the actual best answer, but this PR seems to work as a solution as well.

Is that enough detail to investigate this PR as a potential desired solution?

fcrisciani · 2018-01-23T23:20:08Z

So generally speaking I would prefer an approach similar to bridge driver, where simplistically there, if the bridge ifc is already created, will just simply use it.
Here the same way I would consider to start changing the if condition like this:
if config.Parent == nw.config.Parent && config.ID != nw.config.ID { ==> then error
Else we continue with the normal processing knowing that the network being created is actually the same.
Now arrives the tricky part, we have to check that the methods like: createDummyLink, createVlanLink, are not bailing out if the interface that are creating is already there.

…ed all changes as follows - - support sending hostname and domainname in IPAM - support custom volume naming when using netapp volume driver - use dashes as seperators for container names - work in-progress DNS domainname changes - de-duplicate DNS search domains - add container "host" /etc/hosts entry equal to value of DOCKER_HOST_EXPORT_IP environment variable if defined - cleanup duplicate ipvlan and macvlan network IDs during createNetwork (moby/libnetwork#2055) - support optional skip of IPAM pool conflict checking - set Swarm tasks to orphaned state if node becomes unavailable - allow orphaned Swarm tasks assigned to a deleted node to start elseware - Swarm tasks in an orphaned state should not be allowed to restart - allow Swarm tasks in a remove state to be transitioned - prevent oldTaskTimer from allowing the slot to be prematurely updated - allow Swarm tasks to be cleaned up if they are in a pending state and are marked for removal - support service level anti-affinity via label h3o-limitActiveServiceSlotsPerNode

eyz · 2018-02-09T04:43:43Z

@fcrisciani my organization is probably moving to Kubernetes, so please adopt or close as applicable. Thanks! That said, let me know if you need any more clarification, but I probably won't be submitting code updates to this PR

rnataraja · 2018-10-30T02:07:57Z

@fcrisciani Did you mean something like this? This isssue is very predictable.

https://gist.github.com/rnataraja/d987b61d8acc8ac738641f55ebe5af0d

eyz · 2019-01-10T00:30:31Z

@fcrisciani and @mavenugo, is anyone available to take this over?

eyz mentioned this pull request Jan 16, 2018

collisions of duplicate ipvlan and macvlan network IDs during createNetwork #2056

Open

eyz mentioned this pull request Jan 24, 2018

Phantom docker network #1743

Closed

corhere added the carry-to-mobymoby label Jan 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cleanup duplicate ipvlan and macvlan network IDs during createNetwork #2055

cleanup duplicate ipvlan and macvlan network IDs during createNetwork #2055

eyz commented Jan 16, 2018 •

edited

Loading

eyz commented Jan 16, 2018

codecov-io commented Jan 16, 2018

fcrisciani commented Jan 23, 2018

eyz commented Jan 23, 2018 •

edited

Loading

fcrisciani commented Jan 23, 2018

eyz commented Feb 9, 2018

rnataraja commented Oct 30, 2018

eyz commented Jan 10, 2019

cleanup duplicate ipvlan and macvlan network IDs during createNetwork #2055

Are you sure you want to change the base?

cleanup duplicate ipvlan and macvlan network IDs during createNetwork #2055

Conversation

eyz commented Jan 16, 2018 • edited Loading

eyz commented Jan 16, 2018

codecov-io commented Jan 16, 2018

Codecov Report

fcrisciani commented Jan 23, 2018

eyz commented Jan 23, 2018 • edited Loading

fcrisciani commented Jan 23, 2018

eyz commented Feb 9, 2018

rnataraja commented Oct 30, 2018

eyz commented Jan 10, 2019

eyz commented Jan 16, 2018 •

edited

Loading

eyz commented Jan 23, 2018 •

edited

Loading