Fix bug: In RefreshContainers, containers created after ListContainers are lost #2190
…s are lost Signed-off-by: menghui.chen <[email protected]>
Signed-off-by: menghui.chen <[email protected]>
The Jenkins build Swarm-PRs (engine master) failed because of a timeout, but it seems to have nothing to do with my commit. |
How could I re-trigger the check process? |
I forced a rebuild. |
@dongluochen I think this issue is severe, especially when docker is under heavy load, because it makes swarm fail to create and run containers. Is anyone following this? |
I've seen similar results, @jimmycmh, when running swarmbench against the cluster. See the resulting panic here: aluzzardi/swarm-bench#4. This appears to align with your findings (docker 1.11.1). |
@jimmycmh Swarm manager relies on events to update them.
|
@chuckbutler The error is returned in your test as follows. It's not a problem in swarm. It's on the Engine side where
|
@dongluochen I assumed these were related based on the output. Does this belong in the docker-engine project? (Sorry about adding noise to an otherwise normal PR.) I'm happy to migrate there. |
@chuckbutler I don't know the history of this problem. Swarm reports the error explicitly in change e1daa7c. You may check with Docker Engine. |
@dongluochen The "Container created but refresh didn't report it back" error is not docker engine's fault but swarm's, and this pull request is meant to fix it. |
@jimmycmh Sorry for my delay. I didn't get time to work on it. I think your change is better than the current implementation, but it's not complete. Let's say container The real problem is between |
@dongluochen Totally agree. Since we cannot lock swarm for the whole update process, there will be inconsistency. My PR fixes the inconsistency with new containers. Glad to see you have a better plan. |
Has there been any change on this? I'm still seeing the container refresh bug when I attempt to benchmark a swarm cluster with swarm bench. |
@chuckbutler You can merge my PR. I have been using it in production for a long time, and it's working well. Sure, this solution is not perfect, but it solves most of the problem. |
@jimmycmh @nishanttotla @allencloud Do you think this approach is better than the current approach? If yes, we can merge it. |
Any chance of merging this PR? |
@jimmycmh can you squash your commits and rebase the PR? We'll re-evaluate this for the next release. |
Please sign your commits following these rules:
$ git clone -b "master" [email protected]:jimmycmh/swarm.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354559712
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -f
Amending updates the existing PR. You DO NOT need to open a new one. |
Signed-off-by: jimmycmh <[email protected]>
some problem here. @jimmycmh care to follow up? |
In the function RefreshContainers, containers created after ListContainers returns are lost, which makes swarm return the "Container created but refresh didn't report it back" error.
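To make the race concrete, here is a minimal, self-contained Go sketch. It is not swarm's actual code and not necessarily what this PR does: `containerCache`, `refreshReplace`, and `refreshMerge` are hypothetical names, and the `list` callback stands in for a ListContainers call. It only illustrates why overwriting the cached map with a listing result can drop a container created while the listing was in flight, and one possible way to keep such containers by merging instead of replacing.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// containerCache is a hypothetical stand-in for an engine's in-memory container map.
type containerCache struct {
	mu         sync.Mutex
	containers map[string]struct{}
}

func newContainerCache(ids ...string) *containerCache {
	c := &containerCache{containers: make(map[string]struct{})}
	for _, id := range ids {
		c.containers[id] = struct{}{}
	}
	return c
}

// add records a container that was just created.
func (c *containerCache) add(id string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.containers[id] = struct{}{}
}

// refreshReplace mimics the lossy behaviour: the listing result overwrites the
// cache, so anything added after the listing was taken disappears.
func (c *containerCache) refreshReplace(list func() []string) {
	fresh := make(map[string]struct{})
	for _, id := range list() {
		fresh[id] = struct{}{}
	}
	c.mu.Lock()
	c.containers = fresh
	c.mu.Unlock()
}

// refreshMerge remembers which IDs were cached before the listing started and
// keeps any ID that appeared in the cache afterwards, since the listing may
// not have seen it yet. IDs that were cached before and are absent from the
// listing are still dropped as removed.
func (c *containerCache) refreshMerge(list func() []string) {
	c.mu.Lock()
	before := make(map[string]struct{}, len(c.containers))
	for id := range c.containers {
		before[id] = struct{}{}
	}
	c.mu.Unlock()

	fresh := make(map[string]struct{})
	for _, id := range list() { // slow call, made without holding the lock
		fresh[id] = struct{}{}
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	for id := range c.containers {
		if _, known := before[id]; !known {
			fresh[id] = struct{}{} // created while the listing was in flight
		}
	}
	c.containers = fresh
}

// ids returns the cached container IDs in sorted order.
func (c *containerCache) ids() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := make([]string, 0, len(c.containers))
	for id := range c.containers {
		out = append(out, id)
	}
	sort.Strings(out)
	return out
}

func main() {
	// The callback simulates a container "b" being created right after the
	// listing snapshot was taken but before the cache is updated.
	lossy := newContainerCache("a")
	lossy.refreshReplace(func() []string { lossy.add("b"); return []string{"a"} })
	fmt.Println(lossy.ids()) // [a]

	merged := newContainerCache("a")
	merged.refreshMerge(func() []string { merged.add("b"); return []string{"a"} })
	fmt.Println(merged.ids()) // [a b]
}
```

As noted in the discussion above, a merge like this only addresses newly created containers; other inconsistencies that arise because the whole update cannot be locked remain out of scope for the sketch.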