
Commit 4451eb9

kubevirt/connection-problem-with-pod-bridge.md
1 parent 2650798 commit 4451eb9

# Connection problem with `kubevirt.io/allow-pod-bridge-network-live-migration` after live migration

## HCP Cluster sendling:

```bash
$ oc get nodes
NAME                      STATUS   ROLES    AGE   VERSION
sendling-d0c14274-6nbvl   Ready    worker   11d   v1.27.8+4fab27b
sendling-d0c14274-sz7rb   Ready    worker   11d   v1.27.8+4fab27b
```

<details>
<summary>Ping check details node/sendling-d0c14274-6nbvl</summary>

```bash
$ oc debug node/sendling-d0c14274-6nbvl
Starting pod/sendling-d0c14274-6nbvl-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.128.8.133
If you don't see a command prompt, try pressing enter.
sh-4.4# ping www.google.de
PING www.google.de (172.253.62.94) 56(84) bytes of data.
64 bytes from bc-in-f94.1e100.net (172.253.62.94): icmp_seq=1 ttl=99 time=112 ms
64 bytes from bc-in-f94.1e100.net (172.253.62.94): icmp_seq=2 ttl=99 time=98.3 ms
^C
--- www.google.de ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 98.310/105.047/111.785/6.745 ms
sh-4.4# exit
exit

Removing debug pod ...
```
</details>

<details>
<summary>Ping check details node/sendling-d0c14274-sz7rb</summary>

```bash
$ oc debug node/sendling-d0c14274-sz7rb
Starting pod/sendling-d0c14274-sz7rb-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.131.9.28
If you don't see a command prompt, try pressing enter.
sh-4.4# ping www.google.de
PING www.google.de (172.253.62.94) 56(84) bytes of data.
```
</details>

* Node sendling-d0c14274-**6nbvl** - Ping google ✅
* Node sendling-d0c14274-**sz7rb** - Ping google ❌
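
The manual check above can be repeated for every node with a small loop. This is only a sketch: the function names, the `PING_TARGET` variable and the `-c 2 -W 2` ping options are my own choices, and it assumes the debug image ships `ping`, as it did in the sessions above. It greps the ping summary instead of relying on the exit code of `oc debug`.

```bash
#!/usr/bin/env bash
# Sketch: run the same ping check from a debug pod on every node.
# PING_TARGET and the ping options are assumptions, not part of the original notes.
PING_TARGET="${PING_TARGET:-www.google.de}"

check_node_egress() {
  local node="$1" output
  # -c 2: send two probes; -W 2: wait at most 2 seconds per reply,
  # so a broken node does not hang like the interactive ping did.
  output="$(oc debug "$node" -- ping -c 2 -W 2 "$PING_TARGET" 2>&1)"
  # ", 0% packet loss" matches a clean run but not "100% packet loss".
  if printf '%s\n' "$output" | grep -q ', 0% packet loss'; then
    echo "$node OK"
  else
    echo "$node FAILED"
  fi
}

check_all_nodes() {
  local node
  # `oc get nodes -o name` prints entries such as node/sendling-d0c14274-6nbvl
  for node in $(oc get nodes -o name); do
    check_node_egress "$node"
  done
}
```

Run `check_all_nodes` while logged in to the hosted cluster; each node gets one `OK` or `FAILED` line.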

```bash
$ oc get pods -l kubevirt.io=virt-launcher -o wide -n rbohne-hcp-sendling
NAME                                          READY   STATUS      RESTARTS   AGE     IP             NODE                 NOMINATED NODE   READINESS GATES
virt-launcher-sendling-d0c14274-6nbvl-pb6zd   1/1     Running     0          6d2h    10.128.8.133   inf8                 <none>           1/1
virt-launcher-sendling-d0c14274-sz7rb-cw5vj   1/1     Running     0          3d20h   10.131.9.28    ucs-blade-server-1   <none>           1/1
virt-launcher-sendling-d0c14274-sz7rb-mbmv8   0/1     Completed   0          3d20h   10.131.9.28    ucs-blade-server-3   <none>           1/1
virt-launcher-sendling-d0c14274-sz7rb-nb25r   0/1     Completed   0          6d2h    10.131.9.28    ucs-blade-server-1   <none>           1/1
```
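
In the listing above, three virt-launcher pods for sz7rb carry the same pod IP (10.131.9.28): one Running and two Completed. A sketch to spot such shared IPs without reading the table by eye follows; the function name `shared_pod_ips` and the four-column input layout are my own choices. Note that with `custom-columns` a Completed pod shows the phase `Succeeded`.

```bash
# Sketch: list pod IPs that are used by more than one pod.
# Expects four whitespace-separated columns on stdin: NAME PHASE IP NODE.
shared_pod_ips() {
  awk '{
         count[$3]++
         pods[$3] = pods[$3] "  " $1 " (" $2 " on " $4 ")\n"
       }
       END {
         for (ip in count)
           if (count[ip] > 1)
             printf "%s used by %d pods:\n%s", ip, count[ip], pods[ip]
       }'
}
```

Possible usage against the infra cluster:

```bash
oc get pods -l kubevirt.io=virt-launcher -n rbohne-hcp-sendling --no-headers \
  -o custom-columns=NAME:.metadata.name,PHASE:.status.phase,IP:.status.podIP,NODE:.spec.nodeName \
  | shared_pod_ips
```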

### Check node routing:

Host subnets:
```bash
$ oc get nodes -o custom-columns="NODE:.metadata.name,host-cidr:.metadata.annotations.k8s\.ovn\.org/host-cidrs,node-subnets:.metadata.annotations.k8s\.ovn\.org/node-subnets"
NODE                 host-cidr                                           node-subnets
inf4                 ["10.32.96.4/20"]                                   {"default":["10.128.0.0/21"]}
inf44                ["10.32.96.44/20"]                                  {"default":["10.128.8.0/21"]}
inf5                 ["10.32.96.5/20","10.32.98.1/32","10.32.98.2/32"]   {"default":["10.130.0.0/21"]}
inf6                 ["10.32.96.6/20"]                                   {"default":["10.129.0.0/21"]}
inf7                 ["10.32.96.7/20"]                                   {"default":["10.128.16.0/21"]}
inf8                 ["10.32.96.8/20"]                                   {"default":["10.131.8.0/21"]}
ucs-blade-server-1   ["10.32.96.101/20"]                                 {"default":["10.131.0.0/21"]}
ucs-blade-server-3   ["10.32.96.103/20"]                                 {"default":["10.130.8.0/21"]}
...

$ oc get pods -n openshift-ovn-kubernetes -o wide -l app=ovnkube-node
NAME                 READY   STATUS    RESTARTS       AGE    IP             NODE                 NOMINATED NODE   READINESS GATES
...
ovnkube-node-9xt5n   8/8     Running   8              2d7h   10.32.96.101   ucs-blade-server-1   <none>           <none>
ovnkube-node-hhsx5   8/8     Running   8              2d7h   10.32.96.8     inf8                 <none>           <none>
ovnkube-node-qx9bh   8/8     Running   9 (2d6h ago)   2d7h   10.32.96.103   ucs-blade-server-3   <none>           <none>
...

$ oc exec -n openshift-ovn-kubernetes -c ovn-controller ovnkube-node-9xt5n -- ovn-nbctl lr-route-list ovn_cluster_router
IPv4 Routes
Route Table <main>:
10.128.8.133     100.88.0.9      dst-ip
10.129.8.107     10.129.8.107    dst-ip rtos-ucs-blade-server-1 ecmp
10.129.8.107     100.88.0.8      dst-ip ecmp
10.130.10.29     10.130.10.29    dst-ip rtos-ucs-blade-server-1
10.131.8.41      10.131.8.41     dst-ip rtos-ucs-blade-server-1
10.131.9.28      10.131.9.28     dst-ip rtos-ucs-blade-server-1 ecmp
10.131.9.28      100.88.0.8      dst-ip ecmp
10.131.9.44      10.131.9.44     dst-ip rtos-ucs-blade-server-1
100.64.0.2       100.88.0.2      dst-ip
100.64.0.3       100.88.0.3      dst-ip
100.64.0.4       100.88.0.4      dst-ip
100.64.0.5       100.64.0.5      dst-ip
100.64.0.6       100.88.0.6      dst-ip
100.64.0.8       100.88.0.8      dst-ip
100.64.0.9       100.88.0.9      dst-ip
100.64.0.10      100.88.0.10     dst-ip
10.128.0.0/21    100.88.0.2      dst-ip
10.128.8.0/21    100.88.0.6      dst-ip
10.128.16.0/21   100.88.0.10     dst-ip
10.129.0.0/21    100.88.0.3      dst-ip
10.130.0.0/21    100.88.0.4      dst-ip
10.130.8.0/21    100.88.0.8      dst-ip
10.131.8.0/21    100.88.0.9      dst-ip
10.128.0.0/14    100.64.0.5      src-ip

$ oc exec -n openshift-ovn-kubernetes -c ovn-controller ovnkube-node-hhsx5 -- ovn-nbctl lr-route-list ovn_cluster_router
IPv4 Routes
Route Table <main>:
10.128.8.133     10.128.8.133    dst-ip rtos-inf8
10.129.8.107     100.88.0.5      dst-ip ecmp
10.129.8.107     100.88.0.8      dst-ip ecmp
10.130.10.29     100.88.0.5      dst-ip
10.131.8.41      100.88.0.5      dst-ip
10.131.9.28      100.88.0.5      dst-ip ecmp
10.131.9.28      100.88.0.8      dst-ip ecmp
10.131.9.44      100.88.0.5      dst-ip
100.64.0.2       100.88.0.2      dst-ip
100.64.0.3       100.88.0.3      dst-ip
100.64.0.4       100.88.0.4      dst-ip
100.64.0.5       100.88.0.5      dst-ip
100.64.0.6       100.88.0.6      dst-ip
100.64.0.8       100.88.0.8      dst-ip
100.64.0.9       100.64.0.9      dst-ip
100.64.0.10      100.88.0.10     dst-ip
10.128.0.0/21    100.88.0.2      dst-ip
10.128.8.0/21    100.88.0.6      dst-ip
10.128.16.0/21   100.88.0.10     dst-ip
10.129.0.0/21    100.88.0.3      dst-ip
10.130.0.0/21    100.88.0.4      dst-ip
10.130.8.0/21    100.88.0.8      dst-ip
10.131.0.0/21    100.88.0.5      dst-ip
10.128.0.0/14    100.64.0.9      src-ip

$ oc exec -n openshift-ovn-kubernetes -c ovn-controller ovnkube-node-qx9bh -- ovn-nbctl lr-route-list ovn_cluster_router
IPv4 Routes
Route Table <main>:
10.128.8.133     100.88.0.9      dst-ip
10.129.8.107     100.88.0.5      dst-ip
10.130.10.29     100.88.0.5      dst-ip
10.131.8.41      100.88.0.5      dst-ip
10.131.9.28      100.88.0.5      dst-ip
10.131.9.44      100.88.0.5      dst-ip
100.64.0.2       100.88.0.2      dst-ip
100.64.0.3       100.88.0.3      dst-ip
100.64.0.4       100.88.0.4      dst-ip
100.64.0.5       100.88.0.5      dst-ip
100.64.0.6       100.88.0.6      dst-ip
100.64.0.8       100.64.0.8      dst-ip
100.64.0.9       100.88.0.9      dst-ip
100.64.0.10      100.88.0.10     dst-ip
10.128.0.0/21    100.88.0.2      dst-ip
10.128.8.0/21    100.88.0.6      dst-ip
10.128.16.0/21   100.88.0.10     dst-ip
10.129.0.0/21    100.88.0.3      dst-ip
10.130.0.0/21    100.88.0.4      dst-ip
10.131.0.0/21    100.88.0.5      dst-ip
10.131.8.0/21    100.88.0.9      dst-ip
10.128.0.0/14    100.64.0.8      src-ip
```
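
One thing that stands out in the route lists: on ovnkube-node-9xt5n and ovnkube-node-hhsx5, 10.131.9.28 and 10.129.8.107 each have two `ecmp` routes, and one of them points at 100.88.0.8. Judging by the `10.130.8.0/21 100.88.0.8` row and the host subnet table, 100.88.0.8 appears to belong to ucs-blade-server-3, where the sz7rb virt-launcher pod is already Completed. Whether this is the cause is not confirmed. A sketch to pull such destinations out of the output (the function name `ecmp_destinations` is my own):

```bash
# Sketch: read `ovn-nbctl lr-route-list` output on stdin and print every
# destination that has more than one route flagged `ecmp`, with its next hops.
ecmp_destinations() {
  awk '$NF == "ecmp" { hops[$1] = hops[$1] " " $2; n[$1]++ }
       END { for (d in n) if (n[d] > 1) print d ":" hops[d] }' | sort
}
```

Possible usage, reusing the command from above:

```bash
oc exec -n openshift-ovn-kubernetes -c ovn-controller ovnkube-node-9xt5n -- \
  ovn-nbctl lr-route-list ovn_cluster_router | ecmp_destinations
```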

### Run an nginx build on the broken node (sz7rb)

```bash
$ oc adm cordon node/sendling-d0c14274-6nbvl
node/sendling-d0c14274-6nbvl already cordoned
$ oc get nodes
NAME                      STATUS                     ROLES    AGE   VERSION
sendling-d0c14274-6nbvl   Ready,SchedulingDisabled   worker   12d   v1.27.8+4fab27b
sendling-d0c14274-sz7rb   Ready                      worker   12d   v1.27.8+4fab27b
$ oc project demo
Now using project "demo" on server "https://10.32.98.158:6443".
$ oc get bc
NAME           TYPE     FROM   LATEST
nginx-sample   Source   Git    1
$ oc start-build --follow nginx-sample
build.build.openshift.io/nginx-sample-2 started
Failed to stream the build logs - to view the logs, run oc logs build/nginx-sample-2
Error: unable to stream the build logs; caused by: unable to wait for build nginx-sample-2 to run: timed out waiting for the condition
$ oc get pods -o wide
NAME                            READY   STATUS             RESTARTS   AGE    IP            NODE                      NOMINATED NODE   READINESS GATES
nginx-sample-2-build            0/1     Init:0/2           0          51s    <none>        sendling-d0c14274-sz7rb   <none>           <none>
nginx-sample-69d8c49d7d-6n42t   0/1     ImagePullBackOff   0          5m5s   10.135.0.95   sendling-d0c14274-6nbvl   <none>           <none>

$ oc get pods -o wide
NAME                            READY   STATUS                  RESTARTS   AGE     IP            NODE                      NOMINATED NODE   READINESS GATES
nginx-sample-2-build            0/1     Init:ImagePullBackOff   0          117s    10.134.0.3    sendling-d0c14274-sz7rb   <none>           <none>
nginx-sample-69d8c49d7d-6n42t   0/1     ImagePullBackOff        0          6m11s   10.135.0.95   sendling-d0c14274-6nbvl   <none>           <none>
```

**Error:**

```
Failed to pull image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d2401f5d873de313176e23a61c7f4d5638e3683abc4cf20b98b82f11db73a9c0": rpc error: code = Unknown desc = copying system image from manifest list: parsing image configuration: Get "https://cdn02.quay.io/sha256/6f/6f6b4ec38c832e38d3e6d08187ed522c0413fabcbfe1f695b875f59ea00dc154?username=openshift-release-dev%2Bocm_access_fd3b76d3f252448fa62fb5587c6d22db&namespace=openshift-release-dev&Expires=1704299495&Signature=WyjvBndXGi5LtWAgX14ub0034mnNvueEqLb~~FZW9t4pcDMBCDP6kpoD7ld76q-ZzaFrerQYeWFm1NGnbyAkVnoo8zlaybwllfBrYOxxJ3JbY-YJyOZS105LVbjJaKaJhCGHldyBFyDxNiSkojI9U8OUJECW7MbqXgmqhvWOFFtOTbaeDLTnxJl~iTfhRUO4gAwYhge0uGzWiIwRogD6rPVtvsr7lkMVOYKqWp-BQKM28SPlvbYVWejSGmtPO4inQbGStmGraBGGts8x9d731Ikyq1Xc5knL4Mf8jeUkHeK~ChtGw5~JlutuPhu3v4wRLe-cznc5x8g8WEMG7dUlQA__&Key-Pair-Id=APKAJ67PQLWGCSP66DGA": dial tcp: lookup cdn02.quay.io on 172.30.0.10:53: read udp 10.131.9.28:37668->172.30.0.10:53: i/o timeout
```
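
The interesting part of that long message is at the very end: the DNS query that timed out was sent from 10.131.9.28, the same IP the sz7rb virt-launcher pods carry, to 172.30.0.10:53. A sketch to extract those endpoints from such an error line (the function name `dns_timeout_endpoints` is my own, and the pattern only covers this exact `read udp ... i/o timeout` wording):

```bash
# Sketch: pull source IP, DNS server and port out of a Go-style
# "read udp SRC:PORT->DST:PORT: i/o timeout" error line read from stdin.
dns_timeout_endpoints() {
  sed -n 's/.*read udp \([0-9.]*\):[0-9]*->\([0-9.]*\):\([0-9]*\): i\/o timeout.*/src=\1 dst=\2 port=\3/p'
}
```

Possible usage: pipe the pod events or the error text through it, for example `oc describe pod nginx-sample-2-build | dns_timeout_endpoints`.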

After some time:

```bash
$ oc get pods -o wide
NAME                           READY   STATUS      RESTARTS   AGE     IP           NODE                      NOMINATED NODE   READINESS GATES
nginx-sample-2-build           0/1     Completed   0          4m42s   10.134.0.3   sendling-d0c14274-sz7rb   <none>           <none>
nginx-sample-d54f74d9c-2trhc   1/1     Running     0          95s     10.134.0.6   sendling-d0c14274-sz7rb   <none>           <none>
```

![I've no clue](https://media.giphy.com/media/zDIgL7AHiVHo6lOanB/giphy.gif)
