Hi all. I have compiled LXC/LXD from source on a Rocks64 (a board similar to an RPi) and set up an LXD cluster across 2 nodes. I’ve created a FAN network (bridged), and containers get their IPs normally. The problem is that two containers, each scheduled on a different node, cannot resolve or reach each other (no ping, no SSH, …), either through DNS (.lxd) or by IP address. I am on kernel 5.6, Ubuntu 18.04, with AppArmor and squashfs disabled (compilation issues).
You can see additional info below:
lxc list
+------+---------+----------------------+------+-----------+-----------+----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS | LOCATION |
+------+---------+----------------------+------+-----------+-----------+----------+
| c3 | RUNNING | 240.143.0.107 (eth0) | | CONTAINER | 0 | pc3 |
+------+---------+----------------------+------+-----------+-----------+----------+
| c5 | RUNNING | 240.145.0.26 (eth0) | | CONTAINER | 0 | pc5 |
+------+---------+----------------------+------+-----------+-----------+----------+
Both are privileged containers (just to avoid any problems with permissions or anything…). If I am inside c3 and ping/SSH c5 by name, I get a name resolution error, and if I try the IP instead, I get Destination Host Unreachable:
root@c3:~# ping c5.lxd
ping: c5.lxd: Name or service not known
root@c3:~# ping 240.145.0.26
PING 240.145.0.26 (240.145.0.26) 56(84) bytes of data.
From 240.143.0.107 icmp_seq=1 Destination Host Unreachable
I ran tcpdump on c5’s host while pinging c3.lxd from c5, and got this:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lxdfan0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:20:47.900475 IP 240.145.0.26.59218 > 240.145.0.1.53: 459+ [1au] A? c3.lxd. (35)
23:20:47.901639 IP 240.145.0.26.44235 > 240.145.0.1.53: 38912+ [1au] AAAA? c3.lxd. (35)
23:20:47.901780 ARP, Request who-has 240.143.0.1 tell 240.145.0.1, length 28
23:20:48.920457 ARP, Request who-has 240.143.0.1 tell 240.145.0.1, length 28
23:20:49.540627 IP 240.145.0.26.59218 > 240.145.0.1.53: 459+ [1au] A? c3.lxd. (35)
23:20:49.541026 IP 240.145.0.26.44235 > 240.145.0.1.53: 38912+ [1au] AAAA? c3.lxd. (35)
23:20:49.902708 IP 240.145.0.1.53 > 240.145.0.26.59218: 459 NXDomain 0/0/0 (24)
23:20:49.903460 IP 240.145.0.26.59218 > 240.145.0.1.53: 4159+ A? c3.lxd. (24)
23:20:49.904745 IP 240.145.0.1.53 > 240.145.0.26.44235: 38912 NXDomain 0/0/0 (24)
23:20:49.905272 IP 240.145.0.26.44235 > 240.145.0.1.53: 40536+ AAAA? c3.lxd. (24)
23:20:49.944425 ARP, Request who-has 240.143.0.1 tell 240.145.0.1, length 28
23:20:51.540621 IP 240.145.0.26.44235 > 240.145.0.1.53: 40536+ AAAA? c3.lxd. (24)
23:20:51.542151 ARP, Request who-has 240.143.0.1 tell 240.145.0.1, length 28
23:20:51.906954 IP 240.145.0.1.53 > 240.145.0.26.59218: 4159 NXDomain 0/0/0 (24)
23:20:51.908688 IP 240.145.0.1.53 > 240.145.0.26.44235: 40536 NXDomain 0/0/0 (24)
23:20:51.910896 IP 240.145.0.26.40212 > 240.145.0.1.53: 35091+ A? c3.lxd.lxd. (28)
23:20:51.912003 IP 240.145.0.26.37653 > 240.145.0.1.53: 19981+ AAAA? c3.lxd.lxd. (28)
23:20:52.568456 ARP, Request who-has 240.143.0.1 tell 240.145.0.1, length 28
23:20:52.984448 ARP, Request who-has 240.145.0.1 tell 240.145.0.26, length 28
23:20:52.984640 ARP, Reply 240.145.0.1 is-at 00:16:3e:85:b2:7b, length 28
23:20:53.592453 ARP, Request who-has 240.143.0.1 tell 240.145.0.1, length 28
23:20:53.914226 IP 240.145.0.1.53 > 240.145.0.26.40212: 35091 NXDomain 0/0/0 (28)
23:20:53.914803 IP 240.145.0.1.53 > 240.145.0.26.37653: 19981 NXDomain 0/0/0 (28)
23:20:55.032449 ARP, Request who-has 240.145.0.26 tell 240.145.0.1, length 28
23:20:55.032623 ARP, Reply 240.145.0.26 is-at 00:16:3e:d3:e5:02, length 28
And when I try pinging c3’s IP directly, I get:
21:38:58.721910 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
21:38:59.736454 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
21:39:00.760477 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
21:39:01.785296 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
21:39:02.808478 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
21:39:03.832478 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
21:39:04.856590 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
21:39:05.880452 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
21:39:06.904451 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28
...
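To make the pattern in the two captures above easier to spot, here is a minimal sketch that lists the ARP targets that were asked for but never answered. It only assumes the tcpdump line format shown above; the function name `unanswered_arp` and the `capture` sample are illustrative, not part of any tool.

```python
import re
from collections import Counter

# Patterns match the tcpdump ARP lines exactly as they appear in the captures above.
ARP_REQUEST = re.compile(r"ARP, Request who-has (\S+) tell (\S+),")
ARP_REPLY = re.compile(r"ARP, Reply (\S+) is-at (\S+),")


def unanswered_arp(lines):
    """Return {target_ip: request_count} for targets that never sent an ARP reply."""
    requests = Counter()
    answered = set()
    for line in lines:
        match = ARP_REQUEST.search(line)
        if match:
            requests[match.group(1)] += 1
            continue
        match = ARP_REPLY.search(line)
        if match:
            answered.add(match.group(1))
    return {ip: count for ip, count in requests.items() if ip not in answered}


# A few lines copied from the captures above.
capture = [
    "23:20:47.901780 ARP, Request who-has 240.143.0.1 tell 240.145.0.1, length 28",
    "23:20:48.920457 ARP, Request who-has 240.143.0.1 tell 240.145.0.1, length 28",
    "23:20:52.984448 ARP, Request who-has 240.145.0.1 tell 240.145.0.26, length 28",
    "23:20:52.984640 ARP, Reply 240.145.0.1 is-at 00:16:3e:85:b2:7b, length 28",
    "21:38:58.721910 ARP, Request who-has 240.143.0.107 tell 240.145.0.26, length 28",
]

print(unanswered_arp(capture))
# {'240.143.0.1': 2, '240.143.0.107': 1}
```

In this sample only the local gateway (240.145.0.1) answers; both addresses in the other node's 240.143.x.x range stay unresolved, which matches what the captures show.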
dnsmasq is running on both nodes:
Host 1:
lxd 4667 0.0 0.0 8380 3380 ? Ss 19:05 0:00 dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdfan0 --listen-address=240.143.0.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/lib/lxd/networks/lxdfan0/dnsmasq.leases --dhcp-hostsfile=/var/lib/lxd/networks/lxdfan0/dnsmasq.hosts --dhcp-range 240.143.0.2,240.143.0.254,1h -s lxd --interface-name _gateway.lxd,lxdfan0 -S /lxd/240.143.0.1#1053 --rev-server=240.0.0.0/8,240.143.0.1#1053 --conf-file=/var/lib/lxd/networks/lxdfan0/dnsmasq.raw -u lxd -g lxd
Host 2:
lxd 4158 6.0 0.0 8380 3504 ? Ss 19:33 7:39 dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdfan0 --listen-address=240.145.0.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/lib/lxd/networks/lxdfan0/dnsmasq.leases --dhcp-hostsfile=/var/lib/lxd/networks/lxdfan0/dnsmasq.hosts --dhcp-range 240.145.0.2,240.145.0.254,1h -s lxd --interface-name _gateway.lxd,lxdfan0 -S /lxd/240.145.0.1#1053 --rev-server=240.0.0.0/8,240.145.0.1#1053 --conf-file=/var/lib/lxd/networks/lxdfan0/dnsmasq.raw -u lxd -g lxd
lxc network list
+---------+----------+---------+------+------+-------------+---------+---------+
| NAME | TYPE | MANAGED | IPV4 | IPV6 | DESCRIPTION | USED BY | STATE |
+---------+----------+---------+------+------+-------------+---------+---------+
| eth0 | physical | NO | | | | 0 | |
+---------+----------+---------+------+------+-------------+---------+---------+
| lxdbr0 | bridge | NO | | | | 0 | |
+---------+----------+---------+------+------+-------------+---------+---------+
| lxdfan0 | bridge | YES | | | | 3 | CREATED |
+---------+----------+---------+------+------+-------------+---------+---------+
Network configuration on Host 1:
ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.143 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 fe80::849:7dff:fe6b:179 prefixlen 64 scopeid 0x20<link>
inet6 fd19:3eb0:a73d:0:849:7dff:fe6b:179 prefixlen 64 scopeid 0x0<global>
ether 0a:49:7d:6b:01:79 txqueuelen 1000 (Ethernet)
RX packets 898897 bytes 832813904 (832.8 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 640852 bytes 415472680 (415.4 MB)
TX errors 2 dropped 0 overruns 2 carrier 0 collisions 0
device interrupt 35
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 1243783 bytes 96227732 (96.2 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1243783 bytes 96227732 (96.2 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lxdbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 4a:16:d8:f4:b8:62 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lxdfan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 240.143.0.1 netmask 255.0.0.0 broadcast 0.0.0.0
inet6 fe80::216:3eff:fed1:a409 prefixlen 64 scopeid 0x20<link>
ether 00:16:3e:d1:a4:09 txqueuelen 1000 (Ethernet)
RX packets 275 bytes 26808 (26.8 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 206 bytes 61724 (61.7 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lxdfan0-fan: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet6 fe80::acd2:e5ff:fefd:9830 prefixlen 64 scopeid 0x20<link>
ether ae:d2:e5:fd:98:30 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 150 overruns 0 carrier 0 collisions 0
lxdfan0-mtu: flags=195<UP,BROADCAST,RUNNING,NOARP> mtu 1450
inet6 fe80::4472:d2ff:fef0:eb73 prefixlen 64 scopeid 0x20<link>
ether 46:72:d2:f0:eb:73 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 146 bytes 10296 (10.2 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
vethbbdbe0d8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
ether 66:b9:49:48:bc:02 txqueuelen 1000 (Ethernet)
RX packets 56 bytes 5976 (5.9 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 48 bytes 12038 (12.0 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lxc network show lxdfan0
config:
  bridge.mode: fan
  fan.underlay_subnet: 192.168.1.0/24
  ipv4.nat: "true"
description: ""
name: lxdfan0
type: bridge
used_by:
- /1.0/instances/c3
- /1.0/instances/c5
- /1.0/profiles/default
managed: true
status: Created
locations:
- pc3
- pc5
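For reference, the addresses above are consistent with the usual fan mapping, where the host bits of the underlay address are placed right after the overlay prefix. The sketch below works that out; it assumes the overlay is 240.0.0.0/8 (which matches the 255.0.0.0 netmask on lxdfan0), and the function name `fan_subnet` is made up for illustration. The address 192.168.1.145 for pc5 is also an assumption, inferred from its 240.145.x.x range, since only pc3's eth0 is shown.

```python
import ipaddress


def fan_subnet(host_ip, underlay="192.168.1.0/24", overlay="240.0.0.0/8"):
    """Compute the per-host fan subnet from the host's underlay address.

    Assumes the overlay is 240.0.0.0/8; the underlay default is the
    fan.underlay_subnet value from the network config above.
    """
    under = ipaddress.ip_network(underlay)
    over = ipaddress.ip_network(overlay)
    host = ipaddress.ip_address(host_ip)
    if host not in under:
        raise ValueError(f"{host_ip} is not inside underlay {underlay}")

    host_bits = 32 - under.prefixlen                 # 8 bits for a /24 underlay
    host_part = int(host) & ((1 << host_bits) - 1)   # e.g. 143 for 192.168.1.143
    new_prefix = over.prefixlen + host_bits          # 8 + 8 = 16
    if new_prefix > 32:
        raise ValueError("underlay host bits do not fit inside the overlay")

    base = int(over.network_address) | (host_part << (32 - new_prefix))
    return ipaddress.IPv4Network((base, new_prefix))


pc3 = fan_subnet("192.168.1.143")   # eth0 address of Host 1 shown above
pc5 = fan_subnet("192.168.1.145")   # assumed address of Host 2 (not shown in the post)

print(pc3)  # 240.143.0.0/16
print(pc5)  # 240.145.0.0/16
print(ipaddress.ip_address("240.143.0.107") in pc3)  # True: c3 belongs to pc3's range
```

So traffic from c5 (240.145.0.26) to c3 (240.143.0.107) has to leave pc5 through the fan tunnel towards 192.168.1.143 rather than being resolved locally on the bridge.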
lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    network: lxdfan0
    type: nic
  root:
    path: /
    pool: local
    type: disk
name: default
used_by:
- /1.0/instances/c3
- /1.0/instances/c5
I’ve also configured each host to use dnsmasq on lxdfan0 for DNS (adjusting the IP address to that host’s lxdfan0 address). I’ve been trying to debug this, but no luck so far. Does anyone know what this could be? Thank you!