lxdbr0 bridge pingable from a different subnet, but the VM is not

nordex1 · June 13, 2021, 11:24am

I have a setup with two subnets on two separate physical machines.

192.168.1.0/24 : LXD host with the VM

root@brix:/# lxc list
+-------------+---------+------------------------+------+-----------------+-----------+
|    NAME     |  STATE  |          IPV4          | IPV6 |      TYPE       | SNAPSHOTS |
+-------------+---------+------------------------+------+-----------------+-----------+
| kube-master | RUNNING | 192.168.1.200 (enp5s0) |      | VIRTUAL-MACHINE | 0         |
+-------------+---------+------------------------+------+-----------------+-----------+


root@brix:/# lxc network show lxdbr0
config:
  ipv4.address: 192.168.1.199/28
  ipv4.nat: "true"
  ipv6.address: none
  ipv6.nat: "true"
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/instances/kube-master
managed: true
status: Created
locations:
- none
root@brix:/#

The VM is able to access the Internet and to ping the whole 192.168.2.0/24 subnet, the bridge, DNS, the host, etc.
The LXD host is also able to do the same thing.

The VM Netplan setup:

# /etc/netplan/10-lxc.yaml
network:
    ethernets:
        enp5s0:
            dhcp4: false
            addresses: [192.168.1.200/24]
            gateway4: 192.168.1.199
            nameservers:
              addresses: [8.8.8.8,8.8.4.4]
    version: 2

192.168.2.0/24 : another physical machine:
Able to ping the LXD host, the lxdbr0 bridge, but not the VM.

Commands I used to set up LXD networking:

lxc network create lxdbr0
lxc network set lxdbr0 ipv4.address 192.168.1.199/28
lxc network attach lxdbr0 kube-master enp5s0 enp5s0

What piece of puzzle am I missing here in order to be able to access the VM?

tomp · June 14, 2021, 7:45am

Please can you show output of ip a and ip r from both the LXD hosts and containers involved.

Please also show the full ping command (including the host name where you’re running it from) and its output. Thanks

nordex1 · June 14, 2021, 8:28am

Hi Thomas

LXD Host:

root@brix:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether b4:2e:99:da:49:49 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.41/24 brd 192.168.1.255 scope global enp2s0
       valid_lft forever preferred_lft forever
    inet6 fe80::b62e:99ff:feda:4949/64 scope link
       valid_lft forever preferred_lft forever
3: wlp1s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 80:32:53:f4:a4:52 brd ff:ff:ff:ff:ff:ff
4: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:e3:f9:37 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.199/28 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fee3:f937/64 scope link
       valid_lft forever preferred_lft forever
5: tap6b232481: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master lxdbr0 state UP group default qlen 1000
    link/ether 2e:55:46:e9:1e:b8 brd ff:ff:ff:ff:ff:ff
6: mac9bb4871f@enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 500
    link/ether 00:16:3e:f7:70:45 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::216:3eff:fef7:7045/64 scope link
       valid_lft forever preferred_lft forever


root@brix:/# ip r
default via 192.168.1.40 dev enp2s0 proto static
192.168.1.0/24 dev enp2s0 proto kernel scope link src 192.168.1.41
192.168.1.192/28 dev lxdbr0 proto kernel scope link src 192.168.1.199
root@brix:/#


root@brix:/# ping 192.168.1.199
PING 192.168.1.199 (192.168.1.199) 56(84) bytes of data.
64 bytes from 192.168.1.199: icmp_seq=1 ttl=64 time=0.091 ms
^C
--- 192.168.1.199 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.091/0.091/0.091/0.000 ms
root@brix:/# ping 192.168.1.200
PING 192.168.1.200 (192.168.1.200) 56(84) bytes of data.
64 bytes from 192.168.1.200: icmp_seq=1 ttl=64 time=0.443 ms
64 bytes from 192.168.1.200: icmp_seq=2 ttl=64 time=0.470 ms
^C
--- 192.168.1.200 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1027ms
rtt min/avg/max/mdev = 0.443/0.456/0.470/0.013 ms
root@brix:/#

LXD VM:

root@kube-master:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:16:3e:77:e1:ed brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.200/28 brd 192.168.1.207 scope global enp5s0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe77:e1ed/64 scope link
       valid_lft forever preferred_lft forever
3: enp6s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:16:3e:f7:70:45 brd ff:ff:ff:ff:ff:ff
root@kube-master:~# ip r
default via 192.168.1.199 dev enp5s0 proto static
192.168.1.192/28 dev enp5s0 proto kernel scope link src 192.168.1.200
root@kube-master:~#


root@kube-master:~# ping 192.168.2.38
PING 192.168.2.38 (192.168.2.38) 56(84) bytes of data.
64 bytes from 192.168.2.38: icmp_seq=1 ttl=62 time=0.461 ms
64 bytes from 192.168.2.38: icmp_seq=2 ttl=62 time=0.608 ms
^C
--- 192.168.2.38 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.461/0.534/0.608/0.073 ms
root@kube-master:~#

Computer in another subnet:

space@space-desktop ~ $ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether bc:5f:f4:fb:16:89 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.38/24 brd 192.168.2.255 scope global dynamic noprefixroute enp5s0
       valid_lft 75661sec preferred_lft 75661sec
    inet6 fe80::f713:185:9572:aaa4/64 scope link noprefixroute
       valid_lft forever preferred_lft forever


space@space-desktop ~ $ ip r
default via 192.168.2.1 dev enp5s0 proto dhcp metric 100
169.254.0.0/16 dev enp5s0 scope link metric 1000
192.168.2.0/24 dev enp5s0 proto kernel scope link src 192.168.2.38 metric 100


space@space-desktop ~ $ ping 192.168.1.199
PING 192.168.1.199 (192.168.1.199) 56(84) bytes of data.
64 bytes from 192.168.1.199: icmp_seq=1 ttl=63 time=0.601 ms
64 bytes from 192.168.1.199: icmp_seq=2 ttl=63 time=0.516 ms
^C
--- 192.168.1.199 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1021ms
rtt min/avg/max/mdev = 0.516/0.558/0.601/0.042 ms


space@space-desktop ~ $ ping 192.168.1.200
PING 192.168.1.200 (192.168.1.200) 56(84) bytes of data.
From 192.168.2.1 icmp_seq=1 Destination Host Unreachable
From 192.168.2.1 icmp_seq=2 Destination Host Unreachable
From 192.168.2.1 icmp_seq=3 Destination Host Unreachable
^C
--- 192.168.1.200 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4063ms
pipe 3

From here you can see that only the ping from the other subnet (192.168.2.38) to the VM (192.168.1.200) does not work, while the VM can ping the computer in the other subnet.

tomp · June 14, 2021, 9:27am

OK thanks for that.

So what you’ve not explained so far is how a ping from a computer in the 192.168.2.0/24 subnet can make it to the LXD host in the 192.168.1.0/24 subnet. But seeing that it’s working for pings between the computer and the LXD host, I’m going to assume you have a router in between those two subnets with routes between both of them.

If this is not the case please can you confirm.

However a few things to point out here:

1. Your LXD network lxdbr0 has ipv4.nat=true on it, which means outbound packets from the bridge going to the LAN (192.168.1.0/24) will be source NATted to the LXD host’s IP on that network (192.168.1.41). That explains why you can ping 192.168.2.0/24 from your VM: it is in effect no different from pinging it from the LXD host itself (with the same caveat as above, that I’m assuming you have a router doing the actual routing between subnets). If you did lxc network set lxdbr0 ipv4.nat=false then I suspect your VM would stop being able to reach the 192.168.2.0/24 subnet (for the same reason that hosts in the 192.168.2.0/24 subnet cannot ping your VM; see point 3 below).
2. Your LXD network lxdbr0 has a subnet, 192.168.1.192/28, that is a subset of the host’s 192.168.1.0/24 subnet, and the bridge interface on the host has an IP address of 192.168.1.199. When a device on the LXD host’s external LAN (including the router that is routing packets between subnets) tries to communicate with an IP in the 192.168.1.0/24 subnet, it will send a broadcast ARP request to all devices on the network, asking the owner of the target IP address to identify itself and provide its MAC address for L2 frames. Linux by default will respond to ARP requests for any IP it has locally assigned, even if the request comes in on an interface that doesn’t have that IP assigned (which is the case in this scenario, because the ARP request comes in on enp2s0 but is targeted at the IP on lxdbr0). So when you ping the lxdbr0 address 192.168.1.199 from your PC in 192.168.2.0/24, the router in between the subnets will send an ARP broadcast request asking for the MAC address of the owner of the IP 192.168.1.199, and because that IP is assigned locally on the LXD host to the lxdbr0 interface, the LXD host will respond with the MAC address of enp2s0, claiming it is responsible for that IP. Packets for 192.168.1.199 then flow to your LXD host and communication between subnets is achieved.
3. The problem comes when trying to ping the VM on 192.168.1.200 from the PC in subnet 192.168.2.0/24. It works from the LXD host itself because the LXD host has the lxdbr0 interface and the local route created by that interface (192.168.1.192/28 dev lxdbr0), which instructs it to send ARP requests and packets for 192.168.1.192/28 to lxdbr0. However, when pinging from the PC in the other subnet, the intermediate router will send an ARP broadcast request for 192.168.1.200, same as before, except this time the LXD host will not respond to the ARP request, because it does not have the IP bound locally (it’s inside the VM on the ‘other end’ of the lxdbr0 bridge and the host knows nothing about it from a networking perspective). This is why it doesn’t work.
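
The addressing described above can be checked with a few lines of Python. This is only a rough sketch: the addresses are copied from the outputs earlier in this thread, and host_answers_arp is a hypothetical, deliberately simplified model of the default Linux ARP behaviour (the host replies for any IPv4 address assigned to one of its own interfaces), not what the kernel actually runs.

```python
import ipaddress

# Addresses copied from the outputs earlier in this thread.
lan = ipaddress.ip_network("192.168.1.0/24")            # LXD host's LAN (enp2s0)
bridge_if = ipaddress.ip_interface("192.168.1.199/28")  # lxdbr0 on the host
vm_ip = ipaddress.ip_address("192.168.1.200")           # enp5s0 inside the VM

# The bridge subnet is carved out of the LAN's own address range.
print(bridge_if.network)                 # 192.168.1.192/28
print(bridge_if.network.subnet_of(lan))  # True
print(vm_ip in bridge_if.network)        # True

# Simplified model (an assumption, not kernel code): with default settings the
# host answers ARP for any IPv4 address assigned to one of its own interfaces.
host_local_ips = {
    ipaddress.ip_address("192.168.1.41"),  # enp2s0
    bridge_if.ip,                          # lxdbr0
}


def host_answers_arp(target):
    """Return True if the host would reply to an ARP request for target."""
    return ipaddress.ip_address(target) in host_local_ips


for target in ("192.168.1.41", "192.168.1.199", "192.168.1.200"):
    print(target, host_answers_arp(target))
```

In this model only 192.168.1.200 gets no reply, which matches the Destination Host Unreachable that the router reports for the VM.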

If you are just trying to get your VM instance to join the LXD host’s LAN (and use its DNS and DHCP services) then I suggest you dispense with lxdbr0 entirely and create a new unmanaged bridge called something like br0 and then attach your LXD host’s enp2s0 interface to it and move the LXD host’s IPs to the br0 interface.

Then you can just connect your instances to the external LAN using:

lxc config device add <instance> eth0 nic nictype=bridged parent=br0

See the Netplan documentation (Netplan | Backend-agnostic network configuration in YAML) for how to set up br0.

nordex1 · June 19, 2021, 9:05am

From a technical perspective, this is one of the best replies I have ever seen, and I have been asking (and answering) questions online since 1999. Big thanks @tomp !

Sorry for the delay. Because of the error produced by your command I questioned my whole setup, and after a couple more misconfigurations of my own I considered reinstalling the OS from scratch. Luckily, it did not come to that.

root@brix:/# lxc config device add kube-master eth0 nic nictype=bridge parent=br0
Error: Invalid devices: Device validation failed for "eth0": Failed loading device "eth0": Unsupported device type

This error is somewhat misleading: it says there is a problem with eth0, and since eth0 cannot be found anywhere on the system I had a hard time figuring out what was going on.

Two days later I saw that the problem was a single letter missing from the word “bridge”, so the correct command is:

lxc config device add <instance> eth0 nic nictype=bridged parent=br0

After setting up the bridge in Netplan on the LXD host and applying this command, I was able to ping the VMs and the bridge from any subnet in the network.

I have only one question in regards to this command that is confusing me. Why is eth0 mentioned when afterwards there is no eth0 anywhere in the system, both host and VM?

tomp · June 19, 2021, 5:47pm

Thanks! Glad you got it working. Sorry about the typo. I’ve fixed it in the original post now. I think we can probably improve the error message to make clear that, in this case, ‘bridge’ was not recognized as a device type.

To answer your question about the eth0 device name and its lack of relation to the actual interface name:

The device name can be anything you like, e.g. ‘nic1’ or ‘mynetworkcard’. This is just the internal reference LXD uses and doesn’t influence the actual interface name. This is the same for all device types; for example, a disk device called ‘root’ vs ‘myrootdisk’ won’t change where the disk is mounted in the instance.

I just used ‘eth0’ as a convention/habit to indicate the first NIC in the instance.

For containers, NICs can also have a separate ‘name’ property, which does actually set the interface name inside the container. This is auto-generated if not specified.

E.g.

lxc config device add c1 mynicdev nic nictype=bridged name=eth3 parent=br0

For VMs, because the NIC devices are PCI devices, we cannot control the interface name that the guest OS gives them, so the name property doesn’t take effect.

However, for VMs the device name (not the name property) can subtly influence the interface name the guest assigns, because devices are added to the VM’s PCI bus in device name order. This way the first NIC device normally gets called enp5s0 inside the guest, the second one enp6s0, etc.
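
As a rough illustration of that ordering, here is a small Python sketch. Everything in it is hypothetical: the device names are made up, plain alphabetical sorting is only an assumption about how "device name order" is decided, and the slot numbering simply extrapolates the enp5s0, enp6s0 pattern. The guest OS has the final say on naming.

```python
# Hypothetical LXD NIC device names on one VM, in the order they were added.
nic_devices = ["uplink", "eth0", "mgmt"]

# Assumption taken from "the first NIC device normally gets called enp5s0".
FIRST_NIC_SLOT = 5


def predicted_guest_names(devices):
    """Map each device name to the interface name the guest would likely use."""
    # Assumption: devices join the PCI bus in plain alphabetical name order.
    ordered = sorted(devices)
    return {dev: f"enp{FIRST_NIC_SLOT + i}s0" for i, dev in enumerate(ordered)}


print(predicted_guest_names(nic_devices))
# {'eth0': 'enp5s0', 'mgmt': 'enp6s0', 'uplink': 'enp7s0'}
```

So under these assumptions, renaming a device so that it sorts differently could change which interface name it ends up with in the guest, even though the name itself is never visible there.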