I have been unable to get the network working.
Following all the guides, the external IP address is pingable from inside the host machine, but not from the internet.
I have made sure the container has the MAC address that is assigned to the failover IP from OVH I am trying to use.
It is configured in netplan and in the LXD config.
If I swap the host machine’s config to place the IPs on lxdbr0, the external IP to the VM works!
But then the host machine has no external IPs.
Nothing I try has worked.
I have also tried putting the network on a different bridge. The bridge itself works fine, but it results in the same issues as putting the network on the main NIC.
To replicate:
Launch an 18.04 server
Configure the HOST network for 1 failover IP on the default NIC
Configure the HOST network for 1 main IP on the default NIC
Launch a VM with LXD
Assign a MAC address to the Failover IP you wish to use with OVH
Configure the VM network for 1 failover IP on the default NIC
Try to ping the Failover IP from the internet and get no response
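As a rough sketch of the "Configure the VM network" step above, a netplan file inside the VM for a single failover IP could look like the following. The interface name eth0 matches the ARP output further down in this thread; both addresses are documentation placeholders, and the on-link default route is an assumption about the usual /32 failover layout (the gateway sits outside the VM's own subnet), not the exact config that was used here.

```yaml
# Hypothetical VM-side netplan file; all addresses are placeholders.
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses:
        - 198.51.100.42/32      # placeholder for the failover IP
      routes:
        - to: 0.0.0.0/0
          via: 203.0.113.254    # placeholder for the provider gateway
          on-link: true         # gateway is not inside the /32, so mark it directly reachable
```

The MAC address of eth0 still has to be the one registered for the failover IP, otherwise replies may not be accepted upstream.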
Through a lot of painful debugging, I’ve managed to get it all working… Temporarily.
If I follow the configuration of setting lxdbr0 up in netplan, and then leave it managed in LXD…
My VMs have internet access and their external IPs work.
The VMs can talk to the host machine.
The host machine has no network.
If I log into the host machine (by going through the VM, or via physical access) and then run netplan apply with the original config, the network on both the host and the VM works as expected.
So:
Start host machine with lxdbr0 configured in netplan
Have netplan bring the network up
LXD starts and modifies the lxdbr0 bridge
Log in to the machine and run netplan apply again
This overwrites part of LXD’s modifications to lxdbr0 but leaves others
Whenever anything changes, the network randomly dies again.
I’m unsure how to make this persistent, but I’m fairly certain it has to do with configuration in LXD.
I have roughly the same netplan config as you and it’s working very well for me, though I add the following to my bridge:
    parameters:
      forward-delay: 0
      stp: false
I didn’t want spanning tree running on it, so that one was clear to me. I don’t recall why I set forward-delay; it was likely just a setting I saw somewhere, and it works, so I don’t touch it.
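For context, here is a minimal sketch of where that parameters block sits in a host-side netplan file. Only the parameters block comes from my actual config. The interface names enp1s0 and lxdbr0 are taken from this thread, the addresses are documentation placeholders, and the overall layout (physical NIC enslaved to the bridge, addresses on the bridge) is an assumption about your setup.

```yaml
# Hypothetical host-side netplan file; addresses are placeholders.
network:
  version: 2
  ethernets:
    enp1s0:
      dhcp4: false              # physical NIC carries no address itself
  bridges:
    lxdbr0:
      interfaces: [enp1s0]      # enslave the NIC to the bridge
      addresses:
        - 203.0.113.10/24       # placeholder for the host's main IP
      gateway4: 203.0.113.254   # placeholder gateway
      parameters:
        forward-delay: 0        # start forwarding immediately
        stp: false              # no spanning tree on this bridge
```

With a layout like this the bridge is owned by netplan, which is why having LXD manage the same bridge can end up fighting over its addresses.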
One question I would have is whether LXD thinks it’s supposed to manage this bridge. If you have it in netplan, I think you want it unmanaged and just let the system do it. What does lxc network show lxdbr0 say?
When I configure lxdbr0 to be unmanaged by LXD, the network just doesn’t work at all for the VMs, though it works as expected for the host.
Obviously, when it’s managed, LXD deletes all my address records and continues on merrily, making the VMs work, but not the host.
Weird, on all 4 of my servers I leave it unmanaged. Some use DHCP, others use static addresses in the containers, and they all can get out. Now, we don’t have the number of IPs you do with failovers, at least not on those hosts, so I guess that could be part of it.
How are the container networks defined? Via netplan as well or something else?
Usually the machines can get out, that’s not a problem. The issue is connections coming in. The external failover IP I am assigning, XXX.42, can’t be pinged from outside the machine.
I’ve tried simply using cloud-init, I’ve tried OVH’s via 0.0.0.0, and I’ve tried the standard netplan config for an IP.
I’m completely out of ideas, other than: have you looked at the ARP tables outside the machine (and maybe on the host), and used tcpdump to see whether pings are at least making it to the host or are not even getting there?
I’m not really sure how to check the ARP tables outside the machine, but…
~$ tcpdump host 198.50.XXX.42 and port not 22 -n -s 0 -vvv
tcpdump: listening on enp1s0, link-type EN10MB (Ethernet), capture size 262144 bytes
21:28:22.991354 IP (tos 0x0, ttl 119, id 55667, offset 0, flags [none], proto ICMP (1), length 60)
66.46.XXX.34 > 198.50.XXX.42: ICMP echo request, id 62464, seq 55827, length 40
I am getting the pings passed to the host, but they don’t go any further than that.
ARP in the VM:
arp -a
? (66.70.XXX.254) at <incomplete> on eth0
ARP in the Host:
arp -a
? (66.70.XXX.254) at 00:f1:04:06:ff:ff [ether] on lxdbr0
? (66.70.XXX.252) at 18:8b:9d:e6:65:75 [ether] on lxdbr0
? (66.70.XXX.253) at cc:46:d6:64:75:fb [ether] on lxdbr0