Network problem - suddenly no traffic between container(s) and host (and, as a result, no DHCP)

And are those addresses in the containers added manually by yourself, or via DHCP?

I’m trying to understand the problem clearly, as earlier you mentioned there were no IPs, and then later you mentioned that you could ping between containers (the two scenarios being mutually exclusive).

Manually, of course (so that I could try to send packets/ping/etc.).
Here is a fresh container:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
37: eth0@if38: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:15:50:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:fe15:50c6/64 scope link 
       valid_lft forever preferred_lft forever

And ip r s (the routing table) is empty.
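(For comparison, a sketch of what ip r s would normally show in a container that got its address via DHCP from lxdbr0; the exact fields here are assumptions based on the subnet seen later in this thread:

    default via 192.168.250.1 dev eth0 proto dhcp src 192.168.250.120
    192.168.250.0/24 dev eth0 proto kernel scope link src 192.168.250.120

An empty table means there is no route at all, not even the on-link one.)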

I see, thanks. It wasn’t clear to me.

Can you try running sudo tcpdump -i lxdbr0 -nn on the host and then try pinging the lxdbr0 IP from one of the containers that has the manually added IPs, and see what you get on the host-side from tcpdump?

Please can I see the output of sudo iptables-save as well.

From the host (with ping 192.168.250.1 running inside both containers):

15:32:19.673832 ARP, Request who-has 192.168.250.1 tell 192.168.250.120, length 28
15:32:20.692403 ARP, Request who-has 192.168.250.1 tell 192.168.250.130, length 28

(Lots of ARP requests, no responses!)

The same is visible inside the container(s) on eth0.

I tried adding a static ARP entry inside the container:
> arp -s 192.168.250.1 00:16:3e:35:ad:d5 # this is MAC address of lxdbr0

After that, tcpdump shows ICMP echo requests (but no responses), both on the host and inside the container:

13:36:43.130122 IP 192.168.250.120 > 192.168.250.1: ICMP echo request, id 19546, seq 30, length 64
13:36:44.154127 IP 192.168.250.120 > 192.168.250.1: ICMP echo request, id 19546, seq 31, length 64
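One way to cross-check the host’s view at this point is to inspect the bridge’s neighbour and forwarding tables; if the container MACs never show up there, the frames are not reaching the bridge at all. For example:

    ip neigh show dev lxdbr0
    bridge fdb show br lxdbr0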

And here is the iptables output:

iptables-save 
# Generated by iptables-save v1.8.4 on Thu May 20 15:40:21 2021
*raw
:PREROUTING ACCEPT [714102:674799576]
:OUTPUT ACCEPT [249872:21154928]
COMMIT
# Completed on Thu May 20 15:40:21 2021
# Generated by iptables-save v1.8.4 on Thu May 20 15:40:21 2021
*mangle
:PREROUTING ACCEPT [4593:2510428]
:INPUT ACCEPT [4593:2510428]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [4391:460657]
:POSTROUTING ACCEPT [4410:462492]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
-A POSTROUTING -o lxdbr0 -p udp -m udp --dport 68 -m comment --comment "generated for LXD network lxdbr0" -j CHECKSUM --checksum-fill
COMMIT
# Completed on Thu May 20 15:40:21 2021
# Generated by iptables-save v1.8.4 on Thu May 20 15:40:21 2021
*nat
:PREROUTING ACCEPT [179:80974]
:INPUT ACCEPT [179:80974]
:OUTPUT ACCEPT [568:47177]
:POSTROUTING ACCEPT [546:42982]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
-A POSTROUTING -s 192.168.250.0/24 ! -d 192.168.250.0/24 -m comment --comment "generated for LXD network lxdbr0" -j MASQUERADE
COMMIT
# Completed on Thu May 20 15:40:21 2021
# Generated by iptables-save v1.8.4 on Thu May 20 15:40:21 2021
*filter
:INPUT ACCEPT [4593:2510428]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [4391:460657]
:LIBVIRT_FWI - [0:0]
:LIBVIRT_FWO - [0:0]
:LIBVIRT_FWX - [0:0]
:LIBVIRT_INP - [0:0]
:LIBVIRT_OUT - [0:0]
-A INPUT -j LIBVIRT_INP
-A INPUT -i lxdbr0 -p tcp -m tcp --dport 53 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A INPUT -i lxdbr0 -p udp -m udp --dport 53 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A INPUT -i lxdbr0 -p udp -m udp --dport 67 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A FORWARD -o lxdbr0 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A FORWARD -i lxdbr0 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A OUTPUT -j LIBVIRT_OUT
-A OUTPUT -o lxdbr0 -p tcp -m tcp --sport 53 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A OUTPUT -o lxdbr0 -p udp -m udp --sport 53 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A OUTPUT -o lxdbr0 -p udp -m udp --sport 67 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
COMMIT
# Completed on Thu May 20 15:40:21 2021

Are you running lldpd on your host by any chance? We saw something similar in this thread recently.
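(A quick way to check, e.g.:

    pgrep -a lldpd
    systemctl status lldpd

No output from pgrep means the daemon is not running.)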

No lldpd here (or anything similar).
From a network perspective it is standard vanilla Ubuntu 20.04.

Can I get a login to the box?

(I’ve sent the info in a PM.)

Ah, sorry I missed you; I was working on something else.

I was thinking more of a remote console (like SSH; some people use TeamViewer).

TeamViewer works! Just PM me when you’re online.

Still unresolved. Any other ideas on how to investigate/fix this (other than an lxd restart…)?

I just spotted your problem: your external Ethernet device’s subnet is too large and overlaps with your lxdbr0 subnet.

Hrm, actually it isn’t, sorry. But I’d be interested to know whether the problem goes away if you change your lxdbr0 subnet.

I’ve seen issues in the past where the routing table causes response packets from lxdbr0 to be sent out of a different interface.
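A quick way to test that theory, using the container address from the captures above:

    ip route get 192.168.250.120

On a healthy host this should report dev lxdbr0; if another interface shows up, replies are being misrouted.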

Can you show output of:

 bridge link show

It is magically back to normal (as always, after an lxd service restart forced by the OS) :frowning: :frowning:
Waiting for this to happen again (and it will).

(Posting the output of this command in the ‘working’ state anyway, but AFAIR it was the same.)

# bridge link show
46: veth7d9131f5@if45: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master lxdbr0 state forwarding priority 32 cost 2 
48: vethe9f19c50@if47: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master lxdbr0 state forwarding priority 32 cost 2 

Next time it happens, also check the journalctl output to see if anything has happened that may be linked to it (interfaces appearing or disappearing, for instance).
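For example (a sketch; adjust the time window as needed):

    journalctl --since "1 hour ago" | grep -Ei 'lxdbr0|veth'

or watch link events live with:

    ip monitor link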

Managed to reproduce it. This happens ALWAYS and ONLY after snap updates the lxd package.

Now investigating to see if there is any log trace or a way to fix it.
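If it really is the snap refresh, its timestamps should line up with the breakage; snap keeps a record of recent operations:

    snap changes lxd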

Does systemctl reload snap.lxd.daemon trigger it?
That’s pretty much all that happens during a refresh.
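For reference, the two commands being compared (note: unlike reload, a restart may also stop and start the instances themselves, so reload is the gentler test):

    sudo systemctl reload snap.lxd.daemon
    sudo systemctl restart snap.lxd.daemon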

Hi, thank you very much. I just had this issue today and wasted a couple of hours, but finally got it working thanks to you.

@stgraber: systemctl reload ... did not help, but out of desperation I tried systemctl restart ... and that worked! Thank you too for all your work on lxd; I use it every day for all kinds of things.

snap deserves to die in a fire; it has caused me many wasted hours since I met it by surprise years ago. I would remove it and reinstall lxd without it if that’s possible, but I’m afraid of it destroying the containers on this machine, so I’ll just try to remember next time I set up a host.

This is most likely a conflict between the ordering of existing firewall rules on the system and the ones that LXD sets up on restart. This commonly happens if Docker is installed on the same host.

See How to configure your firewall - LXD documentation
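For the Docker case specifically, a commonly suggested workaround (a sketch only; the linked page covers the details, and it assumes Docker’s DOCKER-USER chain exists and lxdbr0 is the LXD bridge) is to explicitly allow the LXD bridge there:

    iptables -I DOCKER-USER -i lxdbr0 -j ACCEPT
    iptables -I DOCKER-USER -o lxdbr0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT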