Cluster upgraded automatically to 4.4 a few minutes ago, and now all my containers have no IPs

Let me give a container a fixed IP… Give me a minute.

Great ok, so we’ve narrowed down the issue to something blocking DHCP packets from reaching dnsmasq (or responses returning). The bridge is working and dnsmasq is running.

So next up, let’s try this on the host:

ss -ulpn | grep dnsmasq

And also, can you run tcpdump -pvnl -i lxdfan0 port 67 and port 68 on the host and then stop/start one of the affected containers. This will show if DHCP requests are making it from the container to the host.

Can you run the full command please:

 tcpdump -pvnl -i lxdfan0 port 67 and port 68

Oh maybe your copy/paste just cut a bit off.

Can you show output of iptables-save please.

I’d expect a different output than that, have you truncated it at all?

Only a few bad IP denies at the top.

Turning off UFW seems to solve the problem, which is weird since this has been working forever. It was only affected after 4.4. What ports should dnsmasq be using?

Thanks for your help looking at the problem. It seems to be related to UFW blocking dnsmasq; I will continue to figure out what is wrong. It is interesting that the problem started after the 4.4 upgrade across all servers.

That’s odd indeed as we’ve not changed anything in the firewalling code or related to network ports in 4.4.

It could be a race of some kind though, where the LXD rules were initially added ahead of the ufw rules. Restarting LXD during the upgrade could then have placed the rules after those managed by UFW, causing the issue.
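One way to check the ordering theory would be to look at where the ufw jumps and the LXD rules sit relative to each other in the INPUT chain. This is only a rough sketch: the function name is made up for illustration, and it assumes LXD tags its rules with a comment containing "generated for LXD" and that ufw's chains are named "ufw-*".

```shell
# show_rule_order (hypothetical helper): reads `iptables -S INPUT` output
# on stdin and prints only the ufw and LXD rules, each prefixed with its
# line number in that listing.
# Assumption: LXD rules carry a "generated for LXD" comment and ufw
# chains are named "ufw-*".
show_rule_order() {
    grep -n -E 'ufw-|generated for LXD'
}

# Usage on the host (needs root):
#   iptables -S INPUT | show_rule_order
```

If the ufw lines come out with lower numbers than the LXD ones, ufw sees the packets first and may drop them before the LXD ACCEPT rules are reached.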

I’m glad it’s not an issue in the piece of work we did do this cycle though (securing dnsmasq by using apparmor for it).

Who knows sometimes… Besides port 53… what other port should be open for dnsmasq or apparmor?

Can you show the output of lxc info? It would be interesting to see which firewall LXD detected (and used) when it started up, as I would expect to see some firewall rules that LXD adds to explicitly allow DHCP and DNS.

Perhaps they are being added to nftables rather than iptables, or perhaps UFW has replaced them with its own ruleset.
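A quick way to pull just that detail out of the lxc info output might look like this. It is a sketch only: the function name is made up, and it assumes your LXD version reports a "firewall:" line in lxc info.

```shell
# firewall_driver (hypothetical helper): reads `lxc info` output on stdin
# and prints the value of its "firewall:" line.
# Assumption: your LXD version reports a "firewall:" key there.
firewall_driver() {
    awk '$1 == "firewall:" { print $2 }'
}

# Usage on the host:
#   lxc info | firewall_driver
# Then look for LXD's rules in the matching ruleset (both need root):
#   iptables-save | grep -i lxd
#   nft list ruleset | grep -i lxd
```

If the LXD rules show up in neither ruleset, that would point towards something having replaced them.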

You may also get some benefit from some of the approaches to managing LXD snap upgrade times described in “Managing the LXD snap”, to avoid LXD being upgraded at times that are not good for you.

53 (DNS) on UDP/TCP and 67/68 (DHCP) on UDP
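For anyone wanting to keep ufw enabled, those ports could be allowed on the LXD bridge with rules along these lines. This is only a sketch: it prints the ufw commands rather than running them so they can be reviewed first, the function name is made up, and lxdfan0 is assumed from earlier in this thread.

```shell
# print_ufw_rules (hypothetical helper): prints, but does not run, ufw
# commands that allow DHCP and DNS on an LXD bridge.
# Assumption: the bridge is lxdfan0 unless another name is passed in.
print_ufw_rules() {
    iface="${1:-lxdfan0}"
    # DHCP: containers send requests to UDP 67 on the host,
    # and dnsmasq replies to UDP 68 on the container side.
    echo "ufw allow in on ${iface} to any port 67 proto udp"
    echo "ufw allow out on ${iface} to any port 68 proto udp"
    # DNS: dnsmasq answers on port 53 over both UDP and TCP.
    echo "ufw allow in on ${iface} to any port 53 proto udp"
    echo "ufw allow in on ${iface} to any port 53 proto tcp"
}

# Usage: review the output, then run the printed commands as root.
#   print_ufw_rules lxdfan0
```

Limiting the rules to the bridge interface keeps those ports closed on the host's external interfaces.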

ufw is just a wrapper for iptables

Looks like 67/68 might be the problem… but it was working before, so maybe the order changed.
Yes, definitely caused by the firewall blocking ports 67 and 68… But it worked fine before the update. Hope this helps someone.
Thanks everyone for the help.

Thanks again for your help… it needed ports 67 & 68 added to the firewall. So all of a sudden, after the upgrade, it needed them.

We have been running into the same problem (cluster with FAN, no IPs) since upgrading and have been frantically searching for a solution; we can confirm that unblocking udp ports 67 and 68 fixes the issue for us as well. Thank you!

This is quite weird as DHCP has always been on 67/68 UDP, it’s not something that we could even change if we wanted to 🙂

So I’m quite confused as to why things would only get blocked by firewalling now.