Some containers don't get assigned IP addresses

I am running Ubuntu 20.04.3 (kernel 5.4.0-91-generic) with a stock (all-defaults) install of LXD 4.21 via snap.

I noticed that after creating ~5 containers (Ubuntu 20.04), newly launched containers stopped being assigned IPs, and a full reboot of the host was required before a container would receive an IP address. Rebooting worked as a workaround until I created an 11th container; now only 10 containers ever get IPs, and the 11th never receives one.

Here is my config for lxdbr0:

config:
  ipv4.address: 10.177.123.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:7917:fc2a:4349::1/64
  ipv6.nat: "true"
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/profiles/default
- (all of my containers)
managed: true
status: Created
locations:
- none

Configuration for the containers:

architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 20.04 LTS amd64 (release) (20211108)
  image.label: release
  image.os: ubuntu
  image.release: focal
  image.serial: "20211108"
  image.type: squashfs
  image.version: "20.04"
  volatile.base_image: bd2ffb937c95633a28091e6efc42d6c7b1474ad8eea80d6ed8df800e44c6bfdd
  volatile.eth0.host_name: veth6e6694e2
  volatile.eth0.hwaddr: 00:16:3e:59:96:95
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":100000>
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,">
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":100>
  volatile.last_state.power: RUNNING
  volatile.uuid: 2cd96e80-32e7-4173-8cd7-361ab49c72dc
devices: {}
ephemeral: false
profiles:
- default
stateful: false
description: ""

Default profile:

config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
name: default
used_by:
- (all of my containers)

As a temporary workaround, I've created a second network for the containers that don't need to communicate with each other. I still have to reboot for containers to get an IP address, though, so it would be nice to find a proper solution for this.
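
For reference, I created the second bridge roughly like this (lxdbr1 matches the name in the outputs further down; the container name is illustrative):

lxc network create lxdbr1
lxc launch ubuntu:20.04 test-isolated --network lxdbr1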

Does systemctl reload snap.lxd.daemon get IP assignment working again?
If so, that'd suggest a dnsmasq issue. dnsmasq logs to syslog, so there may be something relevant there.
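
For example, its entries can usually be pulled out with either of these (the exact log location is distro-dependent):

sudo grep dnsmasq /var/log/syslog
# or via journald, matching the syslog identifiers dnsmasq uses:
sudo journalctl -t dnsmasq -t dnsmasq-dhcp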

I launched a new container and it did not receive an IP. I ran systemctl reload snap.lxd.daemon and checked again; the container still did not have an IP. All I can see in dmesg about dnsmasq is it loading its AppArmor profiles.

What does sudo ps aux | grep dnsmasq and sudo ss -ulpn show?

sudo ps aux | grep dnsmasq
lxd       165913  0.0  0.0   7204  3696 ?        Ss   03:42   0:01 dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdbr0 --dhcp-rapid-commit --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.177.123.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.177.123.2,10.177.123.254,1h --listen-address=fd42:7917:fc2a:4349::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd --interface-name _gateway.lxd,lxdbr0 -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd -g lxd
lxd       165944  0.0  0.0   7204  3812 ?        Ss   03:42   0:00 dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdbr1 --dhcp-rapid-commit --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.47.86.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr1/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr1/dnsmasq.hosts --dhcp-range 10.47.86.2,10.47.86.254,1h --listen-address=fd42:75bb:8540:81da::1 --enable-ra --dhcp-range ::,constructor:lxdbr1,ra-stateless,ra-names -s lxd --interface-name _gateway.lxd,lxdbr1 -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr1/dnsmasq.raw -u lxd -g lxd
sudo ss -ulpn
State  Recv-Q Send-Q                      Local Address:Port   Peer Address:Port Process
UNCONN 0      0                              10.47.86.1:53          0.0.0.0:*     users:(("dnsmasq",pid=165944,fd=8))
UNCONN 0      0                            10.177.123.1:53          0.0.0.0:*     users:(("dnsmasq",pid=165913,fd=8))
UNCONN 0      0                           127.0.0.53%lo:53          0.0.0.0:*     users:(("systemd-resolve",pid=818,fd=12))
UNCONN 0      0                          0.0.0.0%lxdbr1:67          0.0.0.0:*     users:(("dnsmasq",pid=165944,fd=4))
UNCONN 0      0                          0.0.0.0%lxdbr0:67          0.0.0.0:*     users:(("dnsmasq",pid=165913,fd=4))
UNCONN 0      0                [fd42:75bb:8540:81da::1]:53             [::]:*     users:(("dnsmasq",pid=165944,fd=12))
UNCONN 0      0       [fe80::216:3eff:fedb:2e04]%lxdbr1:53             [::]:*     users:(("dnsmasq",pid=165944,fd=10))
UNCONN 0      0                [fd42:7917:fc2a:4349::1]:53             [::]:*     users:(("dnsmasq",pid=165913,fd=12))
UNCONN 0      0         [fe80::216:3eff:fe2e:fd]%lxdbr0:53             [::]:*     users:(("dnsmasq",pid=165913,fd=10))
UNCONN 0      0                             [::]%lxdbr0:547            [::]:*     users:(("dnsmasq",pid=165913,fd=6))
UNCONN 0      0                             [::]%lxdbr1:547            [::]:*     users:(("dnsmasq",pid=165944,fd=6))

Looks OK. What about your firewall? Are you running one?

Please provide output of sudo iptables-save and sudo nft list ruleset.
Also lxc info | grep firewall:

I shouldn't be; this is a stock Ubuntu install, but Hetzner might have done something funky with it.

iptables-save
# Generated by iptables-save v1.8.4 on Tue Jan  4 14:17:44 2022
*filter
:INPUT ACCEPT [486:125194]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [385:213721]
-A INPUT -s 173.245.48.0/20 -j ACCEPT
-A INPUT -s 103.21.244.0/22 -j ACCEPT
-A INPUT -s 103.22.200.0/22 -j ACCEPT
-A INPUT -s 103.31.4.0/22 -j ACCEPT
-A INPUT -s 141.101.64.0/18 -j ACCEPT
-A INPUT -s 108.162.192.0/18 -j ACCEPT
-A INPUT -s 190.93.240.0/20 -j ACCEPT
-A INPUT -s 188.114.96.0/20 -j ACCEPT
-A INPUT -s 197.234.240.0/22 -j ACCEPT
-A INPUT -s 198.41.128.0/17 -j ACCEPT
-A INPUT -s 162.158.0.0/15 -j ACCEPT
-A INPUT -s 104.16.0.0/13 -j ACCEPT
-A INPUT -s 104.24.0.0/14 -j ACCEPT
-A INPUT -s 172.64.0.0/13 -j ACCEPT
-A INPUT -s 131.0.72.0/22 -j ACCEPT
-A INPUT -s 188.40.87.105/32 -j ACCEPT
-A INPUT -s 10.177.123.0/24 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 80 -j REJECT --reject-with icmp-port-unreachable
-A INPUT -p tcp -m tcp --dport 443 -j REJECT --reject-with icmp-port-unreachable
-A INPUT -p tcp -m tcp --dport 8080 -j REJECT --reject-with icmp-port-unreachable
-A INPUT -p tcp -m tcp --dport 8443 -j REJECT --reject-with icmp-port-unreachable
COMMIT
# Completed on Tue Jan  4 14:17:44 2022
# Generated by iptables-save v1.8.4 on Tue Jan  4 14:17:44 2022
*nat
:PREROUTING ACCEPT [45555:2827644]
:INPUT ACCEPT [45080:2781502]
:OUTPUT ACCEPT [2372:182076]
:POSTROUTING ACCEPT [2372:182076]
COMMIT
# Completed on Tue Jan  4 14:17:44 2022
# Generated by iptables-save v1.8.4 on Tue Jan  4 14:17:44 2022
*mangle
:PREROUTING ACCEPT [25440182:12241556933]
:INPUT ACCEPT [25304915:12132706266]
:FORWARD ACCEPT [135267:108850667]
:OUTPUT ACCEPT [22950393:37597519619]
:POSTROUTING ACCEPT [23085660:37706370286]
COMMIT
# Completed on Tue Jan  4 14:17:44 2022
# Generated by iptables-save v1.8.4 on Tue Jan  4 14:17:44 2022
*raw
:PREROUTING ACCEPT [25440182:12241556933]
:OUTPUT ACCEPT [22950393:37597519619]
COMMIT
# Completed on Tue Jan  4 14:17:44 2022
nft list ruleset
zsh: command not found: nft
lxc info | grep firewall
- network_firewall_filtering
- firewall_driver
  firewall: nftables

Can you run sudo apt install nftables and then sudo nft list ruleset, please?

nft list ruleset
table inet lxd {
	chain pstrt.lxdbr0 {
		type nat hook postrouting priority srcnat; policy accept;
		@nh,96,24 700795 @nh,128,24 != 700795 masquerade
		@nh,64,64 18249281783980507977 @nh,192,64 != 18249281783980507977 masquerade
	}

	chain fwd.lxdbr0 {
		type filter hook forward priority filter; policy accept;
		ip version 4 oifname "lxdbr0" accept
		ip version 4 iifname "lxdbr0" accept
		ip6 version 6 oifname "lxdbr0" accept
		ip6 version 6 iifname "lxdbr0" accept
	}

	chain in.lxdbr0 {
		type filter hook input priority filter; policy accept;
		iifname "lxdbr0" tcp dport 53 accept
		iifname "lxdbr0" udp dport 53 accept
		iifname "lxdbr0" icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
		iifname "lxdbr0" udp dport 67 accept
		iifname "lxdbr0" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		iifname "lxdbr0" udp dport 547 accept
	}

	chain out.lxdbr0 {
		type filter hook output priority filter; policy accept;
		oifname "lxdbr0" tcp sport 53 accept
		oifname "lxdbr0" udp sport 53 accept
		oifname "lxdbr0" icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
		oifname "lxdbr0" udp sport 67 accept
		oifname "lxdbr0" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, echo-request, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		oifname "lxdbr0" udp sport 547 accept
	}

	chain pstrt.lxdbr1 {
		type nat hook postrouting priority srcnat; policy accept;
		@nh,96,24 667478 @nh,128,24 != 667478 masquerade
		@nh,64,64 18249278088313602522 @nh,192,64 != 18249278088313602522 masquerade
	}

	chain fwd.lxdbr1 {
		type filter hook forward priority filter; policy accept;
		ip version 4 oifname "lxdbr1" accept
		ip version 4 iifname "lxdbr1" accept
		ip6 version 6 oifname "lxdbr1" accept
		ip6 version 6 iifname "lxdbr1" accept
	}

	chain in.lxdbr1 {
		type filter hook input priority filter; policy accept;
		iifname "lxdbr1" tcp dport 53 accept
		iifname "lxdbr1" udp dport 53 accept
		iifname "lxdbr1" icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
		iifname "lxdbr1" udp dport 67 accept
		iifname "lxdbr1" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		iifname "lxdbr1" udp dport 547 accept
	}

	chain out.lxdbr1 {
		type filter hook output priority filter; policy accept;
		oifname "lxdbr1" tcp sport 53 accept
		oifname "lxdbr1" udp sport 53 accept
		oifname "lxdbr1" icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
		oifname "lxdbr1" udp sport 67 accept
		oifname "lxdbr1" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, echo-request, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		oifname "lxdbr1" udp sport 547 accept
	}
}

Looks fine to me. Please can you run sudo tcpdump -i lxdbr0 -nn on the LXD host and then start up a container that doesn’t get an IP and check whether it is making a DHCP request.
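
If the bridge carries other traffic, you can narrow the capture to just the DHCPv4 ports:

sudo tcpdump -i lxdbr0 -nn 'udp and (port 67 or port 68)'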

It would appear that there is no DHCPv4 traffic, at least none that tcpdump is picking up. On a hunch I ran dhclient in the container and it obtained an IP address, so might the issue be related to the Ubuntu image itself?
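
For reference, roughly what I ran (the container here is the same test5 that appears in the logs below):

lxc exec test5 -- dhclient eth0
lxc exec test5 -- ip -4 addr show dev eth0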

Can you check the logs inside your containers and see why the DHCP client (systemd-networkd, perhaps?) isn't running?
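
Something like this from the host should show it (container name is a placeholder):

lxc exec <container> -- systemctl status systemd-networkd
lxc exec <container> -- journalctl -u systemd-networkd -b --no-pager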

Jan 05 11:48:29 test5 systemd[1]: Starting Network Service...
Jan 05 11:48:29 test5 systemd[1]: Failed to add a watch for /run/systemd/ask-password: inotify watch limit reached
Jan 05 11:48:29 test5 systemd-networkd[175]: Failed to connect to bus: No space left on device
Jan 05 11:48:29 test5 systemd-networkd[175]: Could not connect to bus: No space left on device
Jan 05 11:48:29 test5 systemd[1]: systemd-networkd.service: Main process exited, code=exited, status=1/FAILURE
Jan 05 11:48:29 test5 systemd[1]: systemd-networkd.service: Failed with result 'exit-code'.
Jan 05 11:48:29 test5 systemd[1]: Failed to start Network Service.

journalctl is full of red text, all of which is either "No space left on device" or "inotify watch limit reached". It looks like a lot of other services also failed.

Edit: here is df -h inside the container:

df -h
Filesystem                Size  Used Avail Use% Mounted on
default/containers/test5   52G  581M   51G   2% /
none                      492K  4.0K  488K   1% /dev
udev                       32G     0   32G   0% /dev/tty
tmpfs                     100K     0  100K   0% /dev/lxd
tmpfs                     100K     0  100K   0% /dev/.lxd-mounts
tmpfs                      32G     0   32G   0% /dev/shm
tmpfs                     6.3G  144K  6.3G   1% /run
tmpfs                     5.0M     0  5.0M   0% /run/lock
tmpfs                      32G     0   32G   0% /sys/fs/cgroup
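
Despite the wording, disk space is clearly not the problem; the "No space left on device" errors refer to kernel inotify resources, not the filesystem. The host limits can be inspected with sysctl (the values below are the usual kernel defaults, shown for reference):

sudo sysctl fs.inotify
# fs.inotify.max_queued_events = 16384
# fs.inotify.max_user_instances = 128
# fs.inotify.max_user_watches = 8192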

Fixed by raising fs.inotify.max_user_watches on the host. All containers share the host kernel, so every container's systemd draws from the same per-user inotify limits, which would explain why things broke once roughly ten containers were running.
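
For reference, the fix was along these lines (the value is illustrative and the file name arbitrary; the LXD production-setup docs also suggest raising the related fs.inotify.max_user_instances and fs.inotify.max_queued_events limits):

# apply immediately:
sudo sysctl fs.inotify.max_user_watches=1048576
# persist across reboots:
echo 'fs.inotify.max_user_watches = 1048576' | sudo tee /etc/sysctl.d/90-lxd-inotify.conf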

Ah nice.

Also take a look at the Production setup page in the LXD documentation for more tips.