Containers suddenly stopped working since the move to core20 snap - no more IPs assigned

After months of the containers being up and running without a problem, they all stopped working as of last night. Usually a system reboot and/or restarting LXD entirely (snap version, running v4.15) fixes it.

But even so, I can “start” them up just fine; no IP address is being assigned.
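
For reference, this is roughly how the problem shows up (container name znc assumed, as in the config below):

lxc start znc
lxc list znc
# the IPV4 column stays empty even though the state is RUNNING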

Example container info:

architecture: x86_64
config:
  image.architecture: x86_64
  image.description: Ubuntu 18.04 LTS minimal (20200506)
  image.os: ubuntu
  image.release: bionic
  volatile.base_image: 572979f0119c180392944f756f3aa6e402ae7c11ec3380fc2e465b2cc76e309d
  volatile.eth0.host_name: vethe7b3dc8d
  volatile.eth0.hwaddr: 00:16:3e:50:c3:f7
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: b31837cd-d4b6-4024-b188-bd50eff94a6d
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: znc
    type: disk
  znc:
    connect: tcp:127.0.0.1:xxx
    listen: tcp:0.0.0.0:xxx
    type: proxy
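
(For reference, that dump looks like lxc config show output; something like this should reproduce it, with the container name znc assumed:)

lxc config show znc --expanded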

The lxdbr0 bridge that is being used (managed mode):

config:
  ipv4.address: 10.248.110.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:78f0:8cd8:9b63::1/64
  ipv6.nat: "true"
  raw.dnsmasq: |
    auth-zone=lxd
    dns-loop-detect
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/instances/znc
managed: true
status: Created
locations:
- none
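
(And the bridge details above presumably come from:)

lxc network show lxdbr0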

Any idea what might be wrong? Tomp suggested checking the log files, though I have no idea where the log file is located.

This isn’t the first time, by the way, that containers have just stopped working; it has happened a few times in the past few months. But as I mentioned above, a restart usually fixed it. Not this time, though.

Try:

sudo grep /var/snap/lxd/common/lxd/logs/lxd.log dnsmasq

What host OS version are you using?

Getting: dnsmasq: No such file or directory.
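
(Side note: grep expects the pattern before the file name, which is why it complains that dnsmasq does not exist. The intended command was presumably:)

sudo grep dnsmasq /var/snap/lxd/common/lxd/logs/lxd.log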

Can’t find anything useful in the logs, though. The full log is on Pastebin; it starts with: t=2021-06-17T08:30:47+0200 lvl=info msg="LXD 4.15 is starting in normal mode"

As for the OS, sorry, forgot to mention that: Ubuntu 20.04.2 LTS.

That would be it:

t=2021-06-17T08:30:48+0200 lvl=eror msg="The dnsmasq process exited prematurely" driver=bridge err="Process exited with non-zero value 1" network=lxdbr0 project=default

It seems coincidental and might be related to:

But you’re running Focal on the host, which the other issue isn’t, so it might not be related.

Can you provide the output of:

sudo ss -ulpn

Thanks

I notice you have raw.dnsmasq set; just a hunch, but can you try unsetting that:

State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
UNCONN 0 0 10.0.3.1:53 0.0.0.0:* users:(("dnsmasq",pid=14109,fd=6))
UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=13876,fd=12))
UNCONN 0 0 0.0.0.0%lxcbr0:67 0.0.0.0:* users:(("dnsmasq",pid=14109,fd=4))
UNCONN 0 0 0.0.0.0:27015 0.0.0.0:* users:(("hlds_linux",pid=13977,fd=8))

And you mean unsetting it with:

sudo nsenter --mount=/run/snapd/ns/lxd.mnt -- bash
LD_LIBRARY_PATH=/snap/lxd/current/lib/:/snap/lxd/current/lib/x86_64-linux-gnu/ /snap/lxd/current/bin/dnsmasq --help

I’m not familiar with this.

No, not the snap commands; the link I posted showed you how to do it, but it’s:

lxc network unset lxdbr0 raw.dnsmasq
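
(After unsetting it, checking whether dnsmasq still exits prematurely might help; for example:)

sudo grep dnsmasq /var/snap/lxd/common/lxd/logs/lxd.log | tail -n 5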

If I do that, only IPv6 addresses are assigned to the containers.

+--------------+---------+------+------------------------------------------------+-----------+-----------+
| baker        | RUNNING |      | fd42:78f0:8cd8:9b63:216:3eff:fe93:526c (eth1)  | CONTAINER | 0         |
|              |         |      | fd42:78f0:8cd8:9b63:216:3eff:fe69:389b (eth0)  |           |           |
+--------------+---------+------+------------------------------------------------+-----------+-----------+

Can you reload LXD now:

sudo systemctl reload snap.lxd.daemon

And then restart your containers.
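
For example (container names assumed from this thread):

lxc restart znc baker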

Just tried, sorry. Only IPv6 still:

IPs:
eth0: inet6 fd42:78f0:8cd8:9b63:216:3eff:fe69:389b vethfaa4b48f
eth0: inet6 fe80::216:3eff:fe69:389b vethfaa4b48f
eth1: inet6 fd42:78f0:8cd8:9b63:216:3eff:fe93:526c vethc3504a1d
eth1: inet6 fe80::216:3eff:fe93:526c vethc3504a1d
lo: inet 127.0.0.1
lo: inet6 ::1

Can you run dhclient inside your container please?
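
Something along these lines, from the host (container name and interface assumed):

lxc exec znc -- dhclient -v eth0
# -v prints the DHCPDISCOVER/DHCPOFFER exchange, so you can see whether any offer comes back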

Running it doesn’t seem to do anything. It just… “hangs”.

Can you show the output of sudo ss -ulpn on the LXD host please.

You have a DHCP service listening.

BTW what is lxcbr0 (as opposed to lxdbr0)?

Can you check the output of sudo iptables-save and sudo nft list ruleset to see if a firewall could be blocking it?

No idea what lxcbr0 is; I haven’t used it, and it’s set as unmanaged.

As for iptables, I’m using UFW.
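
(If UFW does turn out to be blocking DHCP/DNS on the bridge, rules along these lines are the usual fix; a sketch only, assuming lxdbr0 is the bridge in question:)

sudo ufw allow in on lxdbr0
sudo ufw route allow in on lxdbr0
sudo ufw route allow out on lxdbr0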

Well, in that case, can you kill the dnsmasq process listening on that interface to rule it out? Always best to keep things as simple as possible, in my experience.
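
For example, using the PID from the earlier ss output (14109 here; if that dnsmasq belongs to the classic LXC bridge, stopping its service is the cleaner option, but that is an assumption):

sudo kill 14109
# or, if it was started by the lxc package's bridge service:
sudo systemctl stop lxc-net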

The output of those commands above please.

As for “sudo nft list ruleset”: command not found.
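
(nft ships in the nftables package on Ubuntu 20.04, so if that check is still wanted:)

sudo apt install nftables
sudo nft list ruleset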