Empty resolv.conf in containers after host reboot

Ubuntu 22.04 LTS (kernel 5.15.0-39-generic) with snap-based LXD 5.3:

$ lxc version
Client version: 5.3
Server version: 5.3

After a host reboot (maybe an LXC upgrade as well, who knows), DNS resolution stopped working in all of my Ubuntu 22.04 containers; it had worked before that reboot. I noticed that /etc/resolv.conf (which is just a symlink to another file) was empty.

I “solved” this by removing the /etc/resolv.conf symlink and manually recreating it as a regular file with a nameserver entry in it, but I wonder if anyone else has seen the same problem lately.
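For anyone who wants to script that workaround, here is a minimal sketch. The function name write_static_resolv_conf and the address 192.0.2.53 (a documentation-only placeholder) are assumptions for illustration; the post does not say which nameserver was used.

```shell
#!/bin/sh
# Sketch of the workaround: replace the resolv.conf symlink with a static file.
# $1: path to resolv.conf
# $2: nameserver address (placeholder; supply your own resolver)
write_static_resolv_conf() {
    conf="$1"
    ns="$2"
    # rm removes the symlink itself, not the file it points to.
    rm -f -- "$conf"
    # Recreate it as a regular file with a single nameserver entry.
    printf 'nameserver %s\n' "$ns" > "$conf"
}

# Example (run as root inside the instance):
#   write_static_resolv_conf /etc/resolv.conf 192.0.2.53
```

Note that a static file like this will not follow later changes to the DNS servers handed out by DHCP.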

I don’t reboot my systems often, maybe once every 2-3 months, but after recent experiences I now check LXC after each host reboot to see what broke, because more often than not something breaks after kernel or snap updates…

The /etc/resolv.conf symlink should be managed by the systemd-resolved service, so it sounds like that service may not be running. Can you confirm?

Also if you launch a fresh container does it still occur?

You also don’t mention which Ubuntu 22.04 image you are using: is it ubuntu:22.04 or images:ubuntu/22.04?
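To tell these cases apart inside the instance, it helps to look at what /etc/resolv.conf actually is. The following is only a sketch; the function name resolv_conf_state and its output labels are made up for this example.

```shell
#!/bin/sh
# Classify a resolv.conf path. Prints one of:
#   missing | dangling-symlink | empty | present
resolv_conf_state() {
    conf="${1:-/etc/resolv.conf}"
    if [ -L "$conf" ] && [ ! -e "$conf" ]; then
        echo "dangling-symlink"   # symlink whose target does not exist
    elif [ ! -e "$conf" ]; then
        echo "missing"            # no file and no symlink at this path
    elif [ ! -s "$conf" ]; then
        echo "empty"              # exists (possibly via symlink) but is 0 bytes
    else
        echo "present"            # exists and has some content
    fi
}

# Example: resolv_conf_state /etc/resolv.conf
```

A result of dangling-symlink would be consistent with systemd-resolved not running, since that service creates the files under /run/systemd/resolve. Running systemctl is-active systemd-resolved alongside it confirms the service state.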

  • I used lxc launch ubuntu:22.04 to create the VMs (sorry, I realized I called the VMs “containers”; obviously they’re not, but on LXC they’re lightweight so I used the wrong word)

  • I checked systemd-resolved, it was dead/stopped, although it’s enabled.

# sudo systemctl status systemd-resolved.service
○ systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/lib/systemd/system/systemd-resolved.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:systemd-resolved.service(8)
  • The systemd-resolved config file didn’t have any custom settings (every line was prefixed with #). When I started the service, it picked up the DNS server I had hard-coded in /etc/resolv.conf (still a regular file, not yet a symlink again), and /run/systemd/resolve/resolv.conf now lists that DNS server plus whatever DHCP gives to the host (or LXD gives to containers, I suppose).
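A quick way to confirm that a config file has no custom settings is to print only its non-comment lines. This is a hedged sketch: the function name active_settings is invented here, and /etc/systemd/resolved.conf is assumed as the file in question (it is the default location, but the post does not name the path).

```shell
#!/bin/sh
# Print only the lines of a config file that are neither blank nor comments.
# $1: config file (assumed default: /etc/systemd/resolved.conf)
active_settings() {
    conf="${1:-/etc/systemd/resolved.conf}"
    # Drop blank lines and lines whose first non-blank character is '#'.
    # grep exits non-zero when nothing is left, i.e. no active lines at all.
    grep -Ev '^[[:space:]]*(#|$)' "$conf"
}

# Example: active_settings /etc/systemd/resolved.conf
```

If the only output is a section header such as [Resolve], no setting has been customised in that file.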

I don’t mind the current setup (hard-coded DNS resolver in /etc/resolv.conf), at least it’s reliable.

I’ll try to provide additional details just in case others come across this post: this problem is a little bit similar to this one, although the cause may be different.

On the host, systemd-resolved shows this:

$ sudo systemctl status systemd-resolved
● systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/lib/systemd/system/systemd-resolved.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2022-08-17 16:15:09 UTC; 45min ago
       Docs: man:systemd-resolved.service(8)
   Main PID: 1200 (systemd-resolve)
     Status: "Processing requests..."
      Tasks: 1 (limit: 38123)
     Memory: 5.8M
        CPU: 708ms
     CGroup: /system.slice/systemd-resolved.service
             └─1200 /lib/systemd/systemd-resolved

Aug 17 16:15:09 jpbm systemd[1]: Starting Network Name Resolution...
Aug 17 16:15:09 jpbm systemd-resolved[1200]: Positive Trust Anchors:
Aug 17 16:15:09 jpbm systemd-resolved[1200]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Aug 17 16:15:09 jpbm systemd-resolved[1200]: Negative trust anchors: home.arpa 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.>
Aug 17 16:15:09 jpbm systemd-resolved[1200]: Using system hostname 'jpbm'.
Aug 17 16:15:09 jpbm systemd[1]: Started Network Name Resolution.
Aug 17 16:26:46 jpbm systemd-resolved[1200]: lxdbr0: Bus client set DNS server list to:
Aug 17 16:52:37 jpbm systemd-resolved[1200]: Using degraded feature set UDP instead of UDP+EDNS0 for DNS server

There’s also dnsmasq running:

$ ps -ef | grep masq
lxd         2243    2138  0 16:15 ?        00:00:00 dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdbr0 --dhcp-rapid-commit --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address= --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-option-force=26,1450 --dhcp-range,,1h -s lxd --interface-name _gateway.lxd,lxdbr0 -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd -g lxd

On the host DNS resolution works fine and resolv.conf points to stub-resolv.conf:

$ sudo dir -lat /etc/resolv.conf
lrwxrwxrwx 1 root root 39 Feb 28 11:00 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf

$ resolvectl status
       Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (enp1s0f0)
    Current Scopes: DNS
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server:
       DNS Servers:

Link 3 (enp1s0f1)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 4 (lxdbr0)
    Current Scopes: DNS
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server:
       DNS Servers:

Inside a VM (also Ubuntu 22.04) bridged to lxdbr0, /run/systemd/resolve/stub-resolv.conf (the target of the /etc/resolv.conf symlink) is empty, so /etc/resolv.conf reads as empty too.

I tried launching a fresh instance today:

$ lxc launch ubuntu:22.04 dnstest
Creating dnstest
Starting dnstest

ubuntu@bm:~$ lxc shell dnstest
root@dnstest:~# nslookup www.google.com
;; communications error to connection refused

root@dnstest:~# dir -lat /etc/resolv.conf 
lrwxrwxrwx 1 root root 39 Aug 10 06:56 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf
root@dnstest:~# sudo cat ../run/systemd/resolve/stub-resolv.conf
sudo: unable to resolve host dnstest: Temporary failure in name resolution
cat: ../run/systemd/resolve/stub-resolv.conf: No such file or directory

root@dnstest:~# sudo systemctl status systemd-resolved
sudo: unable to resolve host dnstest: Temporary failure in name resolution
○ systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/lib/systemd/system/systemd-resolved.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:systemd-resolved.service(8)

root@dnstest:~# sudo systemctl start systemd-resolved
sudo: unable to resolve host dnstest: Temporary failure in name resolution

root@dnstest:~# sudo systemctl status systemd-resolved
● systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/lib/systemd/system/systemd-resolved.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2022-08-17 17:07:32 UTC; 2s ago
       Docs: man:systemd-resolved.service(8)
   Main PID: 320 (systemd-resolve)
     Status: "Processing requests..."
      Tasks: 1 (limit: 38123)
     Memory: 4.4M
        CPU: 27ms
     CGroup: /system.slice/systemd-resolved.service
             └─320 /lib/systemd/systemd-resolved

Aug 17 17:07:32 dnstest systemd[1]: Starting Network Name Resolution...
Aug 17 17:07:32 dnstest systemd-resolved[320]: Positive Trust Anchors:
Aug 17 17:07:32 dnstest systemd-resolved[320]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Aug 17 17:07:32 dnstest systemd-resolved[320]: Negative trust anchors: home.arpa 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 2>
Aug 17 17:07:32 dnstest systemd-resolved[320]: Using system hostname 'dnstest'.
Aug 17 17:07:32 dnstest systemd[1]: Started Network Name Resolution.

With systemd-resolved now started, the file that /etc/resolv.conf symlinks to is no longer empty:

root@dnstest:~# dir -lat /etc/resolv.conf 
lrwxrwxrwx 1 root root 39 Aug 10 06:56 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf

root@dnstest:~# cat ../run/systemd/resolve/stub-resolv.conf

options edns0 trust-ad
search .

root@dnstest:~# nslookup www.google.com

** server can't find www.google.com: SERVFAIL

root@dnstest:~# sudo systemctl status systemd-resolved
● systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/lib/systemd/system/systemd-resolved.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2022-08-17 17:07:32 UTC; 39s ago
       Docs: man:systemd-resolved.service(8)
   Main PID: 320 (systemd-resolve)
     Status: "Processing requests..."
      Tasks: 1 (limit: 38123)
     Memory: 4.4M
        CPU: 27ms
     CGroup: /system.slice/systemd-resolved.service
             └─320 /lib/systemd/systemd-resolved

Aug 17 17:07:32 dnstest systemd[1]: Starting Network Name Resolution...
Aug 17 17:07:32 dnstest systemd-resolved[320]: Positive Trust Anchors:
Aug 17 17:07:32 dnstest systemd-resolved[320]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Aug 17 17:07:32 dnstest systemd-resolved[320]: Negative trust anchors: home.arpa 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 2>
Aug 17 17:07:32 dnstest systemd-resolved[320]: Using system hostname 'dnstest'.
Aug 17 17:07:32 dnstest systemd[1]: Started Network Name Resolution.

If I shut down this VM and start it again, DNS resolution still doesn’t work after login and systemd-resolved is dead again. If I start systemd-resolved manually, DNS fails the same way as before, so the behaviour is consistent:

  • systemd-resolved is dead on startup even though it is enabled
  • if it’s dead, ../run/systemd/resolve/stub-resolv.conf is empty; if it’s running, the file is not empty (same as when I started the service in the VM earlier)
  • either way, DNS resolution in the guest VM does not work, whether or not systemd-resolved is running. Even when I removed /etc/resolv.conf and created a regular file with a nameserver entry in place of the symlink, resolution in the guest VM still didn’t work. I find that strange: the same “workaround” works in other VMs on the same host, but I created those several weeks ago, so maybe they differ slightly in some way.
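When comparing the working and non-working guests, one thing worth ruling out is whether the resolv.conf in use actually contains a nameserver entry. A small sketch follows; the function names list_nameservers and has_nameserver are hypothetical.

```shell
#!/bin/sh
# Print the address from every "nameserver <addr>" line in a resolv.conf file.
# Comment lines, "options", "search" and bare "nameserver" lines are skipped.
list_nameservers() {
    conf="${1:-/etc/resolv.conf}"
    awk '$1 == "nameserver" && NF >= 2 { print $2 }' "$conf" 2>/dev/null
}

# Succeeds only if the file contains at least one nameserver entry.
has_nameserver() {
    [ -n "$(list_nameservers "$1")" ]
}

# Example:
#   has_nameserver /etc/resolv.conf || echo "no nameserver configured"
```

If this reports a nameserver and lookups still fail, the problem is more likely between the guest and that server (for example the dnsmasq instance on lxdbr0) than in the file itself.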

Edit: I should also add this, as I’ve re-installed the OS since then and the problem persists with another LXD version (the host OS is still Jammy). LXD is still installed as a snap, but it is now this version:

$ snap list --all lxd
Name  Version        Rev    Tracking    Publisher   Notes
lxd   5.0.0-b0287c1  22923  5.0/stable  canonical✓  -