Container DNS resolution after resuming from suspend

jmiahjones · June 24, 2022, 1:02pm

Hi all! I love LXD, and I use it constantly on my desktop to keep a consistent state for different software. Anyone who has fiddled with texlive versions knows that it can be a constant struggle to get your documents to compile. All that worry is a thing of the past.

I’m having a minor but persistent problem with DNS resolution for containers, though. Following the guide, I set up the systemd unit to provide DNS resolution. It works when I start it manually, but I noticed that it wasn’t working automatically after a resume from suspend, so I added suspend.target to After= and WantedBy=.

$ cat /etc/systemd/system/lxd-dns-lxdbr0.service
[Unit]
Description=LXD per-link DNS configuration for lxdbr0
BindsTo=sys-subsystem-net-devices-lxdbr0.device
After=sys-subsystem-net-devices-lxdbr0.device suspend.target

[Service]
Type=oneshot
ExecStart=/usr/bin/resolvectl dns lxdbr0 10.0.10.1
ExecStart=/usr/bin/resolvectl domain lxdbr0 '~lxd'

[Install]
WantedBy=sys-subsystem-net-devices-lxdbr0.device suspend.target

If I look in my logs, I can see that it does now run right after the resume, but it seems to come before the NetworkManager activity. However, I don’t really see any other useful targets to hook into being reported in the logs. Has anyone gotten this to automatically work on Ubuntu 20.04?

jmiahjones · June 24, 2022, 7:34pm

I just wanted to inform anyone who (either at the present or in the future) might be wondering how to solve this problem. It was really obvious in hindsight, but since I’m no networking guru it took a bit of RTFM.

The solution I came up with is to tell NetworkManager the configuration we want to persist across reboots and suspend/resume cycles, rather than relying on the systemd unit. To do this, I opened nm-connection-editor, selected lxdbr0 (or whichever bridge you want to point the system to), went to the IPv4 settings, and entered 10.0.10.1 as the DNS server and ~lxd under the search domains. Now my DNS configuration seems to persist across suspend/resume cycles.

At this point, I’ve disabled the systemd unit proposed in the guide, since NetworkManager is keeping track of the configuration now. If anyone else has a better or more preferred solution, let me know!

jmiahjones · June 27, 2022, 2:24pm

Hah! I guess I spoke too soon. A reboot made the following command fail: resolvectl query mycontainer.lxd.

I listed the connections using nmcli connection and found that there were two separate entries for lxdbr0: one active one (in green) attached to the lxdbr0 device, and an inactive one with a blank device. The inactive connection contains my settings.

$ nmcli connection
NAME        UUID    TYPE      DEVICE           
lxdbr0      uuid1   bridge    lxdbr0          
......................................
lxdbr0      uuid2   bridge    --

There is only one lxdbr0 device listed by nmcli device.

@tomp It seems like lxd is creating a new network connection on restart rather than using the existing one. I wonder what the expected behavior should be?

Steps to reproduce:

Set up the LXD bridge lxdbr0 on an Ubuntu 20.04 Desktop (possibly also on 22.04).
Use nm-connection-editor to add entries to ipv4 dns servers and domains.
Reboot the host.
Run nmcli connection to verify two separate lxdbr0 entries.
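
For anyone who wants to script that last check, here is a minimal sketch. It assumes nmcli's terse output (`-t`), where fields are separated by colons, and that the profile name itself contains no colon. The function name `count_profiles` is made up for illustration.

```shell
#!/bin/sh
# Count NetworkManager connection profiles that share a given name.
# Expects the output of
#   nmcli -t -f NAME,UUID,DEVICE connection show
# on stdin. In terse mode the fields are colon-separated and NAME comes first.
count_profiles() {
    awk -F: -v name="$1" '$1 == name { n++ } END { print n + 0 }'
}

# Usage on a live system (left as a comment so nothing runs by accident):
#   nmcli -t -f NAME,UUID,DEVICE connection show | count_profiles lxdbr0
```

A result greater than 1 means there are duplicate profiles like the ones in my listing above.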

tomp · June 27, 2022, 2:24pm

Yes, LXD creates the lxdbr0 interface when it starts. It is critical that no other system creates or manages it, as LXD expects unfettered access to it.

jmiahjones · June 27, 2022, 2:32pm

I understand, LXD will need that level of control over the process. But this is getting back to my original problem: right now LXD is managing this bridge connection and I can’t seem to get the dns configuration to “stick.” I don’t have any network up/down signal coming from LXD that I can “hook” a systemd script to during suspend/resume cycles, and LXD (understandably) doesn’t provide its own dns management system within the commandline network interface.

If I put settings in NetworkManager, it will reapply them when the network goes down or comes up. Do you have any ideas for a workaround I can apply to the systemd script to run the resolvectl commands only once the bridge is up?

tomp · June 27, 2022, 2:35pm

I think looking further into the systemd approach would be more appropriate, although I am a bit confused why the lxdbr0 interface is being taken down during suspend; after all, it’s not a physical interface.

jmiahjones · June 27, 2022, 3:02pm

I am, as well. I will take a bit more time to look into this and report back what I find. I’m thinking I can use ip monitor link label dev lxdbr0 to verify if the bridge is actually going down during suspend, or if something else entirely is going on.
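
A rough sketch of how that check could be scripted, in case it helps someone else. The log path and the helper name `count_down_events` are just placeholders, and I am assuming that a link going down shows up in the monitor output as a line containing "state DOWN".

```shell
#!/bin/sh
# Count link events that report the interface as DOWN.
# Reads captured `ip monitor link` output on stdin.
count_down_events() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so mask the exit status to keep the helper usable in scripts.
    grep -c 'state DOWN' || true
}

# Intended use (commented out, since ip monitor blocks until stopped):
#   ip -timestamp monitor link dev lxdbr0 > /tmp/lxdbr0-link.log &
#   ... suspend and resume the machine, then stop the monitor ...
#   count_down_events < /tmp/lxdbr0-link.log
```

If the count is 0 after a suspend/resume cycle, the bridge itself never went down and the problem lies elsewhere.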

jmiahjones · June 28, 2022, 3:25pm

You were right, @tomp. Running the above command gave me nothing when I activated suspend. I believe I’ve figured this out now. I’ve confirmed that resolvectl commands do not survive suspend cycles, as systemd-resolved seems to go down during the sleep event.

Just to verify, I also tried to change the per-link DNS settings for a virbr0 bridge I had lying around from the dark times before I installed LXD. This confirmed that the resolved service loses those per-link settings regardless of the bridge.

sudo resolvectl dns virbr0 1.1.1.1
sudo resolvectl status virbr0 # confirms it was set
sudo resolvectl dns lxdbr0 1.1.1.1
sudo resolvectl status lxdbr0 # also confirms it was set
sudo systemctl suspend
# both of the devices have lost their dns settings
sudo resolvectl status lxdbr0
sudo resolvectl status virbr0

Then I turned on debugging for resolved and found that the recommended systemd unit was getting activated just moments before systemd-resolved did its reinitialization process after wakeup.

Turning on debugging:

# I followed wiki.ubuntu.com/DebuggingSystemd
sudo systemctl edit --runtime systemd-resolved
# Added the following to turn on debugging:
# [Service]
# Environment=SYSTEMD_LOG_LEVEL=debug

sudo systemctl daemon-reload
sudo systemctl restart systemd-resolved

Here’s the race condition in the logs:

Jun 28 10:51:55 $hostname systemd[1]: systemd-suspend.service: Succeeded.
Jun 28 10:51:55 $hostname systemd[1]: Finished Suspend.
Jun 28 10:51:55 $hostname systemd[1]: Stopped target Sleep.
Jun 28 10:51:55 $hostname systemd[1]: Reached target Suspend.
Jun 28 10:51:55 $hostname systemd[1]: Starting LXD per-link DNS configuration for lxdbr0...
Jun 28 10:51:55 $hostname kernel: [ 2935.503384] audit: type=1107 audit(1656427915.699:173): pid=1974 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_signal"  bus="system" path="/org/freedesktop/login1" 
Jun 28 10:51:55 $hostname kernel: [ 2935.503384]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Jun 28 10:51:55 $hostname systemd[1]: Stopped target Suspend.
Jun 28 10:51:55 $hostname ModemManager[2076]: <info>  [sleep-monitor] system is resuming
Jun 28 10:51:55 $hostname systemd-resolved[10667]: Got message type=signal sender=:1.25 destination=n/a path=/org/freedesktop/login1 interface=org.freedesktop.login1.Manager member=PrepareForSleep cookie=427 reply_cookie=0 signature=b error-name=n/a error-message=n/a
Jun 28 10:51:55 $hostname systemd-resolved[10667]: Coming back from suspend, verifying all RRs...

So now I’ve just added a healthy buffer to the unit to avoid this race condition:

[Unit]
Description=LXD per-link DNS configuration for lxdbr0
BindsTo=sys-subsystem-net-devices-lxdbr0.device
After=sys-subsystem-net-devices-lxdbr0.device suspend.target

[Service]
Type=oneshot
ExecStart=/usr/bin/sleep 15
ExecStart=/usr/bin/resolvectl dns lxdbr0 10.0.10.1
ExecStart=/usr/bin/resolvectl domain lxdbr0 '~lxd'

[Install]
WantedBy=sys-subsystem-net-devices-lxdbr0.device suspend.target

You may want to mention this on the page for systemd integration. E.g. something like:

Hibernate and Suspend Behavior

When running LXD in a desktop environment, it may be common to suspend or hibernate the system. The previous configuration will not survive the wake from sleep, but adding suspend.target, hibernate.target, hybrid-sleep.target, and/or suspend-then-hibernate.target to both the After= and WantedBy= fields will re-run the script as the system comes back online. (Re-enable the unit after editing it so that the new WantedBy= links are created.)

During the resume process, it is possible that the unit will run before the systemd-resolved service completes its post-wake initialization. One simple way to avoid this is by adding a small wait before the resolvectl commands are run, e.g.

[Service]
Type=oneshot
ExecStart=/usr/bin/sleep 5
ExecStart=/usr/bin/resolvectl dns lxdbr0 10.0.10.1
ExecStart=/usr/bin/resolvectl domain lxdbr0 '~lxd'
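
A fixed sleep is only a guess at how long the post-wake initialization takes. One possible alternative, sketched below, is a small helper that checks for a while whether the setting is still present and re-applies it if it has been wiped. This is untested on other setups; the script name (say, /usr/local/bin/lxd-dns-ensure), the function names, and the default of 15 checks one second apart are all my own assumptions. It relies on `resolvectl dns LINK` with no server argument printing the servers currently set on that link.

```shell
#!/bin/sh
# Hypothetical helper: keep the per-link DNS settings in place for a short
# window after resume instead of sleeping once and hoping for the best.
RESOLVECTL=${RESOLVECTL:-/usr/bin/resolvectl}

# True if the link currently lists the given DNS server.
dns_is_set() {
    "$RESOLVECTL" dns "$1" 2>/dev/null | grep -qwF -- "$2"
}

# Apply the per-link DNS server and routing domain.
apply_dns() {
    "$RESOLVECTL" dns "$1" "$2" && "$RESOLVECTL" domain "$1" "$3"
}

# ensure_dns LINK SERVER DOMAIN [CHECKS] [INTERVAL_SECONDS]
ensure_dns() {
    link=$1; server=$2; domain=$3
    checks=${4:-15}; interval=${5:-1}
    i=0
    while [ "$i" -lt "$checks" ]; do
        # Re-apply only when the setting is missing (e.g. wiped on wake).
        dns_is_set "$link" "$server" || apply_dns "$link" "$server" "$domain"
        sleep "$interval"
        i=$((i + 1))
    done
    # The exit status reports whether the setting is present at the end.
    dns_is_set "$link" "$server"
}

# Only act when the script is called with arguments.
[ "$#" -eq 0 ] || ensure_dns "$@"
```

In the unit, the three ExecStart= lines would then collapse into a single call such as /usr/local/bin/lxd-dns-ensure lxdbr0 10.0.10.1 '~lxd'. It still polls rather than reacting to an event, so it narrows the race rather than removing it.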

tomp · June 28, 2022, 3:47pm

Thanks for tracking it down!
I wonder if we can make it so that the LXD per-link unit waits until systemd-resolved is up and running before firing?

jmiahjones · June 28, 2022, 6:39pm

It’s difficult because it’s not that systemd-resolved goes down per se; it’s that we have to wait for its post-wake process to complete. Here’s my abridged version of those logs for the transition into sleep:

NetworkManager[1978]: <info>  [1656427884.8057] manager: sleep: sleep requested (sleeping: no  enabled: yes)
NetworkManager[1978]: <info>  [1656427884.8159] manager: NetworkManager state is now ASLEEP
systemd[1]: Starting Network Manager Script Dispatcher Service...
dbus-daemon[1974]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
systemd-resolved[10667]: Removing scope on link $wifiLink, protocol dns, family *
systemd-resolved[10667]: Got message type=method_call sender=:1.17 destination=org.freedesktop.resolve1 path=/org/freedesktop/resolve1 interface=org.freedesktop.resolve1.Manager member=SetLinkDNS cookie=5236 reply_cookie=0 signature=ia(iay) error-name=n/a error-message=n/a
... #many more of these types of messages
systemd[1]: Reached target Sleep.

I’m totally just guessing based on the info I see in journalctl, but here’s what I see:

NetworkManager transitions to a “Suspend” state, where all its managed physical devices are turned off.
It then sends out a dispatch script which seems to instruct systemd-resolved to remove dns entries.
Once all that is done, the system sleeps.

When resuming, NetworkManager seems to come up first, and then the configurations are set up anew. Based on my success in the second post, it seems that NetworkManager instructs systemd-resolved to set up the dns according to its own configuration.

So when the lxdbr0 unit runs first, I’m guessing it is overwritten by the NetworkManager configuration that happens afterwards. We would have to get the systemd unit to somehow know when this exact sequence of events occurs.

I suppose an alternative could be to store the configuration in NetworkManager the way I originally attempted. I could put the commands in a systemd unit that runs on boot, which would solve the suspend problem. There would have to be a corresponding “tear-down” script as well for the shutdown process, since LXD will create the new bridge on boot. But this wouldn’t be robust to sudden poweroff.
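
Another option I have not tried yet would be a NetworkManager dispatcher script, which leaves the bridge itself entirely to LXD and only re-applies the resolvectl settings whenever NetworkManager reports that some connection came up. Here is a sketch, assuming it is saved as a root-owned executable under /etc/NetworkManager/dispatcher.d/ (the file name 50-lxd-dns is made up). NetworkManager passes the interface name as the first argument and the action as the second. Whether the "up" event reliably fires after NetworkManager has pushed its own DNS configuration to systemd-resolved is a guess on my part.

```shell
#!/bin/sh
# Hypothetical dispatcher script, e.g. /etc/NetworkManager/dispatcher.d/50-lxd-dns
# NetworkManager calls it as: SCRIPT <interface> <action>
RESOLVECTL=${RESOLVECTL:-/usr/bin/resolvectl}
BRIDGE=lxdbr0
DNS_ADDR=10.0.10.1
DNS_DOMAIN='~lxd'

handle_event() {
    action=$2
    case "$action" in
        up|connectivity-change)
            # Any connection coming up (e.g. Wi-Fi after resume) is treated
            # as a hint that resolved has just been reconfigured, so the
            # per-link settings for the LXD bridge are applied again.
            "$RESOLVECTL" dns "$BRIDGE" "$DNS_ADDR" &&
                "$RESOLVECTL" domain "$BRIDGE" "$DNS_DOMAIN"
            ;;
    esac
}

handle_event "$@"
```

Since nothing is stored in a NetworkManager profile, there should be no stale lxdbr0 entry left behind after a reboot or a sudden poweroff.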

So in summary, I don’t know exactly where the logic should be placed for maximum generalizability and minimum jank, but I am open to ideas!