VMs in Stopped state while running after Snap Refresh

Hello,

After an automatic refresh of LXD, my VM seems to be stopped:

root@n3:/home/ubuntu# lxc info v1
Name: v1
Location: none
Remote: unix://
Architecture: x86_64
Created: 2020/01/23 11:41 UTC
Status: Stopped
Type: virtual-machine
Profiles: vm

I get this error when I try to start it:

root@n3:/home/ubuntu# lxc start v1
Error: Failed to run: /snap/lxd/current/bin/lxd forklimits limit=memlock:unlimited:unlimited -- /snap/lxd/13439/bin/qemu-system-x86_64 -S -name v1 -uuid 8e085ed3-6437-4166-9707-548d725ab638 -daemonize -cpu host -nographic -serial chardev:console -nodefaults -no-reboot -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=deny,resourcecontrol=deny -readconfig /var/snap/lxd/common/lxd/logs/v1/qemu.conf -pidfile /var/snap/lxd/common/lxd/logs/v1/qemu.pid -D /var/snap/lxd/common/lxd/logs/v1/qemu.log -chroot /var/snap/lxd/common/lxd/virtual-machines/v1 -runas lxd: : exit status 1
Try lxc info --show-log v1 for more info

The log says:

qemu-system-x86_64:/var/snap/lxd/common/lxd/logs/v1/qemu.conf:115: vhost-vsock: unable to set guest cid: Address already in use

And indeed the address is in use because I can still find the qemu process:

root@n3:/home/ubuntu# ps aux | grep v1
lxd 11045 44.5 3.5 20563560 7015600 ? Sl Feb05 12823:09 qemu-system-x86_64 -S -name v1 -uuid 8e085ed3-6437-4166-9707-548d725ab638 -daemonize -cpu host -nographic -serial chardev:console -nodefaults -no-reboot -no-user-config -readconfig /var/snap/lxd/common/lxd/logs/v1/qemu.conf -pidfile /var/snap/lxd/common/lxd/logs/v1/qemu.pid -D /var/snap/lxd/common/lxd/logs/v1/qemu.log -chroot /var/snap/lxd/common/lxd/virtual-machines/v1 -runas lxd
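
A less ambiguous check than grepping for the instance name is to test the PID recorded in the pidfile that appears on the qemu command line above. This is only a rough sketch: it assumes the pidfile survived the refresh and that it is run as root (the process belongs to the lxd user). The function name pidfile_alive is made up for illustration.

```shell
#!/bin/sh
# Report whether the process recorded in a pidfile is still alive.
# pidfile_alive is a hypothetical helper name, not part of LXD.
# Returns 0 if alive, 2 if the pidfile is missing or unreadable,
# and another non-zero value if the process is gone.
pidfile_alive() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        return 2
    fi
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it (hence: run as root).
    kill -0 "$pid" 2>/dev/null
}

# Path copied from the -pidfile argument in the qemu command line.
if pidfile_alive /var/snap/lxd/common/lxd/logs/v1/qemu.pid; then
    echo "qemu for v1 is still running"
else
    echo "no live qemu process found via the pidfile"
fi
```

A "not found" result here does not prove the VM is gone, since the pidfile itself may have been removed; the ps output remains the fallback.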

Is this normal behavior? How can I make LXD aware that the VM is still running without having to kill the process and restart the VM?

Thanks for your help.

Léo

Nope, that’s not expected behavior. I suspect the issue is that we accidentally expire the sockets on restart.

@tomp can you look into this? My guess is that we don’t special-case the qemu socket in the log expiry logic, so on restart we may expire the old socket. We should special-case it as we do lxc.conf and qemu.conf.
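
To illustrate the idea only (the real logic lives in LXD's Go code and will look different), the special-casing could be sketched in shell as below. The function name may_expire is hypothetical, and sockets are detected by file type rather than by name because the socket's file name is not given in this thread.

```shell
#!/bin/sh
# Decide whether a file in an instance's log directory may be expired.
# Returns 0 (may expire) or 1 (keep). Illustrative only, not LXD's code.
may_expire() {
    f=$1
    # Keep sockets: removing one cuts the daemon off from a running qemu.
    if [ -S "$f" ]; then
        return 1
    fi
    case "$(basename "$f")" in
        lxc.conf|qemu.conf) return 1 ;;  # the files named as already special-cased
    esac
    return 0
}

# Dry run over the log directory seen in the command lines above.
logdir=/var/snap/lxd/common/lxd/logs/v1
for f in "$logdir"/*; do
    [ -e "$f" ] || continue
    if may_expire "$f"; then
        echo "would expire: $f"
    fi
done
```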

Yep, no problem. I will look into it after the GPT work for VMs.

I’ve put up a PR for this: