Hello guys, and once again congratulations on your wonderful work!
I’m facing an issue with my snap LXD installation. After what I think was the latest LXD update (revision 18077), I noticed that “lxc list” just hung without showing the list of containers.
In the LXD log, the last message was
lvl=info msg="Loading daemon configuration"
I tried reverting LXD to a previous release, but things got stuck, so I forcibly rebooted the server.
After the reboot, lxc list successfully listed the containers (I also have 2 VMs), and I noticed that one of them has an “ERROR” status.
I didn’t need it anyway, so I tried deleting it. That’s when the fun started:
root@gw:depozit/virtualwin # lxc delete vm
Error: The instance is currently running, stop it first or pass --force
root@gw:depozit/virtualwin # lxc delete vm --force
Error: Instance is running
root@gw:depozit/virtualwin # lxc delete vm --force-local --force
Error: Instance is running
root@gw:depozit/virtualwin # lxc stop vm
Error: dial unix /var/snap/lxd/common/lxd/logs/vm/qemu.monitor: connect: connection refused
root@gw:depozit/virtualwin # lxc stop --force vm
root@gw:depozit/virtualwin # lxc delete vm
Error: The instance is currently running, stop it first or pass --force
There’s a good chance that deleting qemu.monitor in that dir will solve the issue, but I’m a bit confused as to why LXD doesn’t treat the dead socket as the VM being stopped.
After also moving qemu.pid out of the folder (the PID inside didn’t correspond to anything running at this time), lxd deleted the VM.
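For anyone who wants to confirm that a PID file is stale before moving it, here is a minimal sketch of the check I did by hand. The function name pid_file_state is made up for illustration and is not part of LXD or the lxc client. It only reports whether the PID recorded in a file belongs to a live process, and it assumes a Linux host.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXD): report whether the PID recorded
# in a pid file belongs to a live process.
# Prints one of: missing, invalid, running, stale.
pid_file_state() {
    pid_file=$1
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 0
    fi
    # Take the first line and strip whitespace.
    pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')
    # Accept only a plain positive integer without leading zeros.
    case $pid in
        ''|*[!0-9]*|0*) echo "invalid"; return 0 ;;
    esac
    # /proc/<pid> exists only while the process does (Linux-specific);
    # kill -0 is a fallback that sends no signal, it only tests for existence.
    if [ -d "/proc/$pid" ] || kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

In my case the call would have been pid_file_state /var/snap/lxd/common/lxd/logs/vm/qemu.pid (path inferred from the qemu.monitor error above), and the situation I hit corresponds to the "stale" result. Note that "running" only means some process currently has that PID; after a reboot the number may have been reused by an unrelated process, so it is worth checking what the process actually is before concluding that the VM is alive.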
Please let me know if I can provide any other details, in case that helps anyone else in my situation or helps you debug this.
We’ve had instances where the socket is dead but the process lives on, hence the ERROR state rather than STOPPED. In this case, however, it looks like we need to give lxc stop the ability to detect a dead socket and a dead process and clean up the state files.
Well, it seems this was the other way around: the socket and PID file existed, but the process with the corresponding PID wasn’t actually running.