Container sometimes ends up in the STOPPED state after a reboot inside the container

Rarely, rebooting a container by running reboot from inside it simply powers it off, and it has to be started again with lxc start.

Subsequent reboots work fine, so this is very hard to reproduce.

For example, I just rebooted a 22.04 Ubuntu container (8 weeks of uptime) on a 20.04 host with LXD 5.13 on btrfs, and it went straight to STOPPED.

This has been happening for years, I believe all the way back to LXD 3.x; it's just so hard for me to reproduce that I haven't gotten around to reporting it until now.

How could I possibly diagnose it or get any meaningful information about this issue?

That can happen if there's a storage or device stop issue while the container shuts down, though that should normally result in an error in lxd.log.

If it’s somewhat reproducible with a particular container, you could run lxc monitor --pretty in a separate terminal while running reboot to catch any useful debug output.
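Roughly something like this, in two separate terminals (c1 here is just a placeholder for the affected container):

# terminal 1: stream debug events from the LXD daemon
lxc monitor --pretty

# terminal 2: trigger the reboot from inside the container
lxc exec c1 -- reboot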

Just had it happen again, but there is nothing about it in lxd.log; the last error/warn entries are from 3 days ago. I checked both snap.lxd.daemon.service and /var/snap/lxd/common/lxd/logs/lxd.log.
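For reference, this is roughly how I checked (standard snap paths; the exact filter may vary):

# daemon messages from the snap service
journalctl -u snap.lxd.daemon.service --since "3 days ago"

# error/warn entries in the LXD log file
grep -iE "error|warn" /var/snap/lxd/common/lxd/logs/lxd.log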

The container ends up in the unwanted STOPPED state after issuing reboot inside the container itself.

Subsequent reboots work normally, as before.

This is a 22.04 host and client on kernel 5.15.0-58-generic, with LXD 5.13 using btrfs.

Thankfully, I have been able to reproduce this with another container on the same host with lxc monitor attached, but I'm not sure whether the output contains anything helpful.

All container names except for the container in question are censored. https://haste.rys.pw/idihafadij.sql