Rarely, rebooting a container by running reboot inside it will simply power it off, and we need to start it up again with lxc start.
Subsequent reboots work fine, so this is very hard to reproduce.
For example, I just now rebooted a 22.04 Ubuntu container (8 weeks of uptime) on a 20.04 host with LXD 5.13 on btrfs, and it went straight to STOPPED.
I believe this has been happening for years, all the way back to LXD 3.x; it's just so hard to reproduce that I haven't gotten around to reporting it until now.
How could I possibly diagnose it or get any meaningful information about this issue?
That can happen if there's a storage or device stop issue while the container shuts down, though that should normally leave an error in lxd.log.
If it’s somewhat reproducible with a particular container, you could run lxc monitor --pretty in a separate terminal while running reboot to catch any useful debug output.
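For reference, a minimal two-terminal setup for catching this (the container name "c1" below is a placeholder for the affected container):

```shell
# Terminal 1: stream LXD's lifecycle and debug events as they happen
lxc monitor --pretty

# Terminal 2: trigger the reboot from inside the container
# ("c1" is a placeholder; substitute the affected container's name)
lxc exec c1 -- reboot
```

If the container ends up STOPPED, the last few events in terminal 1 are the interesting part.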
Just had it happen again, but there is nothing about it in lxd.log; the last error/warning entries are from three days ago. I checked both snap.lxd.daemon.service and /var/snap/lxd/common/lxd/logs/lxd.log.
The container ends up in the unwanted STOPPED state after issuing reboot inside it.
Subsequent reboots are normal as usual.
This is a 22.04 host and client on kernel 5.15.0-58-generic with LXD 5.13 on btrfs.
Thankfully I have been able to reproduce this with another container on the same host with lxc monitor attached, but I'm not sure whether the output contains anything helpful.
I've been having the same issue with LXC 4.0.2 (no LXD) on ALT Linux (kernel 5.10.176). It happened from time to time in the past, but after I changed limits as described in Linux Containers - LXD - Has been moved to Canonical, it now happens almost every time.
lxc-monitor shows nothing useful:
'ts-print' changed state to [STARTING]
'ts-print' changed state to [RUNNING]
'ts-print' exited with status [0]
'ts-print' changed state to [STOPPING]
'ts-print' changed state to [STOPPED]
'ts-print' changed state to [STARTING]
'ts-print' changed state to [RUNNING]
'ts-print' changed state to [ABORTING]
'ts-print' changed state to [STOPPING]
'ts-print' changed state to [STOPPED]
lxc 20230601135455.311 TRACE commands - ../src/src/lxc/commands.c:lxc_cmd:521 - Opened new command socket connection fd 34 for command "get_devpts_fd"
lxc ansible 20230601135455.319 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3621 - newuidmap binary is missing
lxc ansible 20230601135455.319 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3627 - newgidmap binary is missing
lxc ansible 20230601135919.273 ERROR conf - ../src/src/lxc/conf.c:run_buffer:322 - Script exited with status 127
lxc ansible 20230601135919.277 ERROR start - ../src/src/lxc/start.c:lxc_end:944 - Failed to run "lxc.hook.stop" hook
lxc ansible 20230601135919.942 ERROR conf - ../src/src/lxc/conf.c:run_buffer:322 - Script exited with status 127
lxc ansible 20230601135919.942 ERROR start - ../src/src/lxc/start.c:lxc_end:985 - Failed to run lxc.hook.post-stop for container "ansible"
lxc ansible 20230601135919.942 WARN start - ../src/src/lxc/start.c:lxc_end:987 - Container will be stopped instead of rebooted
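Exit status 127 from run_buffer usually means the hook command, or its interpreter, could not be found, which would explain the failed lxc.hook.stop and lxc.hook.post-stop runs above. A rough way to check, assuming a classic system-wide LXC install where the container config lives under /var/lib/lxc (adjust the path for your distribution):

```shell
# List the hook scripts this container's config references
# (path is an assumption for a system-wide LXC setup)
grep -E '^lxc\.hook\.' /var/lib/lxc/ansible/config

# For each script listed, confirm it exists, is executable, and that
# the interpreter named in its shebang line is present, e.g.:
#   test -x /path/to/hook && head -n1 /path/to/hook
```

If a hook points at a script whose interpreter is missing, the shell reports it as status 127, matching the log above.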
I also captured the requested lxd.log, but I'd need to go through it carefully to strip out unwanted information; let me know if that's necessary, or I can send it privately.
I found something interesting: if I use "lxc-start -F" to start a container in foreground mode, reboot works fine every time. I don't know what this means, but maybe it will help.