Hi. I run LXD 4.19 on a cluster of two Debian 10 hypervisors. Most containers are fine, but one container refuses to restart and cannot be deleted.
$ lxc start foobar
Error: Failed to run: /snap/lxd/current/bin/lxd forkstart foobar /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/foobar/lxc.conf:
Try `lxc info --show-log foobar` for more info
$ lxc info --show-log foobar
Name: foobar
Status: STOPPED
Type: container
Architecture: x86_64
Location: hypervisor
Created: 2021/10/05 16:53 CEST
Last Used: 2021/11/08 11:52 CET
Snapshots:
+-------+----------------------+------------+----------+
| NAME | TAKEN AT | EXPIRES AT | STATEFUL |
+-------+----------------------+------------+----------+
| snap0 | 2021/11/08 10:48 CET | | NO |
+-------+----------------------+------------+----------+
Log:
lxc foobar 20211108105243.216 ERROR utils - utils.c:lxc_can_use_pidfd:1774 - Kernel does not support pidfds
lxc foobar 20211108105243.217 WARN conf - conf.c:lxc_map_ids:3574 - newuidmap binary is missing
lxc foobar 20211108105243.217 WARN conf - conf.c:lxc_map_ids:3580 - newgidmap binary is missing
lxc foobar 20211108105243.218 WARN conf - conf.c:lxc_map_ids:3574 - newuidmap binary is missing
lxc foobar 20211108105243.218 WARN conf - conf.c:lxc_map_ids:3580 - newgidmap binary is missing
lxc foobar 20211108105243.218 WARN cgfsng - cgroups/cgfsng.c:fchowmodat:1251 - No such file or directory - Failed to fchownat(38, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc foobar 20211108105243.278 ERROR dir - storage/dir.c:dir_mount:194 - Operation not permitted - Failed to mount "/var/snap/lxd/common/lxd/containers/foobar/rootfs" onto "/var/snap/lxd/common/lxc/"
lxc foobar 20211108105243.278 ERROR conf - conf.c:lxc_mount_rootfs:1419 - Failed to mount rootfs "/var/snap/lxd/common/lxd/containers/foobar/rootfs" onto "/var/snap/lxd/common/lxc/" with options "(null)"
lxc foobar 20211108105243.278 ERROR conf - conf.c:lxc_setup_rootfs_prepare_root:3946 - Failed to setup rootfs for
lxc foobar 20211108105243.278 ERROR conf - conf.c:lxc_setup:4312 - Failed to setup rootfs
lxc foobar 20211108105243.278 ERROR start - start.c:do_start:1274 - Failed to setup container "foobar"
lxc foobar 20211108105243.278 ERROR sync - sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 3)
lxc foobar 20211108105243.292 WARN network - network.c:lxc_delete_network_priv:3617 - Failed to rename interface with index 0 from "eth0" to its initial name "veth5ab84386"
lxc foobar 20211108105243.292 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:867 - Received container state "ABORTING" instead of "RUNNING"
lxc foobar 20211108105243.293 ERROR start - start.c:__lxc_start:2073 - Failed to spawn container "foobar"
lxc foobar 20211108105243.293 WARN start - start.c:lxc_abort:1044 - No such process - Failed to send SIGKILL to 19683
lxc foobar 20211108105248.399 WARN conf - conf.c:lxc_map_ids:3574 - newuidmap binary is missing
lxc foobar 20211108105248.399 WARN conf - conf.c:lxc_map_ids:3580 - newgidmap binary is missing
lxc 20211108105248.404 ERROR af_unix - af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20211108105248.404 ERROR commands - commands.c:lxc_cmd_rsp_recv_fds:127 - Failed to receive file descriptors
I can kind of work around the restart issue by forcing an unmount and a remount:
$ sudo zfs umount tank/containers/foobar
$ sudo zfs mount tank/containers/foobar
cannot mount '/var/snap/lxd/common/lxd/storage-pools/tank/containers/foobar': directory is not empty
$ sudo ls /var/snap/lxd/common/lxd/storage-pools/tank/containers/foobar
backup.yaml
$ sudo rm /var/snap/lxd/common/lxd/storage-pools/tank/containers/foobar/backup.yaml
$ sudo zfs mount tank/containers/foobar
$ lxc start foobar
However, deletion is still broken:
$ lxc stop foobar
$ lxc delete foobar
Error: Error deleting storage volume: Failed to run: zfs destroy -r tank/containers/foobar: umount: /var/snap/lxd/common/shmounts/storage-pools/tank/containers/foobar: no mount point specified.
cannot unmount '/var/snap/lxd/common/shmounts/storage-pools/tank/containers/foobar': umount failed
$ sudo nsenter -t $(cat /var/snap/lxd/common/lxd.pid) -m
-bash-5.0# mount | grep foobar
tank/containers/foobar on /var/snap/lxd/common/shmounts/storage-pools/tank/containers/foobar type zfs (rw,xattr,posixacl)
-bash-5.0# umount tank/containers/foobar
umount: /var/snap/lxd/common/shmounts/storage-pools/tank/containers/foobar: no mount point specified.
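In the transcript above, `umount` is given the dataset name rather than the mount point path. Below is a minimal sketch of how the path could be pulled out of the `mount` output so that `umount` can be tried against it instead. `mountpoints_of` is a hypothetical helper, not part of LXD or ZFS. It assumes mount points contain no spaces. Whether unmounting by path avoids the "no mount point specified" error is untested.

```shell
# Hypothetical helper: print the mount point(s) of a given source
# (for example a ZFS dataset) from `mount`-style output read on stdin.
mountpoints_of() {
    # Each line looks like: "<source> on <mountpoint> type <fstype> (<options>)".
    # Assumption: neither the source nor the mount point contains spaces.
    awk -v src="$1" '$1 == src && $2 == "on" { print $3 }'
}

# Possible usage inside the LXD mount namespace (after the nsenter above):
#   mount | mountpoints_of tank/containers/foobar
```

The helper only reads and prints. It does not unmount anything, so it can be run safely before deciding what to do next.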
Searching for files related to the broken container, I found the following. Would it be a bad idea to delete them manually?
$ sudo updatedb; sudo locate foobar
/var/snap/lxd/common/lxd/containers/foobar
/var/snap/lxd/common/lxd/devices/foobar
/var/snap/lxd/common/lxd/logs/foobar
/var/snap/lxd/common/lxd/logs/foobar/console.log
/var/snap/lxd/common/lxd/logs/foobar/forkstart.log
/var/snap/lxd/common/lxd/logs/foobar/lxc.conf
/var/snap/lxd/common/lxd/logs/foobar/lxc.log
/var/snap/lxd/common/lxd/logs/foobar/lxc.log.old
/var/snap/lxd/common/lxd/security/apparmor/cache/ea9ed67a.0/lxd-foobar
/var/snap/lxd/common/lxd/security/apparmor/profiles/lxd-foobar
/var/snap/lxd/common/lxd/security/seccomp/foobar
/var/snap/lxd/common/lxd/storage-pools/tank/containers/foobar
I wonder how I can resolve this situation. Simply deleting the container would be fine. I also wonder how I can prevent this from happening again.
Do you have any clues?
Thank you for your help