Need help: containers failed to start after snap refresh to 5.3

Exactly the same as you did except I mount the image to /DATA instead of /mnt/DATA

    $ lxc info --show-log lemp
Name: lemp
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2022/07/01 09:29 WIB
Last Used: 2022/07/01 09:38 WIB

Log:

lxc lemp 20220701023825.325 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc lemp 20220701023825.326 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc lemp 20220701023825.326 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc lemp 20220701023825.326 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc lemp 20220701023825.419 ERROR    conf - ../src/src/lxc/conf.c:mount_entry:2459 - Operation not permitted - Failed to mount "/var/snap/lxd/common/lxd/devices/lemp/disk.nginx.etc-nginx" on "/var/snap/lxd/common/lxc//etc/nginx"
lxc lemp 20220701023825.419 ERROR    conf - ../src/src/lxc/conf.c:lxc_setup:4375 - Failed to setup mount entries
lxc lemp 20220701023825.419 ERROR    start - ../src/src/lxc/start.c:do_start:1275 - Failed to setup container "lemp"
lxc lemp 20220701023825.419 ERROR    sync - ../src/src/lxc/sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 3)
lxc lemp 20220701023825.426 WARN     network - ../src/src/lxc/network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth0" to its initial name "vethe55eebe3"
lxc lemp 20220701023825.426 ERROR    lxccontainer - ../src/src/lxc/lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc lemp 20220701023825.426 ERROR    start - ../src/src/lxc/start.c:__lxc_start:2074 - Failed to spawn container "lemp"
lxc lemp 20220701023825.426 WARN     start - ../src/src/lxc/start.c:lxc_abort:1039 - No such process - Failed to send SIGKILL via pidfd 17 for process 5199
lxc lemp 20220701023830.483 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc lemp 20220701023830.483 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20220701023830.502 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20220701023830.502 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:127 - Failed to receive file descriptors for command "get_state"
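
The first ERROR in a log like this is the `mount_entry` line, and the errors after it appear to follow from that failed mount. As a rough sketch, assuming the log format shown above, a small filter can pull out just the failed mounts. The function name `failed_mounts` is made up for illustration.

```shell
# Hypothetical helper: read an LXC log on stdin and print only the
# ERROR lines emitted by mount_entry (the failed bind mounts).
failed_mounts() {
    grep -E 'ERROR +conf - .*mount_entry:[0-9]+ - '
}

# Usage (needs a working lxc client):
#   lxc info --show-log lemp | failed_mounts
```

This makes it easier to compare which source paths fail across containers, for example disks backed by btrfs or NFS versus ext4.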

I can also confirm that an ext4 loop device works OK.

When I rolled back LXD to 5.1, everything worked again!
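
For anyone wanting to try the same rollback, one possible way is `snap revert`, which returns a snap to the revision that was installed before the last refresh. This is only a sketch: which LXD version you land on depends on what was installed previously, and the function name `revert_lxd` is made up for illustration.

```shell
# Sketch: revert the LXD snap to the previously installed revision,
# then show which version is now active.
# Wrapped in a function so nothing runs until you call it.
revert_lxd() {
    sudo snap revert lxd && snap list lxd
}

# Usage:
#   revert_lxd
```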

What filesystem is this on?

I think we are hitting the same issue.

Ubuntu 20.04 host, Ubuntu 20.04 container, LXD 5.3, currently booted on 5.4.0-113-generic kernel, /var/snap/lxd/common/lxd on btrfs.
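
A quick way to collect the same details on another host might look like the sketch below. The function name `lxd_env` is made up for illustration, and the default path is the one quoted above. Pass a different path, such as a disk device source, to check that filesystem instead.

```shell
# Sketch: print the kernel version, the installed LXD snap, and the
# filesystem type that backs a given path.
lxd_env() {
    uname -r
    snap list lxd
    # -T resolves the mount that contains the path; -n drops the header.
    findmnt -n -o FSTYPE -T "${1:-/var/snap/lxd/common/lxd}"
}

# Usage:
#   lxd_env                  # LXD data directory
#   lxd_env /path/to/share   # a disk device source
```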

This has started happening on a container that worked fine previously.

We can work around the issue by setting the container as privileged.
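
As a sketch of that workaround, the container can be switched to privileged mode with the `security.privileged` config key. Keep in mind that root inside a privileged container is root on the host, so this is probably best treated as a stopgap. The function name `make_privileged` is made up for illustration.

```shell
# Sketch: mark a container as privileged and try to start it.
# $1 is the container name.
make_privileged() {
    lxc config set "$1" security.privileged true && lxc start "$1"
}

# Usage:
#   make_privileged ourcontainer
# To undo later:
#   lxc config unset ourcontainer security.privileged
```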

lxc ourcontainer 20220702094110.198 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc ourcontainer 20220702094110.198 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc ourcontainer 20220702094110.200 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc ourcontainer 20220702094110.200 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc ourcontainer 20220702094110.201 WARN     cgfsng - ../src/src/lxc/cgroups/cgfsng.c:fchowmodat:1252 - No such file or directory - Failed to fchownat(40, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc ourcontainer 20220702094110.343 ERROR    conf - ../src/src/lxc/conf.c:mount_entry:2459 - Operation not permitted - Failed to mount "/var/snap/lxd/common/lxd/devices/ourcontainer/disk.aadisable.sys-module-apparmor-parameters-enabled" on "/var/snap/lxd/common/lxc//sys/module/apparmor/parameters/enabled"
lxc ourcontainer 20220702094110.344 ERROR    conf - ../src/src/lxc/conf.c:lxc_setup:4375 - Failed to setup mount entries
lxc ourcontainer 20220702094110.344 ERROR    start - ../src/src/lxc/start.c:do_start:1275 - Failed to setup container "ourcontainer"
lxc ourcontainer 20220702094110.344 ERROR    sync - ../src/src/lxc/sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 3)
lxc ourcontainer 20220702094110.357 WARN     network - ../src/src/lxc/network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth0" to its initial name "veth981c4d9c"
lxc ourcontainer 20220702094110.357 ERROR    lxccontainer - ../src/src/lxc/lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc ourcontainer 20220702094110.357 ERROR    start - ../src/src/lxc/start.c:__lxc_start:2074 - Failed to spawn container "ourcontainer"
lxc ourcontainer 20220702094110.357 WARN     start - ../src/src/lxc/start.c:lxc_abort:1039 - No such process - Failed to send SIGKILL via pidfd 41 for process 2428210
lxc ourcontainer 20220702094115.677 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc ourcontainer 20220702094115.677 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20220702094115.738 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20220702094115.738 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:127 - Failed to receive file descriptors for command "get_state"

I’m hitting this error as well, but only on NFS shares (mounted on the host) added as devices to the containers. Other containers that have regular folders from ext4 drives shared to them start fine.

This is on Jammy, kernel 5.15.0-40-generic. As others have found, reverting the LXD snap to 5.2 lets the containers start with the same NFS shares without issues.

Edit: the containers' storage is btrfs.

I ran into this out of the blue today as well. One observation is that config device add does not exhibit the failure while the container is running, even for the exact same disks that cause the container to fail during startup.

This works fine:
    lxc start tester
    lxc config device add tester test1 disk source=/data/prizm/nasfs/dvr path=/mnt/dvr

But then lxc exec tester reboot… and the instance fails to come back up. Remove the disk, instance starts. Run device add again and it works, until you restart the container.
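
The sequence described here could be scripted roughly as follows. This is a sketch, not a verified reproducer: it uses `lxc restart` in place of rebooting from inside the container, and the function name `repro_disk_restart` is made up for illustration. The device name `test1` is taken from the commands above.

```shell
# Sketch: hot-plug a disk into a running container, restart it, and
# recover by removing the disk if the restart fails.
# $1 = container name, $2 = host source path, $3 = path in the container.
repro_disk_restart() {
    lxc start "$1" || return 1
    # Adding the disk while the container is running reportedly works.
    lxc config device add "$1" test1 disk source="$2" path="$3" || return 1
    # The restart is where the mount reportedly fails on LXD 5.3.
    if lxc restart "$1"; then
        echo "restart ok"
    else
        echo "restart failed; removing test1 so the container can start"
        lxc config device remove "$1" test1
        lxc start "$1"
    fi
}

# Usage:
#   repro_disk_restart tester /data/prizm/nasfs/dvr /mnt/dvr
```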

Thanks, that suggests it is perhaps a regression in liblxc itself rather than LXD, or some difference in how the disk devices are passed at start time versus when the container is running. This is a useful data point and I’ll try to recreate it. Thanks.

Looks like the fix for liblxc has been found by @stgraber in the thread "Cannot start lxc containers with gui profile" (post #11).

I can confirm that it fixed the problems for me :)

Fixed here as well. NFS mounts are shared properly with containers after switching to the latest/candidate channel and updating to 5.3-924be6a.
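
For reference, switching the snap to the candidate channel might look like the sketch below. The function name `use_candidate_lxd` is made up for illustration, and the revision you get depends on what the channel holds when you run it.

```shell
# Sketch: move the LXD snap to the latest/candidate channel and show
# the version that ends up installed.
use_candidate_lxd() {
    sudo snap refresh lxd --channel=latest/candidate && snap list lxd
}

# Usage:
#   use_candidate_lxd
# To go back to the stable channel once the fix lands there:
#   sudo snap refresh lxd --channel=latest/stable
```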

Fixed my problem too after updating to latest.
