Container fails to start after creation from image

LXD 4.23 on an Ubuntu focal host.
Tried alpine and hirsute from the images: remote.

Error: Failed to run: /snap/lxd/current/bin/lxd forkstart alp1 /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/alp1/lxc.conf:

lxd log:
t=2022-03-03T16:29:54+0000 lvl=info msg="Starting container" action=start created=2022-03-03T16:24:11+0000 ephemeral=false instance=alp1 instanceType=container project=default stateful=false used=2022-03-03T16:29:40+0000
t=2022-03-03T16:30:00+0000 lvl=eror msg="Failed starting container" action=start created=2022-03-03T16:24:11+0000 ephemeral=false instance=alp1 instanceType=container project=default stateful=false used=2022-03-03T16:29:40+0000

container log:
lxc alp1 20220303163333.702 WARN conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc alp1 20220303163333.702 WARN conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc alp1 20220303163333.703 WARN conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc alp1 20220303163333.703 WARN conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc alp1 20220303163333.704 WARN cgfsng - cgroups/cgfsng.c:fchowmodat:1252 - No such file or directory - Failed to fchownat(40, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc alp1 20220303163333.995 ERROR start - start.c:start:2164 - No such file or directory - Failed to exec "/sbin/init"
lxc alp1 20220303163333.995 ERROR sync - sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 7)
lxc alp1 20220303163334.498 WARN network - network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth0" to its initial name "vetha9ff1ebd"
lxc alp1 20220303163334.144 WARN network - network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth1" to its initial name "veth32e4806b"
lxc alp1 20220303163334.146 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc alp1 20220303163334.146 ERROR start - start.c:__lxc_start:2074 - Failed to spawn container "alp1"
lxc alp1 20220303163334.146 WARN start - start.c:lxc_abort:1039 - No such process - Failed to send SIGKILL via pidfd 41 for process 92944
lxc alp1 20220303163339.200 WARN conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc alp1 20220303163339.200 WARN conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20220303163339.234 ERROR af_unix - af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20220303163339.234 ERROR commands - commands.c:lxc_cmd_rsp_recv_fds:127 - Failed to receive file descriptors for command "get_state"

Unmanaged bridge; the profile:
config:
  limits.cpu: "8"
  limits.cpu.allowance: 80%
  limits.cpu.priority: "8"
  limits.memory: 8GB
  limits.memory.enforce: soft
  limits.memory.swap: "false"
  limits.memory.swap.priority: "8"
  limits.network.priority: "8"
description: admin
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  eth1:
    name: eth1
    nictype: bridged
    parent: br3
    type: nic
  root:
    path: /
    pool: pl
    size: 300MB
    type: disk
  st1:
    path: /st1
    source: /zp3/st1
    type: disk
name: admin
limits.network.priority: "8" had worked before but now caused the error below, so I removed it; that hasn't solved the start failure though.
t=2022-03-03T16:22:53+0000 lvl=eror msg="Failed to apply network priority" err="Can't set network priority on stopped container" instance=ub1 instanceType=container project=101

Appreciate any feedback on this. It possibly started after switching the ZFS mounts to legacy, or perhaps there is another reason, but in an otherwise unchanged environment and config I am no longer able to start a container after creating it.

This looks like the main issue:

lxc alp1 20220303163333.995 ERROR start - start.c:start:2164 - No such file or directory - Failed to exec "/sbin/init"

There doesn’t seem to be an init inside the container.
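One quick way to confirm that from the host is to look inside the unpacked rootfs. This is only a sketch: the snap path, the pool name pl, and the container name alp1 are taken from this thread; adjust them for your setup.

```shell
# Sketch: report whether a rootfs directory contains an init binary.
# The -L check also catches a symlinked init (e.g. /sbin/init -> systemd)
# whose absolute target would not resolve when viewed from the host.
check_init() {
  rootfs="$1"
  if [ -e "$rootfs/sbin/init" ] || [ -L "$rootfs/sbin/init" ]; then
    echo "init present"
  else
    echo "init missing"
  fi
}

# For a stopped container on the snap's pool, something like:
# check_init /var/snap/lxd/common/lxd/storage-pools/pl/containers/alp1/rootfs
```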

Can you try the same config with a freshly created container, please?

lxc launch images:ubuntu/focal focal1
Creating focal1
Starting focal1
Error: Failed to run: /snap/lxd/current/bin/lxd forkstart 101_focal1 /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/101_focal1/lxc.conf:
Try lxc info --show-log local:focal1 for more info

Name: focal1
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2022/03/04 12:12 CET
Last Used: 2022/03/04 12:12 CET

Log:

lxc 101_focal1 20220304111211.366 WARN conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc 101_focal1 20220304111211.367 WARN conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 101_focal1 20220304111211.368 WARN conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc 101_focal1 20220304111211.368 WARN conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 101_focal1 20220304111211.370 WARN cgfsng - cgroups/cgfsng.c:fchowmodat:1252 - No such file or directory - Failed to fchownat(40, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc 101_focal1 20220304111211.601 ERROR start - start.c:start:2164 - No such file or directory - Failed to exec "/sbin/init"
lxc 101_focal1 20220304111211.601 ERROR sync - sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 7)
lxc 101_focal1 20220304111211.614 WARN network - network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth1" to its initial name "veth64805348"
lxc 101_focal1 20220304111211.614 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc 101_focal1 20220304111211.614 ERROR start - start.c:__lxc_start:2074 - Failed to spawn container "101_focal1"
lxc 101_focal1 20220304111211.614 WARN start - start.c:lxc_abort:1039 - No such process - Failed to send SIGKILL via pidfd 41 for process 341634
lxc 101_focal1 20220304111216.727 WARN conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc 101_focal1 20220304111216.728 WARN conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20220304111216.783 ERROR af_unix - af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20220304111216.783 ERROR commands - commands.c:lxc_cmd_rsp_recv_fds:127 - Failed to receive file descriptors for command "get_state"

Previously created containers are all running and can be started/restarted.
But I can't start any newly created ones.

lxc storage info pl
info:
  description: ""
  driver: zfs
  name: pl
  space used: 781.85MiB
  total space: 96.00GiB
used by:
  images:
  - 114451d96705190afb14c6f8fc816382c58b9f9d0cc7ca98668b09a1dc28ad08?project=101
  - 91ca1df35f991db6cc0b298db3e39938a1327bb1fabccb7450ac13928b301462
  - 91ca1df35f991db6cc0b298db3e39938a1327bb1fabccb7450ac13928b301462?project=101
  instances:
  - alpx1?project=101
  - alpx2?project=101
  - alpx3?project=101
  - focal1?project=101
  - focal2?project=101
  profiles:
  - admin
  - default?project=101
  - pl4
  - pl4?project=101
  storage-pools:
  - pl

Can you show snap info lxd please, to see which revision you're on? It feels like there may be something wrong with the image (either how it's unpacked or how it's cached on your system).

name: lxd
summary: LXD - container and VM manager
publisher: Canonical✓
store-url: Install lxd on Linux | Snap Store
contact: https://github.com/lxc/lxd/issues
license: unset
description: |
LXD is a system container and virtual machine manager.

It offers a simple CLI and REST API to manage local or remote instances,
uses an image based workflow and support for a variety of advanced features.

Images are available for all Ubuntu releases and architectures as well
as for a wide number of other Linux distributions. Existing
integrations with many deployment and operation tools, makes it work
just like a public cloud, except everything is under your control.

LXD containers are lightweight, secure by default and a great
alternative to virtual machines when running Linux on Linux.

LXD virtual machines are modern and secure, using UEFI and secure-boot
by default and a great choice when a different kernel or operating
system is needed.

With clustering, up to 50 LXD servers can be easily joined and managed
together with the same tools and APIs and without needing any external
dependencies.

Supported configuration options for the snap (snap set lxd [=…]):

- ceph.builtin: Use snap-specific Ceph configuration [default=false]
- ceph.external: Use the system's ceph tools (ignores ceph.builtin) [default=false]
- criu.enable: Enable experimental live-migration support [default=false]
- daemon.debug: Increase logging to debug level [default=false]
- daemon.group: Set group of users that can interact with LXD [default=lxd]
- daemon.preseed: Pass a YAML configuration to `lxd init` on initial start
- daemon.syslog: Send LXD log events to syslog [default=false]
- lvm.external: Use the system's LVM tools [default=false]
- lxcfs.pidfd: Start per-container process tracking [default=false]
- lxcfs.loadavg: Start tracking per-container load average [default=false]
- lxcfs.cfs: Consider CPU shares for CPU usage [default=false]
- openvswitch.builtin: Run a snap-specific OVS daemon [default=false]
- shiftfs.enable: Enable shiftfs support [default=auto]

For system-wide configuration of the CLI, place your configuration in
/var/snap/lxd/common/global-conf/ (config.yml and servercerts)
commands:
  - lxd.benchmark
  - lxd.buginfo
  - lxd.check-kernel
  - lxd.lxc
  - lxd.lxc-to-lxd
  - lxd
  - lxd.migrate
services:
  lxd.activate:    oneshot, enabled, inactive
  lxd.daemon:      simple, enabled, active
  lxd.user-daemon: simple, enabled, inactive
snap-id:      J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking:     latest/edge
refresh-date: yesterday at 09:38 CET
channels:
  latest/stable:    4.23        2022-02-24 (22525) 81MB -
  latest/candidate: 4.23        2022-03-03 (22609) 82MB -
  latest/beta:      ↑
  latest/edge:      git-209b19e 2022-03-03 (22604) 82MB -
  4.23/stable:      4.23        2022-02-24 (22525) 81MB -
  4.23/candidate:   4.23        2022-03-03 (22609) 82MB -
  4.23/beta:        ↑
  4.23/edge:        ↑
  4.22/stable:      4.22        2022-02-12 (22407) 79MB -
  4.22/candidate:   4.22        2022-02-11 (22407) 79MB -
  4.22/beta:        ↑
  4.22/edge:        ↑
  4.0/stable:       4.0.9       2022-02-25 (22526) 71MB -
  4.0/candidate:    4.0.9       2022-02-24 (22541) 71MB -
  4.0/beta:         ↑
  4.0/edge:         git-d0940f2 2022-02-24 (22535) 71MB -
  3.0/stable:       3.0.4       2019-10-10 (11348) 55MB -
  3.0/candidate:    3.0.4       2019-10-10 (11348) 55MB -
  3.0/beta:         ↑
  3.0/edge:         git-81b81b9 2019-10-10 (11362) 55MB -
  2.0/stable:       2.0.12      2020-08-18 (16879) 38MB -
  2.0/candidate:    2.0.12      2021-03-22 (19859) 39MB -
  2.0/beta:         ↑
  2.0/edge:         git-82c7d62 2021-03-22 (19857) 39MB -
installed:          git-209b19e            (22604) 82MB -

Could this be caused by storage.images_volume?
lxc config show
config:
  backups.compression_algorithm: zstd
  core.https_address: :7443
  core.trust_password: true
  images.auto_update_interval: "0"
  images.compression_algorithm: zstd
  images.default_architecture: x86_64
  images.remote_cache_expiry: "0"
  storage.images_volume: pl/images
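Since images are unpacked through that custom volume, one thing worth checking after the switch to legacy ZFS mountpoints is whether the dataset backing it is actually mounted. This is only a sketch: the pool dataset zp2/pl comes from the storage list below, but the exact dataset name (custom/default_images) is a hypothetical guess at how the ZFS driver names custom volumes.

```shell
# Sketch: decide "mounted"/"not mounted" from the value printed by
# `zfs get -H -o value mounted <dataset>` ("yes" or "no"), read on stdin.
is_mounted() {
  read -r val
  if [ "$val" = "yes" ]; then echo "mounted"; else echo "not mounted"; fi
}

# Hypothetical dataset name; list the real candidates with `zfs list -r zp2/pl`:
# zfs get -H -o value mounted zp2/pl/custom/default_images | is_mounted
```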

lxc storage list
+------+--------+--------+-------------+---------+---------+
| NAME | DRIVER | SOURCE | DESCRIPTION | USED BY |  STATE  |
+------+--------+--------+-------------+---------+---------+
| pl   | zfs    | zp2/pl |             | 17      | CREATED |
+------+--------+--------+-------------+---------+---------+

lxc storage volume list pl
+-----------+------------------------------------------------------------------+-------------+--------------+---------+
|   TYPE    |                               NAME                               | DESCRIPTION | CONTENT-TYPE | USED BY |
+-----------+------------------------------------------------------------------+-------------+--------------+---------+
| container | FC-HA                                                            |             | filesystem   | 1       |
+-----------+------------------------------------------------------------------+-------------+--------------+---------+
| container | FC-Sql                                                           |             | filesystem   | 1       |
+-----------+------------------------------------------------------------------+-------------+--------------+---------+
| custom    | images                                                           |             | filesystem   | 1       |
+-----------+------------------------------------------------------------------+-------------+--------------+---------+
| image     | 91ca1df35f991db6cc0b298db3e39938a1327bb1fabccb7450ac13928b301462 |             | filesystem   | 1       |
+-----------+------------------------------------------------------------------+-------------+--------------+---------+

This works (ubuntu: repository):
lxc launch ubuntu:f containerx
This breaks (images: repository):
lxc launch images:ubuntu/focal/amd64 containerx

Either the images in images: are faulty, or LXD can't spawn anything from images:.

Oh wait, you're on tracking: latest/edge, which contradicts what you said originally (LXD 4.23), because latest/edge is unstable and further along than the 4.23 release.

Is there a reason you're on edge? You're likely being affected by https://github.com/lxc/lxd/pull/9975 (so a snap refresh lxd might help, but you probably also need to run lxc image delete on the affected cached images).
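For the cleanup step, something like this can enumerate the cached fingerprints. A sketch only: it assumes the default CSV column order of lxc image ls, where the fingerprint is the second field; verify against your output before piping anything into sh.

```shell
# Sketch: turn `lxc image ls --format csv` output into delete commands.
# Assumes the fingerprint is field 2 of the CSV (alias,fingerprint,public,...).
to_delete_cmds() {
  awk -F, 'NF >= 2 {print "lxc image delete " $2}'
}

# Dry run first, then pipe into `sh` once the commands look right:
# lxc image ls --format csv | to_delete_cmds
```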

If you’re on latest/edge you can expect further breakages in the future as the snap is automatically built from the main LXD git branch every few hours.

Right.
I have set the snap channel to latest/stable and restarted the lxd service. Still the same.
Name           Version   Rev    Tracking       Publisher   Notes
core18         20211215  2284   latest/stable  canonical✓  base
core20         20220215  1361   latest/stable  canonical✓  base
distrobuilder  2.0       1125   latest/stable  stgraber    classic
lxd            4.23      22525  latest/stable  canonical✓  -
snapd          2.54.3    14978  latest/stable  canonical✓  snapd

Yes I think the damage has been done. You will need to use lxc image ls and lxc image delete to remove the images you’ve downloaded already, and delete the containers that used the broken unpacked images.

:slight_smile:

I had deleted all images as you mentioned.
Now I tried some other fresh images and it is working. Everything is fine.

Thanks very much for your help Thomas.

Excellent :slight_smile:

Running into the same issue again.
The steps mentioned above are not helping.

snap list:
lxd 4.23 22652 latest/stable canonical✓ -
All images deleted, all broken instances deleted; still can't start a container launched from a new image. Tried the images: and ubuntu: repositories so far.

lxc info --show-log local:h21
Name: h21
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2022/03/13 13:22 CET
Last Used: 2022/03/13 13:22 CET

lxc h21 20220313122230.636 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc h21 20220313122230.636 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc h21 20220313122230.637 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc h21 20220313122230.637 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc h21 20220313122230.638 WARN     cgfsng - cgroups/cgfsng.c:fchowmodat:1252 - No such file or directory - Failed to fchownat(40, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc h21 20220313122230.751 ERROR    start - start.c:start:2164 - No such file or directory - Failed to exec "/sbin/init"
lxc h21 20220313122230.752 ERROR    sync - sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 7)
lxc h21 20220313122230.752 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc h21 20220313122230.753 ERROR    start - start.c:__lxc_start:2074 - Failed to spawn container "h21"
lxc h21 20220313122230.753 WARN     start - start.c:lxc_abort:1039 - No such process - Failed to send SIGKILL via pidfd 41 for process 20405
lxc h21 20220313122235.764 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc h21 20220313122235.764 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20220313122235.816 ERROR    af_unix - af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20220313122235.816 ERROR    commands - commands.c:lxc_cmd_rsp_recv_fds:127 - Failed to receive file descriptors for command "get_state"

lxd.log:

t=2022-03-13T13:09:59+0100 lvl=info msg="Creating container" ephemeral=false instance=alp1 instanceType=container project=default
t=2022-03-13T13:09:59+0100 lvl=info msg="Created container" ephemeral=false instance=alp1 instanceType=container project=default
t=2022-03-13T13:09:59+0100 lvl=info msg="Image unpack started" imageFile=/var/snap/lxd/common/lxd/images/69ab23b357ef5f020de8088116023f14ec56552ff534c5aae77bc1a4858f7245 vol=69ab23b357ef5f020de8088116023f14ec56552ff534c5aae77bc1a4858f7245
t=2022-03-13T13:10:00+0100 lvl=info msg="Image unpack stopped" imageFile=/var/snap/lxd/common/lxd/images/69ab23b357ef5f020de8088116023f14ec56552ff534c5aae77bc1a4858f7245 vol=69ab23b357ef5f020de8088116023f14ec56552ff534c5aae77bc1a4858f7245

same problem

lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.2 LTS
Release:        20.04
Codename:       focal

snap list

Name    Version   Rev    Tracking       Publisher   Notes
core18  20211215  2284   latest/stable  canonical✓  base
core20  20220304  1376   latest/stable  canonical✓  base
lxd     4.23      22652  latest/stable  canonical✓  -
snapd   2.54.3    14978  latest/stable  canonical✓  snapd

Please can you refresh onto LXD 4.24 with sudo snap refresh lxd --channel=latest/candidate.

Then please show the output of lxc storage volume ls <pool> and lxc image ls.

If this fixes it then wait until LXD 4.24 is moved to stable and then you can refresh back to latest/stable channel.
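To see when it is safe to move back, the tracked channel can be read from the snap info output. A small sketch; it assumes the "tracking:" field name that snap info currently prints.

```shell
# Sketch: extract the channel from the "tracking:" line of `snap info lxd`.
tracking_channel() {
  awk '/^ *tracking:/ {print $2}'
}

# snap info lxd | tracking_channel          # e.g. latest/candidate
# snap refresh lxd --channel=latest/stable  # once 4.24 reaches stable
```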

Please can you show output of lxc config show?

sudo snap refresh lxd —channel=latest/candidate

error: cannot refresh "lxd", "---channel=latest/candidate": snap "---channel=latest/candidate" is
       not installed

lxc config show

config:
  storage.backups_volume: s860/backup
  storage.images_volume: s860/images