Some containers stopped working (maybe after migration to snap), Invalid argument - Failed to mount

TCMC · February 20, 2020, 9:08pm

Hey guys! LXD is great! But I kinda prefer the dpkg packages, honestly. Too bad that they won’t be maintained anymore. I say this because it was working a couple of weeks ago and “out of the blue” it broke (I never saw this with the rock-solid dpkg packages).

So, after migrating to the snap LXD package, my containers worked! However, today, some are failing. Here is the log:

ubuntu@server:~$ lxc info --show-log coscmpt-1
Name: coscmpt-1
Location: none
Remote: unix://
Architecture: x86_64
Created: 2019/12/24 04:19 UTC
Status: Stopped
Type: container
Profiles: oscmpts

Log:

lxc coscmpt-1 20200219234710.330 WARN     cgfsng - cgroups/cgfsng.c:chowmod:1525 - No such file or directory - Failed to chown(/sys/fs/cgroup/unified//lxc.payload/coscmpt-1/memory.oom.group, 1000000000, 0)
lxc coscmpt-1 20200219234710.404 ERROR    utils - utils.c:safe_mount:1212 - Invalid argument - Failed to mount "/proc/sys/fs" onto "/var/snap/lxd/common/lxc//proc/sys/fs"
lxc coscmpt-1 20200219234710.404 ERROR    conf - conf.c:mount_entry:2019 - Invalid argument - Failed to mount "/proc/sys/fs" on "/var/snap/lxd/common/lxc//proc/sys/fs"
lxc coscmpt-1 20200219234710.404 ERROR    conf - conf.c:lxc_setup:3608 - Failed to setup mount entries
lxc coscmpt-1 20200219234710.404 ERROR    start - start.c:do_start:1321 - Failed to setup container "coscmpt-1"
lxc coscmpt-1 20200219234710.405 ERROR    sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 5)
lxc coscmpt-1 20200219234710.405 WARN     network - network.c:lxc_delete_network_priv:3377 - Failed to rename interface with index 41 from "ens3" to its initial name "veth85b72178"
lxc coscmpt-1 20200219234710.405 ERROR    start - start.c:lxc_abort:1122 - Function not implemented - Failed to send SIGKILL to 5831
lxc coscmpt-1 20200219234710.405 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:873 - Received container state "ABORTING" instead of "RUNNING"
lxc coscmpt-1 20200219234710.405 ERROR    start - start.c:__lxc_start:2039 - Failed to spawn container "coscmpt-1"
lxc 20200219234710.563 WARN     commands - commands.c:lxc_cmd_rsp_recv:135 - Connection reset by peer - Failed to receive response for command "get_state"

Here is its LXD profile (which was working about a week ago):

ubuntu@server:~$ lxc profile show oscmpts
config:
  raw.lxc: lxc.mount.entry=/proc/sys/fs proc/sys/fs proc bind,rw 0 0
description: OpenStack Compute Nodes (QEMU Hypervisors) - Unrestricted
devices:
  ens3:
    name: ens3
    nictype: bridged
    parent: br-bond0
    type: nic
  root:
    path: /
    pool: default
    type: disk
name: oscmpts
used_by:
- /1.0/instances/coscmpt-1
- /1.0/instances/coscmpt-2

Any idea about how to fix it?

stgraber · February 20, 2020, 9:48pm

What are you trying to do with that raw.lxc entry?

It looks like you’re hitting a mount ordering issue where /proc/sys isn’t yet mounted when you’re attempting to then mount that entry on top.

If you absolutely must do this (seems like a very bad idea), it would probably be best done through a disk device rather than using raw.lxc.

TCMC · February 20, 2020, 9:57pm

Without it, openstack-ansible fails to deploy my container as a QEMU-based (KVM) Compute Node.

I have Ceph OSD LXD containers with exactly the same entry and they’re working! And this was working until about a week ago on this very same host. Today, this container doesn’t start up.

I would be happy to try the disk approach! Could you share the exact syntax?

Thanks!

stgraber · February 20, 2020, 11:14pm

sys-fs:
  path: /proc/sys/fs
  source: /proc/sys/fs
  type: disk

I’m pretty confused as to how the container works with this in place though. It would wreak havoc with cgroups, for one thing…
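
For anyone following along, the disk device above could probably be applied from the command line along these lines. This is only a sketch, not a tested fix: the profile name oscmpts and the container name coscmpt-1 come from the earlier output, the device name sys-fs comes from the snippet above, and dropping the raw.lxc key first is an assumption based on the suggestion to use a disk device "rather than" raw.lxc.

```shell
# Sketch only: names are taken from the thread, adjust them to your setup.

# Assumption: remove the raw.lxc mount entry so that only the disk
# device is responsible for /proc/sys/fs.
lxc profile unset oscmpts raw.lxc

# Add a disk device named "sys-fs" that maps the host's /proc/sys/fs
# to the same path inside every container using the profile.
lxc profile device add oscmpts sys-fs disk source=/proc/sys/fs path=/proc/sys/fs

# Check the resulting profile, then retry the failing container and
# look at its log again if it still does not start.
lxc profile show oscmpts
lxc start coscmpt-1
lxc info --show-log coscmpt-1
```

Since the change is made on the profile, it applies to every container listed under used_by (coscmpt-1 and coscmpt-2 here), so it may be safer to try it on a copy of the profile first.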