Hey,
I am evaluating LXD for a large-scale container deployment and have just set up my first node with the upstream snap version of LXD on bare-metal Ubuntu 18.04 LTS. Everything works fine, but once I scale past exactly 597 containers, any further containers fail to start.
I researched this error, found several suggested solutions, and have already tested with the optimizations from https://lxd.readthedocs.io/en/latest/production-setup/. I also raised the values further, possibly beyond what is sensible, just to be sure.
For the test I simply spawned 1000 containers from the Ubuntu 18.04 image, without a network interface, on local ZFS storage. The machine is big enough: AMD EPYC with 64 threads and 256 GB RAM.
Interestingly, upstream Proxmox VE 6 with its LXC implementation (version 3.21) has the same problem; we get the same error there. It uses the same 5.3 kernel that I run on Ubuntu 18.04 LTS, just on Debian Buster.
I can also provide SSH access if somebody needs a deeper look at the system.
I wonder which limit I am hitting here. All the relevant output is below:
sysctl.conf:
vm.max_map_count = 262144
fs.inotify.max_queued_events = 167772160
fs.inotify.max_user_instances = 167772160
fs.inotify.max_user_watches = 167772160
kernel.keys.maxkeys = 80000
kernel.dmesg_restrict = 1
kernel.pid_max = 4194304
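To rule out that some of these values were never applied to the running kernel, a minimal sketch like the following could compare the file against the live values under /proc/sys. The function names (`parse_sysctl`, `proc_path`, `compare_with_live`) are made up for illustration, and the key-to-path mapping assumes plain dotted keys like the ones above.

```python
import os

def parse_sysctl(text):
    """Parse sysctl.conf-style 'key = value' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments ('#' or ';') and anything without '='.
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings

def proc_path(key):
    """Map a dotted sysctl key to its /proc/sys path (simple keys only)."""
    return "/proc/sys/" + key.replace(".", "/")

def compare_with_live(settings):
    """Return {key: (wanted, live)} for keys whose live value differs.

    live is None when the kernel does not expose the key at all.
    """
    mismatches = {}
    for key, wanted in settings.items():
        path = proc_path(key)
        if not os.path.exists(path):
            mismatches[key] = (wanted, None)
            continue
        with open(path) as f:
            live = " ".join(f.read().split())  # normalise whitespace
        if live != wanted:
            mismatches[key] = (wanted, live)
    return mismatches

def report(path="/etc/sysctl.conf"):
    """Print every setting in the file that differs from the live kernel."""
    with open(path) as f:
        settings = parse_sysctl(f.read())
    for key, (wanted, live) in compare_with_live(settings).items():
        print(f"{key}: file={wanted} live={live}")
```

Calling `report()` as root on the host prints nothing if every value from the file is in effect.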
/etc/security/limits.conf:
* soft nofile 167772160
* hard nofile 167772160
root soft nofile 167772160
root hard nofile 167772160
* soft memlock unlimited
* hard memlock unlimited
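Since limits.conf is applied by pam_limits to login sessions, the LXD daemon started by the snap may not be running with these values. A rough sketch for checking what a given process actually has, by parsing /proc/<pid>/limits, could look like this. `parse_limits` and `limits_of` are illustrative names, and the PID of the LXD daemon has to be looked up separately.

```python
import re

def parse_limits(text):
    """Parse the table from /proc/<pid>/limits.

    Returns {limit name: (soft, hard)} with the values kept as strings,
    because entries can be 'unlimited'.
    """
    limits = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        # Columns are separated by runs of two or more spaces; the limit
        # name itself contains only single spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits

def limits_of(pid):
    """Read and parse the limits of a running process (may need root)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_limits(f.read())
```

For example, `limits_of(pid)["Max open files"]` returns the soft and hard nofile limits the daemon is really running with.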
root@lxd-test ~ # lxc start wondrous-spider
Error: Failed to run: /snap/lxd/current/bin/lxd forkstart wondrous-spider /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/wondrous-spider/lxc.conf:
Try `lxc info --show-log wondrous-spider` for more info
root@lxd-test ~ # lxc info --show-log wondrous-spider
Name: wondrous-spider
Location: none
Remote: unix://
Architecture: x86_64
Created: 2020/03/08 15:48 UTC
Status: Stopped
Type: container
Profiles: default
lxc info --show-log wondrous-spider
lxc wondrous-spider 20200308173416.263 WARN cgfsng - cgroups/cgfsng.c:chowmod:1525 - No such file or directory - Failed to chown(/sys/fs/cgroup/unified//lxc.payload/wondrous-spider/memory.oom.group, 1000000000, 0)
lxc wondrous-spider 20200308173416.284 ERROR utils - utils.c:lxc_setup_keyring:1856 - Disk quota exceeded - Failed to create kernel keyring
lxc wondrous-spider 20200308173416.353 ERROR seccomp - seccomp.c:lxc_seccomp_load:1252 - Unknown error 524 - Error loading the seccomp policy
lxc wondrous-spider 20200308173416.353 ERROR sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 5)
lxc wondrous-spider 20200308173416.353 ERROR start - start.c:lxc_abort:1122 - Function not implemented - Failed to send SIGKILL to 438070
lxc wondrous-spider 20200308173416.353 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:873 - Received container state "ABORTING" instead of "RUNNING"
lxc wondrous-spider 20200308173416.356 ERROR start - start.c:__lxc_start:2039 - Failed to spawn container "wondrous-spider"
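The "Disk quota exceeded - Failed to create kernel keyring" line suggests a per-user key quota. According to keyrings(7), each user is limited both by kernel.keys.maxkeys and by kernel.keys.maxbytes, and only the former is raised in the sysctl.conf above. A minimal sketch for checking how close each UID is to either quota, assuming the /proc/key-users layout described in that man page (`parse_key_users` and `near_quota` are illustrative names):

```python
def parse_key_users(text):
    """Parse /proc/key-users lines of the form
    '<uid>: <usage> <nkeys>/<nikeys> <qnkeys>/<maxkeys> <qnbytes>/<maxbytes>'.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # ignore anything that is not a data row
        qnkeys, maxkeys = (int(x) for x in parts[3].split("/"))
        qnbytes, maxbytes = (int(x) for x in parts[4].split("/"))
        rows.append({
            "uid": int(parts[0].rstrip(":")),
            "qnkeys": qnkeys, "maxkeys": maxkeys,
            "qnbytes": qnbytes, "maxbytes": maxbytes,
        })
    return rows

def near_quota(rows, threshold=0.9):
    """Return rows where key count or key bytes reach threshold * quota."""
    return [
        r for r in rows
        if r["qnkeys"] >= threshold * r["maxkeys"]
        or r["qnbytes"] >= threshold * r["maxbytes"]
    ]

def check(path="/proc/key-users"):
    """Print every UID that is close to one of its key quotas."""
    with open(path) as f:
        for r in near_quota(parse_key_users(f.read())):
            print(f"uid {r['uid']}: keys {r['qnkeys']}/{r['maxkeys']}, "
                  f"bytes {r['qnbytes']}/{r['maxbytes']}")
```

Running `check()` on the host right after a failed start would show whether the UID the containers are mapped to has run into the key-count or the key-bytes quota.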