Failing to Scale Past 1022 Containers

Hello,

I am posting here because I have hit an issue when trying to deploy more than 1022 containers. Another user posted a very similar scaling issue that I had also encountered, although in that case there were seccomp errors in the logs. I was stuck on that issue for some time (I should have just posted here), but thankfully someone else did, and the solution there was to increase net.core.bpf_jit_limit.

The fix there got me past 600 containers, and I slowly progressed to just over 1000, where I seem to have hit another constraint that isn’t obvious from the logs. I have applied all the settings advised in the production setup doc, as well as various others that I experimented with through trial and error, some of which may no longer be necessary or may be irrelevant. The failure to start isn’t container specific: if I shut down a few other containers, I can start the ones that were failing. All containers are identical reconfigured clones.

When starting containers 1000 to 1022, I begin seeing intermittent internet connectivity on the most recently started containers, and container 1023 fails to start at all. Please see my configs below; I can provide SSH access if it helps. I am fairly new to LXC and Linux in general, so apologies if I am slow off the mark with debugging.

My host is a Proxmox 6.2 node, but given the similarity to the thread mentioned above, I was hoping to find guidance here on what to try next. Please correct me if this is the wrong place.

The machine has significant resources, with 2 x AMD EPYC 7742 and 1TB RAM. At 1000 containers I am at about 70% RAM usage and averaging just under 5% CPU, so I feel there should be a good amount of capacity left. The containers run an Ubuntu 18.04 image on the latest LXC shipped with Proxmox.

lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2

Below are the relevant configs & logs from lxc start:

sysctl.conf:
net.ipv4.neigh.default.gc_interval = 3600
net.ipv6.neigh.default.gc_interval = 3600
net.ipv4.neigh.default.gc_stale_time = 3600
net.ipv6.neigh.default.gc_stale_time = 3600
net.ipv4.neigh.default.gc_thresh1 = 80000
net.ipv4.neigh.default.gc_thresh2 = 90000
net.ipv4.neigh.default.gc_thresh3 = 100000
net.ipv6.neigh.default.gc_thresh1 = 80000
net.ipv6.neigh.default.gc_thresh2 = 90000
net.ipv6.neigh.default.gc_thresh3 = 100000
vm.swappiness=100
kernel.keys.maxkeys = 100000000
kernel.keys.maxbytes = 200000000
kernel.dmesg_restrict = 1
vm.max_map_count = 262144
net.ipv6.conf.default.autoconf = 0
fs.inotify.max_queued_events = 167772160
fs.inotify.max_user_instances = 167772160
fs.inotify.max_user_watches = 167772160
net.core.bpf_jit_limit = 30000000000
kernel.keys.root_maxbytes = 2000000000
kernel.keys.root_maxkeys = 1000000000
kernel.pid_max = 4194304
kernel.keys.gc_delay = 300
kernel.keys.persistent_keyring_expiry = 259200
fs.aio-max-nr = 524288
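
With this many tunables it can be worth confirming that the values in sysctl.conf are actually live on the host. The following is only a rough sketch of such a check, not a Proxmox or LXC tool. The function names are made up for illustration, and it assumes the usual mapping of a sysctl key to a file under /proc/sys (dots become slashes), which does not hold for keys that embed an interface name containing dots.

```python
from pathlib import Path


def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into a {key: value} dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' or ';').
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        settings[key.strip()] = value.strip()
    return settings


def find_mismatches(settings, proc_sys=Path("/proc/sys")):
    """Return {key: (wanted, live)} for keys whose live value differs.

    live is None when the file is missing or unreadable.
    """
    mismatches = {}
    for key, wanted in settings.items():
        # Assumption: simple dot-to-slash mapping of the key.
        path = Path(proc_sys) / key.replace(".", "/")
        try:
            live = path.read_text().strip()
        except OSError:
            live = None
        # Compare whitespace-normalised values.
        if live is None or live.split() != wanted.split():
            mismatches[key] = (wanted, live)
    return mismatches


if __name__ == "__main__":
    conf = Path("/etc/sysctl.conf")
    if conf.is_file():
        wanted = parse_sysctl_conf(conf.read_text())
        for key, (want, live) in find_mismatches(wanted).items():
            print(f"{key}: wanted {want!r}, live {live!r}")
```

Any key it prints either has not been applied yet or does not exist on the running kernel.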

/etc/security/limits.conf:

* soft nofile 167772160 unset
* hard nofile 167772160 unset
root soft nofile 167772160 unset
root hard nofile 167772160 unset
* soft memlock unlimited unset
* hard memlock unlimited unset

lxc-start --logfile /tmp/lxc-start1137.log -n 1137 --logpriority TRACE

first error showing:

lxc-start 1137 20200610161718.245 ERROR conf - conf.c:lxc_setup_boot_id:3250 - Permission denied - Failed to mount /dev/.lxc-boot-id to /proc/sys/kernel/random/boot_id

extract of logs:

lxc-start 1137 20200610161718.125 INFO conf - conf.c:run_script_argv:340 - Executing script “/usr/share/lxcfs/lxc.mount.hook” for container “1137”, config section “lxc”
lxc-start 1137 20200610161718.144 INFO conf - conf.c:run_script_argv:340 - Executing script “/usr/share/lxc/hooks/lxc-pve-autodev-hook” for container “1137”, config section “lxc”
lxc-start 1137 20200610161718.216 INFO conf - conf.c:lxc_fill_autodev:1152 - Populating “/dev”
lxc-start 1137 20200610161718.216 DEBUG conf - conf.c:lxc_fill_autodev:1218 - Bind mounted host device node “/dev/full” onto “/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/full”
lxc-start 1137 20200610161718.216 DEBUG conf - conf.c:lxc_fill_autodev:1218 - Bind mounted host device node “/dev/null” onto “/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/null”
lxc-start 1137 20200610161718.216 DEBUG conf - conf.c:lxc_fill_autodev:1218 - Bind mounted host device node “/dev/random” onto “/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/random”
lxc-start 1137 20200610161718.216 DEBUG conf - conf.c:lxc_fill_autodev:1218 - Bind mounted host device node “/dev/tty” onto “/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/tty”
lxc-start 1137 20200610161718.217 DEBUG conf - conf.c:lxc_fill_autodev:1218 - Bind mounted host device node “/dev/urandom” onto “/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/urandom”
lxc-start 1137 20200610161718.217 DEBUG conf - conf.c:lxc_fill_autodev:1218 - Bind mounted host device node “/dev/zero” onto “/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/zero”
lxc-start 1137 20200610161718.217 INFO conf - conf.c:lxc_fill_autodev:1222 - Populated “/dev”
lxc-start 1137 20200610161718.217 DEBUG conf - conf.c:lxc_setup_dev_console:1618 - Mounted pts device “/dev/pts/1018” onto “/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/console”
lxc-start 1137 20200610161718.217 INFO utils - utils.c:lxc_mount_proc_if_needed:1200 - I am 1, /proc/self points to “1”
lxc-start 1137 20200610161718.239 TRACE conf - conf.c:lxc_pivot_root:1427 - pivot_root("/usr/lib/x86_64-linux-gnu/lxc/rootfs") successful
lxc-start 1137 20200610161718.245 ERROR conf - conf.c:lxc_setup_boot_id:3250 - Permission denied - Failed to mount /dev/.lxc-boot-id to /proc/sys/kernel/random/boot_id
lxc-start 1137 20200610161718.281 DEBUG conf - conf.c:lxc_setup_devpts:1521 - Mount new devpts instance with options “gid=5,newinstance,ptmxmode=0666,mode=0620,max=1024”
lxc-start 1137 20200610161718.281 DEBUG conf - conf.c:lxc_setup_devpts:1536 - Created dummy “/dev/ptmx” file as bind mount target
lxc-start 1137 20200610161718.281 DEBUG conf - conf.c:lxc_setup_devpts:1541 - Bind mounted “/dev/pts/ptmx” to “/dev/ptmx”
lxc-start 1137 20200610161718.281 ERROR conf - conf.c:lxc_allocate_ttys:929 - Inappropriate ioctl for device - Failed to create tty 0
lxc-start 1137 20200610161718.281 ERROR conf - conf.c:lxc_create_ttys:1015 - Failed to allocate ttys
lxc-start 1137 20200610161718.281 ERROR start - start.c:do_start:1231 - Failed to setup container “1137”
lxc-start 1137 20200610161718.281 ERROR sync - sync.c:__sync_wait:41 - An error occurred in another process (expected sequence number 5)
lxc-start 1137 20200610161718.281 DEBUG network - network.c:lxc_delete_network:3693 - Deleted network devices
lxc-start 1137 20200610161718.281 TRACE start - start.c:lxc_serve_state_socket_pair:492 - Sent container state “ABORTING” to 5
lxc-start 1137 20200610161718.281 TRACE start - start.c:lxc_serve_state_clients:427 - Set container state to ABORTING
lxc-start 1137 20200610161718.281 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:852 - Received container state “ABORTING” instead of “RUNNING”
lxc-start 1137 20200610161718.281 TRACE start - start.c:lxc_serve_state_clients:430 - No state clients registered
lxc-start 1137 20200610161718.281 ERROR lxc_start - tools/lxc_start.c:main:308 - The container failed to start
lxc-start 1137 20200610161718.281 ERROR lxc_start - tools/lxc_start.c:main:311 - To get more details, run the container in foreground mode
lxc-start 1137 20200610161718.281 ERROR start - start.c:__lxc_start:1957 - Failed to spawn container “1137”
lxc-start 1137 20200610161718.281 TRACE start - start.c:lxc_serve_state_clients:427 - Set container state to ABORTING
lxc-start 1137 20200610161718.281 TRACE start - start.c:lxc_serve_state_clients:430 - No state clients registered
lxc-start 1137 20200610161718.281 WARN start - start.c:lxc_abort:1025 - No such process - Failed to send SIGKILL via pidfd 42 for process 1602616
lxc-start 1137 20200610161718.281 TRACE start - start.c:lxc_serve_state_clients:427 - Set container state to STOPPING
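
A TRACE log like the one above is long, so pulling out only the ERROR lines makes the failing step easier to spot. This is a minimal sketch that assumes the line layout seen in this extract (tool, container name, timestamp, level, category, then file:function:line and the message). The regular expression and function names are my own for illustration and are not part of LXC.

```python
import re

# Layout assumed from the extract above:
# <tool> <name> <timestamp> <LEVEL> <category> - <file>:<func>:<line> - <message>
LOG_LINE = re.compile(
    r"^(?P<tool>\S+) (?P<name>\S+) (?P<stamp>\d{14}\.\d{3}) "
    r"(?P<level>[A-Z]+)\s+(?P<category>\S+) - "
    r"(?P<file>[^:\s]+):(?P<func>\w+):(?P<line>\d+) - (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def filter_levels(lines, levels=("ERROR",)):
    """Keep only the parsed records whose level is in `levels`."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record and record["level"] in levels:
            records.append(record)
    return records


if __name__ == "__main__":
    sample = [
        "lxc-start 1137 20200610161718.245 ERROR conf - conf.c:lxc_setup_boot_id:3250 - "
        "Permission denied - Failed to mount /dev/.lxc-boot-id to /proc/sys/kernel/random/boot_id",
        "lxc-start 1137 20200610161718.281 ERROR conf - conf.c:lxc_allocate_ttys:929 - "
        "Inappropriate ioctl for device - Failed to create tty 0",
        "lxc-start 1137 20200610161718.281 DEBUG network - network.c:lxc_delete_network:3693 - "
        "Deleted network devices",
    ]
    for rec in filter_levels(sample):
        print(f'{rec["stamp"]} {rec["func"]}: {rec["message"]}')
```

In this extract, the error that leads directly to "Failed to setup container" is the one from lxc_allocate_ttys.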

Please could anyone advise where to look next? Thanks.

Looks like you may be hitting /proc/sys/kernel/pty/max?

Amazing, that was it - thanks very much! I was worried I was going to be stuck there for a while.

added to sysctl.conf:

kernel.pty.max = 10000
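
For anyone landing here later: the kernel exposes the current pty count in /proc/sys/kernel/pty/nr next to the limit in /proc/sys/kernel/pty/max (typically 4096 by default), so the headroom can be checked before the next container fails. Below is a small sketch of that check. The function names and the ptys_per_container figure are hypothetical; the real number per container depends on the console and the configured tty count, so measure it on your own host.

```python
from pathlib import Path


def pty_usage(pty_dir=Path("/proc/sys/kernel/pty")):
    """Return (in_use, limit) read from the kernel's pty counters."""
    pty_dir = Path(pty_dir)
    in_use = int((pty_dir / "nr").read_text())
    limit = int((pty_dir / "max").read_text())
    return in_use, limit


def containers_until_limit(in_use, limit, ptys_per_container):
    """Estimate how many more containers fit before the pty limit is hit.

    ptys_per_container is an assumption supplied by the caller.
    """
    free = max(limit - in_use, 0)
    return free // ptys_per_container


if __name__ == "__main__":
    try:
        in_use, limit = pty_usage()
    except (OSError, ValueError):
        print("pty counters not available on this system")
    else:
        # 4 ptys per container is only a placeholder guess.
        room = containers_until_limit(in_use, limit, ptys_per_container=4)
        print(f"{in_use}/{limit} ptys in use, room for roughly {room} more containers")
```

If nr is sitting right at max when a start fails with "Failed to create tty", raising kernel.pty.max as above is the likely fix.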