LXC commands under heavy load

Sometimes under heavy load (many VMs) I get an i/o timeout error when running lxc commands:

write unix @->/var/snap/lxd/common/lxd/unix.socket: i/o timeout

It’s due to heavy load on the system. I’m using a retry mechanism to reduce failures, but I still get them occasionally. Will increasing system limits help?
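For reference, the kind of retry wrapper I mean looks roughly like the sketch below. This is a minimal illustration, not my actual script: the name run_with_retry, the attempt count and the delays are all made up for the example, and it assumes the timeout message shows up on stderr.

```python
import random
import subprocess
import time


def run_with_retry(cmd, attempts=5, base_delay=0.5, max_delay=8.0,
                   run=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying with backoff only when stderr reports an i/o timeout."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        result = run(cmd, capture_output=True, text=True)
        timed_out = result.returncode != 0 and "i/o timeout" in result.stderr
        # Return on success, on any other kind of failure, or when out of attempts.
        if not timed_out or attempt == attempts - 1:
            return result
        # Double the wait each round, capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Jitter keeps parallel callers from retrying at the same instant.
        sleep(delay * random.uniform(0.5, 1.0))
```

Calling run_with_retry(["lxc", "list"]) returns the last subprocess.CompletedProcess, so the caller still has to check returncode. Backoff only spaces the attempts out; it does not remove the underlying load, which is probably why some failures still get through.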

According to this doc, there is no need to touch /etc/security/limits.conf since I’m using the snap.

I did update /etc/sysctl.conf accordingly:

fs.aio-max-nr = 524288
fs.inotify.max_queued_events = 1048576
fs.inotify.max_user_instances = 1048576
fs.inotify.max_user_watches = 1048576
kernel.dmesg_restrict = 1
kernel.keys.maxbytes = 2000000
kernel.keys.maxkeys = 2000
net.core.bpf_jit_limit = 3000000000
net.ipv4.neigh.default.gc_thresh3 = 8192
net.ipv6.neigh.default.gc_thresh3 = 8192
vm.max_map_count = 262144

I still get the same error once in a while. Any advice?
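One way to check whether the running kernel actually holds the values listed above is to compare them against /proc/sys. The sketch below is only an illustration: parse_sysctl and compare are hypothetical helper names, and it assumes every dot in a key maps to a path separator, which does not hold for keys containing an interface name with a dot in it.

```python
from pathlib import Path


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blanks and '#' or ';' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def compare(settings, root="/proc/sys"):
    """Return (key, wanted, current) tuples; current is None if the key is unreadable."""
    report = []
    for key, wanted in settings.items():
        # Assumption: each dot in the key is a directory separator under root.
        path = Path(root, *key.split("."))
        try:
            # Collapse tabs and newlines so multi-value keys compare cleanly.
            current = " ".join(path.read_text().split())
        except OSError:
            current = None
        report.append((key, wanted, current))
    return report
```

Running compare(parse_sysctl(Path("/etc/sysctl.conf").read_text())) and printing the tuples where current differs from wanted shows which settings did not take effect. A current value of None means the key is missing or not readable on this kernel.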

Hi @mezobari,
Looking at your system values, net.core.bpf_jit_limit looks suspicious to me. If you took those values from the production setup values, that parameter is not mentioned there. I also get sysctl: setting key “net.core.bpf_jit_limit”: Invalid argument
when I apply your config.
Regards.

tbh, I don’t remember why that field is there at all. I have a bash script that sets these values along with other stuff; maybe Copilot suggested it and I just approved (tabbed) it?! :laughing:

We recently modified the way the lxc exec websocket proxy logic works. This will be in LXD 5.14, so it will be interesting to see if it helps once that is released.
