ghomem
(Gustavo Homem)
March 8, 2022, 12:41pm
1
Hi everyone,
As of LXD 4.0.8 LTS, would it be possible to confirm the stability of the snap lxcfs.loadavg=true
flag?
I found some scary threads here:
GitHub issue (opened 21 May 2020, closed 28 May 2020; labels: Bug, Incomplete):
This issue hit us hard in the last 48 hours and took a long time to figure out. … I will try and keep this as concise as possible. The issue is reliably reproducible and I can grant you access to a system exhibiting this behavior as we speak.
**System setup**
Focal Fossa (Upgraded from Bionic) with HWE
```
# uname -r
5.4.0-29-generic
```
```
# lxd --version
4.0.1
```
All limits and security settings are set as per recommended LXD defaults https://github.com/lxc/lxd/blob/master/doc/production-setup.md
**Symptoms**
On any system where `snap set lxd lxcfs.loadavg=true` has been run, any Focal or Bionic container on that system will, after a while, see increasing SSH connection times and ultimately fail SSH authentication entirely, breaking direct SSH access to that container.
The busier the system is, the quicker this happens.
Xenial containers do not seem to be affected.
The SSH Daemon in the container dies at this point:
```
# /usr/sbin/sshd -d -D -p 222
.. output cropped
Postponed publickey for admin from 2.110.243.147 port 61874 ssh2 [preauth]
debug1: userauth-request for user admin service ssh-connection method publickey [preauth]
debug1: attempt 2 failures 0 [preauth]
debug1: temporarily_use_uid: 1001/27 (e=0/0)
debug1: trying public key file /home/admin/.ssh/authorized_keys
debug1: fd 4 clearing O_NONBLOCK
debug1: matching key found: file /home/admin/.ssh/authorized_keys, line 1 RSA SHA256:plvv+n6kLeQP5SEes64kDnVFN0X5GUo7ZR9OsOAnQ5A
debug1: restore_uid: 0/0
debug1: do_pam_account: called
Accepted publickey for admin from 2.110.243.147 port 61874 ssh2: RSA SHA256:plvv+n6kLeQP5SEes64kDnVFN0X5GUo7ZR9OsOAnQ5A
debug1: monitor_child_preauth: admin has been authenticated by privileged process
debug1: Enabling compression at level 6. [preauth]
debug1: monitor_read_log: child log fd closed
debug1: PAM: establishing credentials
```
See more detail in: https://github.com/lxc/lxd/issues/7385
All of our hosts encountered this issue as we did an infrastructure-wide upgrade, save a single host (identical setup, also LXD 4.0.1 from Snap) which for some reason reports:
```
# snap get lxd lxcfs.loadavg
error: snap "lxd" has no "lxcfs" configuration option
```
So lxcfs is not active there and that host got spared. Why is this happening, BTW? Maybe this is fodder for another ticket ...
lxcfs.loadavg worked fine for us for over a year on Bionic, and recently we saw no issues with Bionic + LXD 4.0.1 and lxcfs.loadavg=true; it was not until the Focal upgrade that this started breaking.
**Theories**
Due to the creeping nature of the issue a few things come to mind:
- Maybe some system limits/kernel defaults have been altered or lowered and PAM is doing something which is hitting these?
- Memory/pointer/handles leak or lacking garbage collection in lxcfs.loadavg on Focal?
As mentioned, we have a live system exhibiting this exact behavior which we would be happy to provide access to. It is a debug/test system, so you can mess around there as much as you want.
GitHub issue (opened 28 Jan 2021, closed 28 Jan 2021):
Hi!
I have several dozen identical servers: Ubuntu Focal, lxd:
```
lxd … 4.10 19009
```
And on one of them, inside containers, commands like `ps u` hang the console indefinitely, do not react to Ctrl-C, and cannot be killed. Only a hard reset of the server helped.
This server differs from the others in CPU (AMD EPYC 7502P 32-Core Processor).
I found https://github.com/lxc/lxcfs/issues/407 thread, tried to `umount -l /proc/stat` and `ps u` started working.
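For anyone hitting the same hang, the workaround quoted above amounts to the following (a sketch only, to be run as root inside the affected container; it detaches the lxcfs-backed mount so reads fall through to the kernel's own procfs, at the cost of losing the containerized view of `/proc/stat` until the next restart):

```shell
# Check whether /proc/stat is currently an lxcfs (FUSE) mount.
grep /proc/stat /proc/self/mountinfo

# Lazy-unmount it; subsequent reads hit the real /proc/stat.
umount -l /proc/stat
```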
I have `lxcfs.cfs=true`, `lxcfs.loadavg=true`, and `lxcfs.pidfd=true` everywhere; now I am trying to disable them one by one to find out if that helps.
After a reboot, `ps u` works inside containers for about 2 hours, then it stops. At approximately the same time I see this in the host's log:
```
Jan 27 12:15:10 hetzner5 lxd.daemon[2440]: proc_cpuview.c: 815: cpuview_proc_stat: Write to cache was truncated
```
Perhaps lxcfs/lxd has some bug that appears only on systems with a high CPU thread count? This CPU has 64 threads.
which I am not sure still apply to 4.0.8.
Also, as far as I could tell, as of 4.0.8 the default configuration already tracks processes and CPU usage per container; it is only the load average that is read from the host instead. Can you confirm?
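For reference, what a guest observes as its load average is simply the contents of `/proc/loadavg`; without loadavg virtualization, that file passes through the host's values. The field layout is standard procfs (nothing LXD-specific), so it can be inspected with a short shell sketch:

```shell
# /proc/loadavg holds five fields: the 1-, 5- and 15-minute load
# averages, runnable/total task counts, and the most recent PID.
read -r one five fifteen tasks last_pid < /proc/loadavg
running=${tasks%/*}
total=${tasks#*/}
echo "1min=$one 5min=$five 15min=$fifteen running=$running total=$total last_pid=$last_pid"
```

Running this inside a container with `lxcfs.loadavg=true` should show only that container's contribution; on the host (or with the flag off) it shows system-wide values.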
Thanks in advance,
Gustavo
ghomem
(Gustavo Homem)
March 8, 2022, 1:11pm
2
Here is a reference discuss thread:
Can you be any more specific, or does it still require compiling it yourself?
Looking forward for an update on the stability of this. Thank you in advance.
stgraber
(Stéphane Graber)
March 8, 2022, 6:34pm
3
I’ve been running several clusters with it enabled for over a year now and haven’t seen any obvious issue. The main problem is that a misbehaving/attacking guest could cause lxcfs on the host to use up more and more memory.
If you’re dealing with mostly trusted workloads and/or have good monitoring of the memory usage on the host to detect potential attacks, it should be fine to turn on.
ghomem
(Gustavo Homem)
March 9, 2022, 6:10pm
4
Thank you @stgraber . The cases I have in mind have both permanent memory monitoring (host included) and trusted workloads so we will probably try this.
ghomem
(Gustavo Homem)
March 9, 2022, 7:41pm
5
One more thing: after running
snap set lxd lxcfs.loadavg=true
what do we need to restart to make the change active?
stgraber
(Stéphane Graber)
March 9, 2022, 7:48pm
6
The simple answer would be to restart the entire system.
LXCFS cannot be restarted unless all containers are stopped, so we pretty actively stay away from restarting it in the snap logic. If you absolutely must avoid a full system restart, you could stop all containers, then run `snap stop lxd` and check if lxcfs is still running; if it is, kill it and then run `snap start lxd`.
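That sequence can be sketched as follows (an assumption-laden outline, not an official procedure: it assumes the `lxc` client on the host and that killing lxcfs is acceptable once everything is stopped):

```shell
# Stop every container first - lxcfs must never be
# restarted underneath running guests.
lxc stop --all

snap stop lxd

# If an lxcfs process survived the snap stop, kill it
# before starting the snap again.
pgrep -x lxcfs && pkill -x lxcfs

snap start lxd
```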