Are there any recommendations for how to monitor performance or resource usage of all containers on a host over time? I see that
systemd-cgtop exists, which is neat but limited to real-time data. I’m trying to debug two problems (in both cases running LXD 5.0.0):
- SSH connections to containers on one host intermittently hang for several seconds. I can’t correlate this with any resource or network contention on the host, and connections to the host itself are fine during these periods, so it’s unclear why this is happening.
- I have a long-running script that I run on two different LXD hosts; on host1 it takes about half as long as on host2. Looking at the hosts’ monitoring data, I don’t see any resource contention (CPU, memory, disk, etc.) that would explain this dramatic difference. Yes, the CPUs are different (model, clock speed, etc.), but I don’t think that is the issue here, since resource usage never approaches saturation.
Are there any LXD-aware tools that I can install to debug the above problems and monitor resource utilization on any containers (even ephemeral ones) over time? Thanks!