Performance monitoring of all containers on a host

Are there any recommendations for how to monitor performance or resource usage of all containers on a host over time? I see that systemd-cgtop exists, which is neat but limited to real-time data. I’m trying to debug two problems (in both cases running LXD 5.0.0):

  1. SSH connections to containers on one host intermittently hang for several seconds. I can’t correlate this with any resource or network contention on the host, and connections to the host itself are fine during these periods, so it’s unclear why this is happening.
  2. I have a long-running script that I run on two different LXD hosts; on host1 it takes about half as long to run as on host2. Looking at the monitoring data of the hosts, I don’t see any resource contention (CPU, memory, disk, etc.) that would explain this dramatic difference. Yes, the CPUs are different (model, clock speed, etc.), but I don’t think that is the issue here, since resource usage never approaches full saturation.

Are there any LXD-aware tools that I can install to debug the above problems and monitor resource utilization on any containers (even ephemeral ones) over time? Thanks!

LXD has its own Prometheus-compatible metrics exporter, which may be of interest.

I’ve actually got maybe 80% of an lxc top implementation using that API :slight_smile:
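
For anyone who wants to try the metrics route, here is a rough sketch of how the exporter output could be turned into per-container numbers. It assumes the metrics come back in the standard Prometheus text format (for example from `lxc query /1.0/metrics`) and that each sample carries a `name` label identifying the instance. The metric name `lxd_cpu_seconds_total`, the sample values, and the helper `sum_by_instance` are illustrative assumptions, so check them against what your LXD version actually emits.

```python
import re
from collections import defaultdict

# One sample line of the Prometheus text format: metric{labels} value
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One key="value" pair inside the braces
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def sum_by_instance(text, metric):
    """Sum every sample of `metric`, grouped by the `name` label.

    Hypothetical helper: the `name` label is assumed to hold the
    container name.
    """
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match or match.group(1) != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group(2) or ""))
        totals[labels.get("name", "")] += float(match.group(3))
    return dict(totals)


# Made-up sample in the exporter's text format (not real output).
SAMPLE = """\
# TYPE lxd_cpu_seconds_total counter
lxd_cpu_seconds_total{cpu="0",mode="user",name="c1",project="default",type="container"} 12.5
lxd_cpu_seconds_total{cpu="0",mode="system",name="c1",project="default",type="container"} 2.5
lxd_cpu_seconds_total{cpu="0",mode="user",name="c2",project="default",type="container"} 4.0
"""

if __name__ == "__main__":
    print(sum_by_instance(SAMPLE, "lxd_cpu_seconds_total"))
```

Since these are counters, taking two snapshots a known interval apart and subtracting the totals gives per-container CPU time over that window. Pointing a Prometheus server at the endpoint gets you the same thing with history, which is closer to the "over time" part of the question.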

Maybe Netdata is what you are looking for. It is aware of LXD containers, and most (though not all) metrics can be viewed nicely with it.

@stgraber That would be really cool. Today I was looking at lxc-top and ctop, and both are quite useful for a quick look.

Hey, I’m currently running one Netdata instance inside every container.
Should I move to a single Netdata instance per host instead?
Which metrics are not reported when Netdata is installed on the host rather than inside the container?
Do you wish you had any of those missing metrics in your daily operations?

The main reason I deployed it this way, and haven’t moved away from this setup yet, is that with one Netdata per container I can give access to Netdata to the owner of that container, and only to them.
Is there anything similar that would let me do that with Netdata running on the host alone?