What is a recommended way of tracking per-container resource usage?
I am hosting a bunch of servers as LXC containers on a single host, and I am noticing that over the few days after the containers are started, CPU usage grows until it hits a ceiling of around 95%.
With classic Linux system tools like top, htop, etc., I can see that the /sbin/init processes, the root process of every container, consume CPU, but I cannot tell which containers or which tasks within those containers are the source, so I am struggling to decide where to invest in optimizations.
Can anybody here recommend any monitoring tools supporting cgroups, or ways to query LXD?
Someone else can go into all the different options and upcoming features intended to help here, but a "poor man's solution" to help right now might be to look at "CPU usage (in seconds)". A bash script like the following might give a hint:
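#!/usr/bin/env bash
# Print the CPU time LXD reports for every container.
# Read the container names, one per line, into an array.
mapfile -t HOSTS < <(lxc list -c n --format csv)
for HOST in "${HOSTS[@]}"
do
    echo "${HOST} ..."
    # Quote the name so it is always passed as a single argument.
    lxc info "${HOST}" | grep "CPU usage (in seconds)"
done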
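One caveat: as far as I know that figure is cumulative since the container started, so a long-running but idle container can show a large number. To see who is burning CPU right now, you can take two samples and compare them. Below is a rough sketch of that idea, not a polished tool. The function names (parse_cpu_seconds, snapshot, rank_by_delta) and the file names are made up for illustration, and it assumes `lxc info` prints a "CPU usage (in seconds): N" line like the one the script above greps for.

```shell
#!/usr/bin/env bash
# Sketch: rank containers by CPU seconds used between two samples.
# All function names here are hypothetical helpers, not part of LXD.

# Read `lxc info` output on stdin and print the number from the
# "CPU usage (in seconds): N" line. Prints 0 if no such line exists
# (assumed to be the case for e.g. a stopped container).
parse_cpu_seconds() {
    awk -F': *' '/CPU usage \(in seconds\)/ { print $2; found = 1; exit }
                 END { if (!found) print 0 }'
}

# Print one "name seconds" line per container.
snapshot() {
    local name
    lxc list -c n --format csv | while read -r name; do
        # </dev/null keeps lxc from consuming the list of names on stdin.
        printf '%s %s\n' "$name" "$(lxc info "$name" </dev/null | parse_cpu_seconds)"
    done
}

# Compare two snapshot files and print "delta name", largest delta first.
# Containers missing from the first snapshot are skipped.
rank_by_delta() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) { print $2 - before[$1], $1 }' "$1" "$2" | sort -rn
}

# Example usage (60 seconds is an arbitrary sampling interval):
#   snapshot > before.txt; sleep 60; snapshot > after.txt
#   rank_by_delta before.txt after.txt
```

The container at the top of the output is the one that used the most CPU time during the interval, which should narrow down where to look first.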
Very much looking forward to the upcoming monitoring features. I wasn't aware htop could enable a CGROUP column, which is a good start when it comes to tracking down the worst CPU hogs at least.
There is also the Python-based tool ctop, which can be installed via pip. However, the console display flickers quite heavily.