Thanks for the sysdig info. I will check it out.
As you guys can imagine, operational maintenance and life-cycle management of large-scale clusters will become a hot topic in the future. This is why tools that can collect per-container stats to identify misbehaving containers (“lxd-top”) are so important.
As we know, it is easy to spin up a few LXD nodes and start running workloads. The problems start when you run lots of workloads and users start complaining about performance issues.
Maybe these tools already exist, and I just have not seen them yet?