Cgroup v2 I/O limits

Heya,

I noticed when I tried to switch to cgroup v2 that I/O limits are not implemented yet on LXD 4.0. Is there a structural reason they’re missing, or could support for this be added?

@brauner What’s the state of blkio on cgroup2?

I just checked again to see if anything has happened in the past few months, but I think this is still not supported. I would therefore still like to know whether there are any plans to support it, and whether I could help get it implemented if there are.

We’re adding a bunch of cgroup test infrastructure now, so we should soon get an idea of how close to parity we are on cgroup2 and whether the feature gap is caused by missing logic on our side (LXD/LXCFS) or lack of kernel support.

That sounds great, anywhere to follow that progress or volunteer my help?

https://github.com/lxc/lxc-ci should soon get a new test-lxd-cgroup file which will then be run daily at https://jenkins.linuxcontainers.org

We’re currently stuck on some memory limit behavior in LXCFS that we want sorted out (see https://github.com/lxc/lxcfs/pull/436/files). Once that’s done, we’ll have the test ensure that those limits behave.

First target is to have the test validate every limits.* config key we have on cgroup1, then run the test on cgroup2 and identify gaps. When it’s a LXD gap, I’ll push a quick fix, as that’s easy to do in our lxd/cgroup package. But when it’s missing kernel functionality, that will be a problem :slight_smile:

I’m not sure how much can easily be delegated there, I suspect once we know what the kernel limitations are, pestering some of the cgroups folks into looking at those bits may be something you could do as someone who actually needs those bits? :slight_smile:

https://jenkins.linuxcontainers.org/job/lxd-test-cgroup/ is the current test for cgroup1 and cgroup1 with swapaccount.

Now to try it on a pure cgroup2 system and adapt some of the tests as needed.
This will take longer as we’ll most likely need a bunch of fixes in LXCFS and LXD as we go through those tests :slight_smile:
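For reference, a rough way to tell whether a host is on a pure cgroup2 layout is to look at /proc/self/cgroup: on a unified-only system every entry has the form `0::/path`, while cgroup1 and hybrid setups also list numbered cgroup1 hierarchies. The Python sketch below only illustrates that check; the function name is made up and this is not what the test suite itself does.

```python
def is_pure_cgroup2(proc_cgroup_text):
    """Return True if every entry is a cgroup2 entry ("0::/path").

    proc_cgroup_text is the content of /proc/self/cgroup, where each
    line has the form "hierarchy-ID:controller-list:cgroup-path".
    """
    lines = [line for line in proc_cgroup_text.splitlines() if line.strip()]
    # An empty file tells us nothing, so treat it as "not pure cgroup2".
    return bool(lines) and all(line.startswith("0::") for line in lines)


def read_proc_cgroup(path="/proc/self/cgroup"):
    """Read the cgroup membership of the current process (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a Linux host, `is_pure_cgroup2(read_proc_cgroup())` would give the answer for the current process.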

Thanks for the updates! I will keep an eye on the activity there while I dive deeper into the LXD code myself to see if I can figure out what exactly would be needed to support io in cgroup v2.

OK, so I took some time to look into things.

I started with an Ubuntu 20.04 system (although 18.04 appears to work the same) on kernel 5.4 with systemd.unified_cgroup_hierarchy=1 enabled.

This option appears to be unsupported, and LXD won’t successfully read any active controllers using the latest code from master. With a bit of extra code to force reading the controllers from /sys/fs/cgroup/cgroup.controllers, plus support for the new io controller written into disk.go, things appear to work just fine for my use case.
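To make the two pieces a bit more concrete, here is a minimal sketch of the idea, not my actual patch. It is Python rather than the Go that LXD uses, and the function names are made up for illustration. It assumes the standard cgroup2 file formats: cgroup.controllers holds a space-separated list of controller names, and io.max takes lines of the form `MAJ:MIN rbps=... wbps=... riops=... wiops=...`, with `max` meaning no limit.

```python
def parse_controllers(text):
    """Return the set of controller names from cgroup.controllers content."""
    return set(text.split())


def read_controllers(path="/sys/fs/cgroup/cgroup.controllers"):
    """Read the controllers available at the root of the unified hierarchy."""
    with open(path) as f:
        return parse_controllers(f.read())


def format_io_max(major, minor, rbps=None, wbps=None, riops=None, wiops=None):
    """Build one io.max line for the block device major:minor.

    Limits left as None are written as "max", which means unlimited.
    """
    limits = {"rbps": rbps, "wbps": wbps, "riops": riops, "wiops": wiops}
    fields = " ".join(
        f"{key}={'max' if value is None else value}"
        for key, value in limits.items()
    )
    return f"{major}:{minor} {fields}"
```

For example, `format_io_max(8, 0, rbps=1048576)` builds the line that would cap reads on device 8:0 at 1 MiB/s; writing that line into the container cgroup's io.max file is what applies the limit.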

Now a couple of things:

  1. It appears a lot of cgroup things stopped working when I switched to unified. Since I use the cpuset controller a lot, this was the most obvious one, but I’m sure there are others. Can I run the test suite you mentioned above to see what else broke?
  2. It’s really hard for me to test the changes I made on all the various possible cgroup configurations. Is there a way I can run your Jenkins CI scripts myself somehow? I’m afraid I mostly use GitLab CI and don’t really have a lot of experience with Jenkins.
  3. Is this something that I could make a (WIP) PR for, or is there somebody on the team already dedicated to working on this (from LXD’s point of view)?

I’m not in any rush with this functionality, it’s a nice to have, so if the timing isn’t right that’s also fine for me :slight_smile:

So I’m currently at this stage with local changes:

t=2020-11-07T04:50:06+0000 lvl=info msg=" - cgroup layout: cgroup2" 
t=2020-11-07T04:50:06+0000 lvl=warn msg=" - Couldn't find the CGroup hugetlb controller, hugepage limits will be ignored" 
t=2020-11-07T04:50:06+0000 lvl=warn msg=" - Couldn't find the CGroup network priority controller, network priority will be ignored" 

Those two controllers (net_prio and hugetlb) are used by LXD for network priorities and hugepages restrictions but there doesn’t appear to be an immediate cgroup2 equivalent. It’d be good to do some digging to see if it’s because the work hasn’t been done yet or if there is another way to achieve the same result.
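The check behind those warnings boils down to comparing the controllers LXD wants against the ones the host exposes. A small Python sketch of that comparison follows; it is illustrative only, the function name and the example controller set are assumptions, and it is not the LXD code.

```python
def missing_controllers(wanted, available):
    """Return the wanted controllers that are absent, in the order given."""
    return [name for name in wanted if name not in available]


# Assumption: an example set of controllers on a cgroup2 host.
available = {"cpuset", "cpu", "io", "memory", "pids"}

for name in missing_controllers(["hugetlb", "net_prio", "io"], available):
    print(f"{name} controller not found, related limits would be ignored")
```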

I sent a bunch of fixes over the weekend and included a status report in my latest PR here: https://github.com/lxc/lxd/pull/8131

Awesome, I will pull this in and test it out on my development cluster. The code looks very similar, just missing the bits where it loads the unified controllers correctly, so let’s see if it works. I’ll keep you posted.

I sent a PR which implements the rest and also pushed an update to the tests to validate cgroup2, so starting tomorrow with LXD 4.8 we should be in pretty good shape.

net_prio and swappiness are the two broken things right now, due to missing support in the kernel; everything else is supported and tested.

It appears everything that I use for our setup works on our development cluster using cgroup v2 and LXD 4.8 :+1: