Limits.cpu.allowance doesn't exist for vms, but cpulimit does exist, bring cpu load allowance to virtual machines, too!

emilfihlman · July 13, 2024, 5:52pm

Why is there no limits.cpu.allowance for virtual machines? There is a program called cpulimit that dynamically limits CPU usage (and it works), and virtual machines are just QEMU processes, so it would be really great to have a dynamic CPU load limit for VMs. Otherwise a virtual machine can always eat at least one full core, since a whole core is the finest resolution available for limiting its CPU.
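For context, cpulimit works by alternately pausing and resuming the target process with SIGSTOP and SIGCONT. Below is a minimal Python sketch of that duty-cycle idea, assuming a fixed 100 ms slot and a fixed run/stop split. The real tool also measures actual usage and adjusts, and the names `duty_cycle` and `throttle_one_slot` are made up for illustration.

```python
import os
import signal
import time


def duty_cycle(limit_percent: float, slot_ms: float = 100.0) -> tuple[float, float]:
    """Split one time slot into (run_ms, stop_ms) for a target CPU share."""
    if not 0 < limit_percent <= 100:
        raise ValueError("limit_percent must be in (0, 100]")
    run_ms = slot_ms * limit_percent / 100
    return run_ms, slot_ms - run_ms


def throttle_one_slot(pid: int, limit_percent: float) -> None:
    """Let `pid` run for its share of one slot, then pause it for the rest.

    Assumption: a fixed duty cycle is good enough for illustration.
    The process is left stopped at the end of a slot when the limit is
    below 100%, so the caller must send SIGCONT when it stops throttling.
    """
    run_ms, stop_ms = duty_cycle(limit_percent)
    os.kill(pid, signal.SIGCONT)      # resume the process
    time.sleep(run_ms / 1000)         # let it run for its share
    if stop_ms > 0:
        os.kill(pid, signal.SIGSTOP)  # pause it for the remainder
        time.sleep(stop_ms / 1000)
```

With a 10% limit, `duty_cycle(10)` gives 10 ms of run time followed by 90 ms stopped in every 100 ms slot.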

stgraber · July 13, 2024, 11:15pm

We’ve been considering adding support for some of those knobs by creating a per-VM cgroup and putting the various QEMU processes into it. This may help a bit in some cases and will certainly do the job of slowing down a VM. But keep in mind that this works by forcing the kernel to exit the VM context more frequently: it will slow down the VM, but it will also slow down the host more than it does the VM.

Those VM context switches are very expensive and that’s why for optimal performance you try to make it so the VM can effectively run without frequent interruptions. Having to exit the VM context just because of some kind of scheduling timer hitting a limit should be avoided when possible.
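To make the per-VM cgroup idea concrete: on cgroup v2, CPU bandwidth is set through the `cpu.max` file, which takes a quota and a period in microseconds. The sketch below only shows how a percentage allowance maps onto that file. It is not how Incus implements anything; the function names and the cgroup directory passed in are hypothetical.

```python
from pathlib import Path


def cpu_max_value(allowance_percent: float, period_us: int = 100_000) -> str:
    """Format a cgroup v2 cpu.max value ("<quota> <period>") for a CPU share.

    100% corresponds to one full CPU; values above 100 allow more than one.
    """
    if allowance_percent <= 0:
        raise ValueError("allowance_percent must be positive")
    quota_us = round(period_us * allowance_percent / 100)
    return f"{quota_us} {period_us}"


def write_cpu_max(cgroup_dir: str, allowance_percent: float) -> None:
    """Write the limit into <cgroup_dir>/cpu.max.

    Assumption: `cgroup_dir` is an existing cgroup v2 directory that already
    contains the QEMU processes of a single VM (hypothetical setup).
    """
    Path(cgroup_dir, "cpu.max").write_text(cpu_max_value(allowance_percent) + "\n")
```

For example, `cpu_max_value(10)` yields `10000 100000`, meaning 10 ms of CPU time per 100 ms period. The limit only throttles tasks that were placed in that cgroup.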

emilfihlman · July 15, 2024, 2:11pm

@stgraber

As a workaround, could I make a container instance, install Incus in it and then run a virtual machine inside that container? Wouldn’t this allow me to use limits.cpu.allowance on the container to limit the VM’s CPU load?

If you know how to do it without such a silly construct, say with cgroups, please let me know.

It really is awful that by default the only knob we have for virtual machines moves in whole-CPU increments. It means it’s not really possible to give untrusted users access to a VM, since they can easily saturate your machine.

stgraber · July 15, 2024, 2:37pm

You can try that, but it won’t actually give you the kind of granularity you think you’ll be getting, since most of the virtualization handling is done through shared kernel threads. As mentioned, this will also degrade your host performance, potentially more than if you were to just give the VM the one CPU as you’re currently doing.

Note that Incus running inside of a container will be unable to use most storage features, so you’ll basically be limited to a dir storage backend (btrfs may work too, but it’s pretty awful for VMs).

emilfihlman · July 15, 2024, 4:30pm

@stgraber

Granularity? I’m thinking 1 container per 1 VM. Doesn’t this allow me to set any percentage load for that container/VM combo? What I have in mind is more like this: if a VM wants to saturate a thread, I can say nah, you get at most, say, 10% of a thread until you start to behave again. I’m pretty sure the load this places on the host is much less than that VM hogging the whole thread.

The issue is that it’s not okay for one VM to hog 1 whole thread, as is currently happening. I don’t have hundreds of threads here, barely more than 10.

stgraber · July 15, 2024, 4:35pm

I’ve already said it a couple of times, and I’ve seen 2-3 others tell you the same thing on IRC the other day: VMs on Linux are not just a user space process burning through CPU. A lot of the work happens in the kernel, and putting arbitrary scheduler time limits on the QEMU process will not provide granular CPU control in the way you think it will.

If you were running QEMU in pure CPU virtualization, that is, not using any hardware CPU virtualization feature and not using any of the kernel’s paravirtualization features, then yes, what you’re asking for would work, but this is not how modern virtual machines work.