Hello, I’m not sure if this is a limitation in Incus or in Terraform, or a misunderstanding on my side…
What I want to achieve is a cap based on actual resource consumption (as opposed to resource allocation in Incus), like a proper cgroup slice.
In some projects, I want to guarantee resource availability to a given instance. In others, I don’t care how many CPUs or how much RAM each instance is using, as long as the project it belongs to does not exceed certain limits. I’m not sure how to achieve the latter with Incus.
Project resource limits are indeed allocation limits.
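To make that concrete, roughly (the project, image and instance names below are just examples): setting limits.cpu / limits.memory on a project means every instance in it has to declare its own allocation, and it’s the sum of those allocations that gets checked against the project cap, regardless of actual usage.

```
# Names are just examples.
incus project create demo
incus project set demo limits.cpu=8 limits.memory=16GiB

# With a project limit set, every instance must carry its own allocation,
# and the sum of those allocations is what's checked against the cap:
incus launch images:debian/12 c1 --project demo -c limits.cpu=4 -c limits.memory=8GiB
incus launch images:debian/12 c2 --project demo -c limits.cpu=4 -c limits.memory=8GiB

# A third instance asking for CPU or memory beyond what's left is refused at
# creation time, no matter how idle c1 and c2 actually are.
```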
I did consider playing with per-project cgroups and the like back when we were designing this. It would probably have worked if we only had to deal with standalone servers, but it all falls apart as soon as you consider clustering, since there’s obviously no way to have a cgroup span multiple machines.
I’m also not sure that this would have provided a particularly good experience even in the single-machine case. It would probably have been okay on the CPU side, as things just get slower the more you use. Memory is a different story: those kinds of limits tend to be a lot more problematic because the OOM killer doesn’t always trigger quite soon enough. You end up with environments that completely hammer disk or network I/O because the cgroup has run out of memory and the kernel works around it by flushing all its caches and constantly pulling the data back from disk instead.
Anyway, containers would probably have handled this kind of project-wide cgroup limit better, as they would have been running under the same kernel and resource scheduler. VMs, however, would have done very poorly: QEMU wouldn’t know what to do when memory pressure builds up on the host system, even if the guest might be able to free up a whole bunch of memory. Instead, things would just slow down further and further until the OOM killer killed an entire VM. Not a great experience.
> there’s obviously no way to have a cgroup span multiple machines.
Ah yeah, I had overlooked that, especially for RAM… In retrospect, it’s rather obvious and explains some design decisions by competitors. So let’s assume RAM is indeed fixed and non-overcommittable.
I still have a UX problem: with a limit of N CPUs, I can only have a maximum of N instances in a project; is this correct? That’s rather wasteful in my case, where I expect most instances to idle…
I could use CPU allowance, but it’s not implemented for VMs, and I foresee load-balancing problems on heterogeneous clusters if I specify hard limits… A limit of 1/64th of the host is one core on a 64-core machine, but it turns into a 2/64 load (two cores’ worth) if the instance is migrated to a 128-core machine. VMware speaks in MHz to partition CPU and allows overcommitment, which helps in my case (if that helps clarify where I’m coming from).
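For containers, this is the kind of relative limit I have in mind; if I read the docs right, the percentage form is a soft limit only applied under contention, the time-slice form is a hard cap, and neither is available for VMs (instance names are just examples):

```
# Container-only relative CPU limits (instance names are just examples):
incus config set c1 limits.cpu.allowance=50%         # soft limit, only enforced under contention
incus config set c1 limits.cpu.allowance=50ms/100ms  # hard limit, half a core

# VMs only take whole-core allocations:
incus config set vm1 limits.cpu=2
```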
Is there an existing way to overcommit CPU cores within a project while still enforcing its global limit, other than running multiple VMs or containers inside a single Incus instance? Again, I’m not talking about RAM or disk, and I’m not suggesting this should be enabled by default.
Edit: As for cgroups in a cluster, for CPU only, I think the limits can be changed on the fly, can’t they? So when a workload is migrated, its cgroup “share” can be reallocated too. If the UI is still limits.cpu, shares can be calculated from that (e.g. project X has a limit of 2 cores and contains 4 instances using 1 core each, so each instance gets 50% of a CPU… not perfect on heterogeneous systems, but more manageable than being stuck with 2 instances).
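Something like this rough sketch is what I have in mind: redistribute the project’s CPU budget across its instances as relative allowances. All names and numbers below are made up, and it would only work for containers today since limits.cpu.allowance isn’t implemented for VMs.

```sh
#!/bin/sh
# Rough sketch only: divide a project's CPU budget across its instances.
# All names/values are made up; containers only, since limits.cpu.allowance
# is not implemented for VMs.

PROJECT=demo
BUDGET=200    # project budget of 2 cores, expressed as 200% of one core

INSTANCES=$(incus list --project "$PROJECT" -c n -f csv)
COUNT=$(echo "$INSTANCES" | wc -l)
SHARE=$((BUDGET / COUNT))   # e.g. 200% / 4 instances = 50% of a core each

# Re-apply the per-instance share; the same would have to run again whenever
# an instance is added, removed or migrated.
for name in $INSTANCES; do
    incus config set --project "$PROJECT" "$name" limits.cpu.allowance="${SHARE}%"
done
```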