Limit RAM and CPU usage in Incus Containers

Incus Container Exceeding Set CPU and Memory Limits - How to Properly Enforce Them?

Hi everyone,

I’m running an Incus container (instance name: IVR-STG) on Ubuntu, and I’ve set CPU and memory limits via a profile, but it seems like they’re not being strictly enforced. The container is overcommitting resources based on the monitoring graphs, and I’d like advice on the correct way to limit RAM and CPU usage, as well as best practices for applying these limits effectively.

Setup Details

  • Incus version: 6.0.5
  • Host OS: Ubuntu 24.04
  • Container OS: Ubuntu 22.04 LTS

Here’s the output from incus profile list:

+-----------+-------------------------------------------------------------+---------+
|   NAME    |                         DESCRIPTION                         | USED BY |
+-----------+-------------------------------------------------------------+---------+
| Container | memory=8GiB, cpus=4, storage=30GB, pool=local, bridge=br300 | 1       |
+-----------+-------------------------------------------------------------+---------+

Profile configuration (incus profile show IVR-STG):

config:
  limits.cpu: "4"
  limits.memory: 8GiB
description: memory=8GiB, cpus=4, storage=30GB, pool=local, bridge=bridge
devices:
  eth0:
    nictype: bridged
    parent: bridge
    type: nic
  root:
    path: /
    pool: local
    size: 30GB
    type: disk
name: container
used_by:
- /1.0/instances/Container?project=Project

Instance configuration (incus config show container):

architecture: x86_64
config:
  boot.autostart: "true"
  image.architecture: amd64
  image.description: ubuntu 22.04 LTS amd64 (release) (20241004)
  image.label: release
  image.os: ubuntu
  image.release: jammy
  image.serial: "20241004"
  image.type: squashfs
  image.version: "22.04"
  volatile.base_image: c15fcb01a6eb2f72e74742d69e58a44707c6a6d974451c2d6f553e83e0cacf46
  volatile.cloud-init.instance-id: 41dd2c60-0c31-45ae-b4cc-30d286bc780d
  volatile.eth0.host_name: veth609002d3
  volatile.eth0.hwaddr: 00:16:3e:e6:8e:10
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.last_state.ready: "false"
  volatile.uuid: a1eabcc9-a733-4a6c-94b2-2b77406f2aed
  volatile.uuid.generation: a1eabcc9-a733-4a6c-94b2-2b77406f2aed
devices: {}
ephemeral: false
profiles:
- container
stateful: false
description: ""

Observed Issue

The monitoring graph for the instance in Grafana clearly shows overcommitment:

  • CPU Usage: Spikes up to around 200% (with user and system components), despite the limit of 4 CPUs. There are multiple peaks between 17:20 and 17:30.
  • Memory Usage: RAM/SWAP Used starts high (around 500 GiB) and drops sharply to near 0 by 17:30, with RAM Total at 512 GiB. This seems odd since the limit is only 8 GiB—could this be showing host-level stats instead of container-specific? RAM Cache, Free, and Swap Used are also tracked, with Swap Used minimal.

From what I’ve read in the docs, limits.memory should cap the container at 8 GiB, and limits.cpu=4 should restrict it to 4 cores. However, the usage appears to exceed these, possibly due to soft limits or caching/swap behavior.
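
For reference, this is how I understand these limits should show up at the cgroup level on the host (cgroup v2); the lxc.payload path is an assumption on my part, based on how Incus appears to name its cgroups, so please correct me if that's wrong:

# On the host (path assumed), instance name as given above:
cat /sys/fs/cgroup/lxc.payload.IVR-STG/memory.max
# expecting 8589934592 (8 GiB) rather than "max", since limits.memory.enforce defaults to hard as far as I can tell
cat /sys/fs/cgroup/lxc.payload.IVR-STG/cpuset.cpus
# expecting a set of 4 CPU ids, since limits.cpu=4 should pin the container to 4 cores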

Questions

  1. Is there something wrong with my configuration? Should limits be set directly on the instance instead of the profile?
  2. How can I enforce hard limits for memory (e.g., prevent overcommitment entirely) and CPU (e.g., cap total usage percentage or time slices)?
  3. What are best practices for monitoring and verifying that limits are applied? For example, commands to check from inside the container or host.
  4. Could this be related to swap priority or enforcement settings? I’ve seen mentions of limits.memory.enforce=hard and limits.memory.swap; should I add those? (I’ve sketched the commands I’m considering right after this list.)
  5. Any tips on optimizing resource allocation for a production setup like this?
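
For question 4, these are the commands I was thinking of trying; the key names are from the documentation, but the behaviour described in the comments is only my reading of it, so I may well be misunderstanding something:

# Enforce the memory limit as a hard cap (my understanding is that hard is already the default)
incus config set IVR-STG limits.memory.enforce=hard
# Prevent the instance from spilling over the limit into swap
incus config set IVR-STG limits.memory.swap=false
# A time-slice allowance is a hard CPU quota; a percentage (like "400%") seems to be
# only a soft limit that kicks in under contention
incus config set IVR-STG limits.cpu.allowance=50ms/100ms   # at most half a CPU's worth of time per period, as I understand it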

The CPU usage is actually correct. A limit of 4 CPUs means it should not go above 400%, and your graph only peaks at around 200%.

Thank you!

Am I setting the RAM and CPU limits correctly via the profile?
What about the RAM usage in the screenshot I shared?

I wouldn’t base my understanding of the situation on a single picture. You should test both your setup and your monitoring software: run stress tests in your containers, measure through several different means, and check whether the numbers agree and whether they respect your limits. If they don’t, post back here about it.

Also, if you want to show container configuration, use incus config show --expanded.
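
To make the stress test comparison concrete, something along these lines would do (instance name and worker counts are just placeholders, adjust to your setup):

# Generate load inside the container...
incus exec IVR-STG -- stress-ng --cpu 4 --vm 1 --vm-bytes 6G --timeout 120s
# ...and, while it runs, compare what different tools report:
incus info IVR-STG     # Incus's own view of the instance's CPU time and memory usage
top                    # host-wide, per-process view
free -h                # host-wide memory picture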

Thanks, Victor. I’ve already performed stress testing using stress-ng inside the test container, but it seems that the CPU limit isn’t being enforced.
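
The invocation was roughly the following; I'm reconstructing the exact worker count from memory, so treat it as approximate:

# From the host: deliberately far more CPU workers than the 2-CPU limit
incus exec test -- stress-ng --cpu 64 --timeout 300s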

incus config show test --expanded
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu noble amd64 (20251006_07:42)
  image.os: Ubuntu
  image.release: noble
  image.requirements.cgroup: v2
  image.serial: "20251006_07:42"
  image.type: squashfs
  image.variant: default
  limits.cpu: "2"
  limits.cpu.allowance: 200%
  limits.memory: 2GiB
  volatile.base_image: ca9ab9dffed6b25815331f98b536f023c5c52aa9a1aeb6931c4c2062edd21b18
  volatile.cloud-init.instance-id: c3f9f719-2635-4dc6-a817-55b1e0ab5c13
  volatile.eth0.host_name: veth8f62c69c
  volatile.eth0.hwaddr: 10:66:6a:37:fa:28
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
  volatile.uuid: e488159b-e69d-48d3-90fe-75bf3d0788e3
  volatile.uuid.generation: e488159b-e69d-48d3-90fe-75bf3d0788e3
devices:
  eth0:
    nictype: bridged
    parent: br300
    type: nic
  root:
    path: /
    pool: local
    size: 30GB
    type: disk
ephemeral: false
profiles:
- test
stateful: false

Here’s what I observed:

  • The container was configured with limits.cpu=2.
  • During the stress test, both the container and host load averages increased simultaneously, which indicates that the container is consuming CPU beyond its defined limit.
  • RAM usage is not increasing; the CPU limit, however, doesn’t appear to be enforced, meaning the container isn’t being throttled and host CPU is freely consumed.
  • Ideally, a container’s CPU usage should not impact the host’s overall CPU load beyond its assigned limit if cgroup CPU constraints are properly enforced.

Example output from the container:

user@test:~# uptime
 10:20:00 up 2:32, 0 users, load average: 73.20, 20.43, 6.98

Host load average:

uptime
 15:19:53 up 7 days,  2:03,  9 users,  load average: 73.64, 20.85, 6.96

The host’s load averages are almost identical to the container’s during the stress test.
It appears that the CPU cgroup limits aren’t being applied properly.
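
For completeness, this is what I intend to check on the host while the test is running; the lxc.payload cgroup path is an assumption on my part, based on how Incus seems to name its cgroups:

# cgroup v2, path assumed:
cat /sys/fs/cgroup/lxc.payload.test/cpuset.cpus   # which CPUs the container is pinned to
cat /sys/fs/cgroup/lxc.payload.test/cpu.max       # hard quota, or "max" if only a soft limit applies
cat /sys/fs/cgroup/lxc.payload.test/cpu.stat      # accumulated CPU time and throttling counters (nr_throttled)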

Do you have any suggestions on how to verify or enforce the CPU limit correctly?

My Grafana Graph During Stress Testing:


I am also facing this with Linux containers. I have tried everything, but nothing is working. My resource limits are not being enforced, i.e., the CPU limit.

If I were you, I’d read up on what load average actually means. I don’t understand it myself, so I can’t conclude anything from it.

From my point of view, the Grafana graphs show that your CPU and RAM limits are being enforced. Why would you say they are not enforced when Grafana is clearly showing exactly that?

Unlike your previous Grafana image, this one is not just a single spike, which could have been a measurement error.

Thanks, Victor. My main concern is that I want to restrict my container like a VM — meaning the container’s internal load average shouldn’t impact the host’s load.

Right now, when I run a stress test inside a single container, both the container and the host show the same spike in load average, even though I’ve applied CPU limits. I’d like the container to be fully isolated so that its load doesn’t appear on the host — similar to how a virtual machine behaves.

Is there any way to achieve that kind of isolation with Incus, or is this simply a limitation of how Linux containers share the host kernel and scheduler?

I urgently need to fix this issue and would appreciate any assistance.

My understanding is that you cannot hide the CPU usage of a Linux container from the host, irrespective of how the Linux container is implemented. A Linux container is effectively a process tree running on the host. If you want cleaner stats for the host, you may use a tool that is container-aware and has an option to disregard the CPU usage of containers.
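
For example, on the host you can see the container's stress-ng workers as ordinary host processes (cgroup naming assumed, instance name taken from your earlier posts):

ps -e -o pid,comm,cgroup | grep lxc.payload.test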

You may hide aspects of the host from the container, and this is done with lxcfs (https://github.com/lxc/lxcfs, a FUSE filesystem for LXC).
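
When lxcfs is active inside the container (Incus images generally use it when it is installed, as far as I know), the container's own view of /proc reflects its limits rather than the host's, for example:

incus exec test -- free -h   # total memory should read about 2.0Gi, not the host's 512 GiB
incus exec test -- nproc     # should report 2, matching limits.cpu=2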