Performance problem: container slower than host (about half the speed)

We see performance differences between the host and the containers (which are slower).
Neither htop nor iotop shows an overloaded server.

In an attempt to better understand the difference, we compiled the same CPython interpreter on the host and in the container.

Time output:

# host
real    2m53,592s
user    2m42,104s
sys    0m

# container
real    5m 13.93s
user    4m 40.42s
sys    0m 33.55s

Another thread describes a similar problem caused by a configured CPU limit. However, no such limit is configured here:

# lxc config show container_name
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Alpine edge amd64 (20200915_13:00)
  image.os: Alpine
  image.release: edge
  image.serial: "20200915_13:00"
  image.type: squashfs
  volatile.base_image: b698b9241f4345a9d88a4e08a3719222765ced773b006358fb353fde60420b71
  volatile.eth0.host_name: veth4d4d482e
  volatile.eth0.hwaddr: 00:16:3e:34:c4:1f
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 3b8c04c8-b897-435f-a71c-ef8e9ce846b5
devices:
  root:
    path: /
    pool: tank
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""
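
One caveat about the output above: without --expanded, lxc config show lists only the keys set on the instance itself, so a limit inherited from the default profile would not appear. A minimal sketch to rule that out (show_limits is a hypothetical helper name, and container_name is the placeholder used above):

```shell
#!/bin/sh
# Print every limits.* key that applies to an instance, including keys
# inherited from its profiles. show_limits is a hypothetical helper name.
show_limits() {
    lxc config show "$1" --expanded | grep -E '^[[:space:]]+limits\.' \
        || echo "no limits.* keys set"
}

# Usage, on the LXD host:
#   show_limits container_name
```

If this prints "no limits.* keys set", neither the instance nor its profiles define a CPU or memory limit.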

The processors were more or less idle (checked in htop) before the compilation.
There are 8 of them, reported as ‘Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz’.

The lxc and lxd versions are both 4.18.

Known differences:

  • The host runs Debian buster (currently oldstable) and the container runs Alpine Linux.
  • The host uses an ext4 partition and the container uses a ZFS storage pool (named ‘tank’).

How can we find out where the difference comes from, or how to fix it?

You probably should start eliminating variables. The biggest difference here is in storage and ZFS can be much slower than ext4 in some cases.

If it’s an option, I’d start by running the test both on the host and in the container using a tmpfs, that way there is no more storage bottleneck and you get to compare mostly scheduling and OS differences.

I ran the test in several configurations. It is the same test as in the previous message: the make step of the Python compilation.

There is no real difference between using the disk and using tmpfs, so I guess the compilation happens mostly in memory in every case:

  • Debian host:

            ext4    tmpfs
    real    2m53    2m53
    user    2m42    2m42
    sys     0m      0m10

  • Debian buster container:

            zfs     tmpfs
    real    3m01    2m59
    user    2m47    2m46
    sys     0m13    0m12

  • Alpine Linux container:

            zfs     tmpfs
    real    5m13    5m13
    user    4m40    4m40
    sys     0m33    0m32
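
To put numbers on these timings, here is a small sketch that converts the durations to seconds and prints the ratio between two of them. The helper names to_seconds and slowdown are made up for this example, and the inputs are the "real" values measured above:

```shell
#!/bin/sh
# Convert a duration such as "5m13" or "0m" to whole seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'm' '{ print $1 * 60 + $2 }'
}

# Print how many times longer the first duration is than the second.
# LC_ALL=C keeps the decimal separator a dot regardless of the locale.
slowdown() {
    LC_ALL=C awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

slowdown 5m13 2m53   # Alpine container vs. Debian host: prints 1.81
slowdown 3m01 2m53   # Debian container vs. Debian host: prints 1.05
```

So the Debian container is within about 5% of the host, while the Alpine container takes roughly 1.8 times as long.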

Commands used with tmpfs:

mkdir /tmp/stephane.tmpfs
sudo mount -t tmpfs tmpfs /tmp/stephane.tmpfs -o size=1g
cp -r Python-3.7.12 /tmp/stephane.tmpfs
# cp -r puts the sources in a subdirectory, so build from there
cd /tmp/stephane.tmpfs/Python-3.7.12
./configure
time make

For the next steps, I will:

  • measure the time taken by the sync command after the compilation
  • compare the kernel scheduler behaviour, as you suggest
  • check whether there are known speed differences between glibc (used by Debian) and musl (used by Alpine Linux)
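
As a starting point for the libc comparison, a quick sketch to confirm which libc each system actually uses. It assumes that ldd --version mentions the libc name on its first line, which is the case for glibc and, as far as I know, for musl (which prints to stderr). The function names are made up for this example:

```shell
#!/bin/sh
# Classify the first line printed by `ldd --version`.
classify_libc() {
    case "$1" in
        *musl*)                        echo musl ;;
        *GLIBC*|*glibc*|*"GNU libc"*)  echo glibc ;;
        *)                             echo unknown ;;
    esac
}

# Report the libc of the current system. stderr is merged into stdout
# because musl's ldd writes its version banner there.
detect_libc() {
    classify_libc "$(ldd --version 2>&1 | head -n 1)"
}

# Usage, on the host and in each container:
#   detect_libc
```

On the Debian host this should print glibc, and in the Alpine container musl.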

What do you think about these steps?

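After some searching, I found that there are known performance differences between the two libc implementations (glibc and musl). Since the Debian buster container is almost as fast as the host, the problem does not seem to be LXC-related in our case, so this thread can be closed.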

speed comparison
a test with rust in docker

This blog post is interesting, but the wheel comparison is not relevant to the tests I did above, because compiling CPython does not use wheel packages.