Poor network performance on cloned VM

I’m trying to set up my own pool of GitHub runners using GARM with Incus VMs.
As long as I’m using the image images:ubuntu/24.04/cloud, everything seems to work fine, as you can see here: upstream image

But the idea was to have personalized workers, so I started experimenting with modifications of that upstream image, adding some packages and tuning things. It turned out that the modified and published image acts … strange. So I also tried publishing the remote image without any modifications, and it has the same weird networking issues.

So in short:

incus launch images:ubuntu/24.04/cloud runner-build --vm -c limits.cpu=3 -c limits.memory=7GiB

Such VMs work fine.

incus stop runner-build

incus publish runner-build --alias ghr

And now VMs based on that ghr image have various issues and are completely unreliable.
The same tests as above, launched on new VMs: cloned image

To be honest, I’m not sure what I should check to debug this issue. I tried comparing the images, the VMs, and their networking profiles, and everything seems to be the same… only the behavior is different.

I’d be thankful for any hints on what to check.

Make sure you’re not somehow dealing with one of:

  • Duplicated IP addresses between your VMs
  • Duplicated MAC addresses between your VMs
  • Duplicated machine ID between your VMs

Any of which could cause what you’re describing.
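
A quick way to check the first item is to look for IPv4 addresses that show up more than once in the `incus ls` output. The sketch below is only an illustration: `duplicate_ips` is a hypothetical helper name, and it assumes addresses are printed as `ADDRESS (INTERFACE)`, the way the `incus ls` table shows them.

```shell
# duplicate_ips IFACE
# Hypothetical helper: reads `incus ls` output on stdin and prints each IPv4
# address that is listed more than once on the given interface.
duplicate_ips() {
    iface="$1"
    # Match "a.b.c.d (iface)", keep only the address, then report repeats.
    grep -oE "([0-9]{1,3}\.){3}[0-9]{1,3} \(${iface}\)" |
        cut -d' ' -f1 |
        sort |
        uniq -d
}
```

It could be used as `incus ls | duplicate_ips enp5s0`; empty output means no address is shared on that interface. Filtering by interface avoids false positives from addresses that are legitimately identical inside every VM, such as a local Docker bridge.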

Seems you’re right… it’s using a single IP for several of these machines.

root@ghr:~# incus ls
+-------------------+---------+------------------------+--------------------------------------------------+-----------------+-----------+
|       NAME        |  STATE  |          IPV4          |                       IPV6                       |      TYPE       | SNAPSHOTS |
+-------------------+---------+------------------------+--------------------------------------------------+-----------------+-----------+
| garm-Ebgv1tG823Xx | RUNNING | 10.181.70.106 (enp5s0) | fd42:dfbf:8e12:9c7e:1266:6aff:fe84:b478 (enp5s0) | VIRTUAL-MACHINE | 0         |
+-------------------+---------+------------------------+--------------------------------------------------+-----------------+-----------+
| garm-h7hN3tvoS7zc | RUNNING | 172.17.0.1 (docker0)   | fd42:dfbf:8e12:9c7e:1266:6aff:fe7f:3350 (enp5s0) | VIRTUAL-MACHINE | 0         |
|                   |         | 10.181.70.106 (enp5s0) |                                                  |                 |           |
+-------------------+---------+------------------------+--------------------------------------------------+-----------------+-----------+
| garm-r2B4a6omFr1B | RUNNING | 10.181.70.106 (enp5s0) | fd42:dfbf:8e12:9c7e:1266:6aff:fe33:6f9b (enp5s0) | VIRTUAL-MACHINE | 0         |
+-------------------+---------+------------------------+--------------------------------------------------+-----------------+-----------+
| garm-zUKYcPqMVHZZ | RUNNING | 172.17.0.1 (docker0)   | fd42:dfbf:8e12:9c7e:1266:6aff:fee8:72dd (enp5s0) | VIRTUAL-MACHINE | 0         |
|                   |         | 10.181.70.106 (enp5s0) |                                                  |                 |           |
+-------------------+---------+------------------------+--------------------------------------------------+-----------------+-----------+
| garm-zXvvXuixtVvp | RUNNING | 10.181.70.248 (enp5s0) | fd42:dfbf:8e12:9c7e:1266:6aff:fe1c:e077 (enp5s0) | VIRTUAL-MACHINE | 0         |
+-------------------+---------+------------------------+--------------------------------------------------+-----------------+-----------+
| runner-build      | STOPPED |                        |                                                  | VIRTUAL-MACHINE | 0         |
+-------------------+---------+------------------------+--------------------------------------------------+-----------------+-----------+

But why?

MAC address will be generated from machine-id. Having the same machine-id in multiple VMs will result in having a bad time.

But the MACs are unique, and netplan is set up this way:

root@garm-zXvvXuixtVvp:~# cat /etc/netplan/50-cloud-init.yaml
network:
  version: 2
  ethernets:
    enp5s0:
      dhcp4: true

Exactly, networkd defaults to using the machine-id for DHCP leases, not the MAC.

There’s a netplan config option to force the DHCP client to use the MAC address as its identifier.
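
In netplan this is the `dhcp-identifier: mac` key. A minimal sketch of adding it to the config shown above follows; `write_netplan_mac_id` is a hypothetical helper, and the file path and interface name are arguments rather than anything confirmed in this thread.

```shell
# write_netplan_mac_id FILE IFACE
# Hypothetical helper: writes a netplan config that enables DHCPv4 on IFACE
# and identifies the client by MAC address instead of a machine-id-derived ID.
write_netplan_mac_id() {
    file="$1"
    iface="$2"
    cat > "$file" <<EOF
network:
  version: 2
  ethernets:
    ${iface}:
      dhcp4: true
      dhcp-identifier: mac
EOF
}
```

For example, `write_netplan_mac_id /etc/netplan/50-cloud-init.yaml enp5s0` followed by `netplan apply`. Since cloud-init may regenerate that particular file, writing the setting to a separate file under `/etc/netplan/` may be the safer choice.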

Thanks @stgraber and @pox.
Indeed, there were still some leftovers from cloud-init. After removing them, it works fine now.
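
The thread does not say exactly which leftovers were removed. One common way to prepare a VM before `incus publish`, shown here only as a sketch under that assumption, is to empty `/etc/machine-id` so that systemd generates a fresh ID on the next boot. `reset_machine_id` is a hypothetical helper that takes the root of the filesystem to clean as its argument.

```shell
# reset_machine_id ROOT
# Hypothetical helper: clears the machine-id under ROOT ("/" inside the VM).
reset_machine_id() {
    root="${1%/}"
    # An empty /etc/machine-id is regenerated by systemd at the next boot.
    : > "${root}/etc/machine-id"
    # Remove the D-Bus copy only if it is a separate regular file holding the
    # old ID; a symlink to /etc/machine-id is left alone.
    dbus_id="${root}/var/lib/dbus/machine-id"
    if [ -f "$dbus_id" ] && [ ! -L "$dbus_id" ]; then
        rm -f "$dbus_id"
    fi
}
```

Inside the build VM this would be run as `reset_machine_id /`, typically together with `cloud-init clean --logs` so that cloud-init runs again on first boot, before stopping the VM and publishing it.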
