Duplicate DNS entries + DHCP leases for the same VM in a cluster

elvis.chen · February 14, 2023, 10:39am

Hi LXD users,

We’re trying to set up DNS on an lxdbr0 bridge with DHCP enabled, and then publish the DNS information to another DNS server.

Here is the config for the network zone (IP info hidden):

description: ""
config:
  peers.xxx.address: 7.7.7.7
name: pvn.lxd
used_by: []

Here is the relevant config for the network lxdbr0 (IP info hidden):

  bridge.mtu: "1500"
  dns.zone.forward: pvn.lxd
  ipv4.address:  11.11.11.0/24
  ipv4.dhcp: "true"
  ipv4.dhcp.expiry: 1d
  ipv4.dhcp.ranges: 11.11.11.11 - 11.11.11.111
  ipv4.nat: "false"
  ipv6.address: none

Since the DNS information comes from the DHCP leases, I ran lxc network list-leases lxdbr0:

...
| vm1 | 00:16:3e:fe:65:ac | 11.11.11.12 | DYNAMIC | lxd-cluster0 |
+-----+-------------------+-------------+---------+--------------+
| vm1 | 00:16:3e:fe:65:ac | 11.11.11.12 | DYNAMIC | lxd-cluster1 |
+-----+-------------------+-------------+---------+--------------+
| vm1 | 00:16:3e:fe:65:ac | 11.11.11.12 | DYNAMIC | lxd-cluster2 |
+-----+-------------------+-------------+---------+--------------+
| vm1 | 00:16:3e:fe:65:ac | 11.11.11.12 | DYNAMIC | lxd-cluster3 |
+-----+-------------------+-------------+---------+--------------+
| vm1 | 00:16:3e:fe:65:ac | 11.11.11.12 | DYNAMIC | lxd-cluster4 |
+-----+-------------------+-------------+---------+--------------+
| vm1 | 00:16:3e:fe:65:ac | 11.11.11.12 | DYNAMIC | lxd-cluster5 |
+-----+-------------------+-------------+---------+--------------+
...

We found that the lease is repeated once for each member in the cluster, which creates multiple entries in DNS.
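
To make the duplication easier to spot across many instances, here is a minimal Python sketch that groups rows of the lease table by hostname, MAC, and IP, and reports any lease listed under more than one cluster member. It assumes the five-column layout shown in the excerpt above (hostname, MAC, IP, type, location) and that a header row, if present, starts with HOSTNAME; the function names are hypothetical and not part of LXD.

```python
from collections import defaultdict


def group_leases(table_text):
    """Map (hostname, mac, ip) -> list of cluster members listing that lease."""
    groups = defaultdict(list)
    for line in table_text.splitlines():
        line = line.strip()
        # Skip borders ("+----+"), elisions ("..."), and blank lines.
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Assumption: five columns; a header row starts with HOSTNAME.
        if len(cells) != 5 or cells[0].upper() == "HOSTNAME":
            continue
        hostname, mac, ip, _lease_type, location = cells
        groups[(hostname, mac.lower(), ip)].append(location)
    return dict(groups)


def duplicated(groups):
    """Keep only leases that appear under more than one cluster member."""
    return {key: locs for key, locs in groups.items() if len(locs) > 1}


SAMPLE = """\
| vm1 | 00:16:3e:fe:65:ac | 11.11.11.12 | DYNAMIC | lxd-cluster0|
+-----------+-------------------+--------------+---------+-----------+
| vm1 | 00:16:3e:fe:65:ac |11.11.11.12 | DYNAMIC | lxd-cluster1|
"""

for (hostname, mac, ip), locations in duplicated(group_leases(SAMPLE)).items():
    print(hostname, mac, ip, "listed on:", ", ".join(locations))
```

To use it on real data, the saved output of lxc network list-leases lxdbr0 can be passed to group_leases in place of SAMPLE.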

Is it expected behavior for the lease to be synced to every member like this?

tomp · February 14, 2023, 10:54am

Please can you show lxc config show <instance> --expanded for vm1?

And also please confirm that vm1 only exists on one cluster member?

tomp · February 14, 2023, 11:29am

What version of LXD is this?

elvis.chen · February 15, 2023, 2:10am

The LXD version is 5.10.

I can confirm that this instance exists on only one machine.

Here is the output of lxc config show -e:

architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu jammy amd64 (20221115_07:42)
  image.os: Ubuntu
  image.release: jammy
  image.serial: "20221115_07:42"
  image.type: disk-kvm.img
  image.variant: cloud
  limits.cpu: "10"
  limits.memory: "161061273600"
  security.secureboot: "false"
  volatile.base_image: ec5544c7adf0ec0ec4cc6fb2ad53ae0b516acbb9f23a8ba7aede3b54352e419a
  volatile.cloud-init.instance-id: f8ebe8c9-55d2-4233-9e62-9ec2ca468e79
  volatile.eth0.host_name: tapf0456408
  volatile.eth0.hwaddr: 00:16:3e:a3:7f:b6
  volatile.last_state.power: RUNNING
  volatile.uuid: 4f620e86-3e4d-4e7d-8663-11bc53a43dba
  volatile.vsock_id: "166"
devices:
  eth0:
    boot.priority: "1"
    ipv4.routes.external: 11.11.11.12/32
    name: eth0
    network: lxdbr0
    type: nic

We’re also using netplan to manage some of the network settings, in case that is helpful:

# /etc/netplan/50-cloud-init.yaml
network:
  ethernets:
    enp5s0:
      dhcp4: true
      match:
        macaddress: 00:16:3e:a3:7f:b6
      mtu: 1500
      nameservers:
        addresses:
          - 8.8.8.8
        search:
          - maas
      set-name: enp5s0
  version: 2

tomp · February 15, 2023, 8:11am

Can you show me the output of lxc list --all-projects, please?

tomp · February 15, 2023, 8:12am

OK, so it’s possibly an issue with the ipv4.routes.external setting.
Out of interest, do you have multiple instances with the same route on their NIC?

tomp · February 15, 2023, 8:50am

Can you also confirm you’re not using the same MAC address with multiple instances?

elvis.chen · February 17, 2023, 9:49am

Sorry, I can’t show you everything due to confidentiality. I can only show the listing for this one machine:

$ lxc list --all-projects vm1
+---------+--------+---------+----------------------+------+-----------------+-----------+--------------+
| PROJECT |  NAME  |  STATE  |         IPV4         | IPV6 |      TYPE       | SNAPSHOTS |   LOCATION   |
+---------+--------+---------+----------------------+------+-----------------+-----------+--------------+
| pvn     | vm1    | RUNNING | 11.11.11.12 (enp5s0) |      | VIRTUAL-MACHINE | 0         | lxd-cluster1 |
+---------+--------+---------+----------------------+------+-----------------+-----------+--------------+

No, these addresses are assigned by lxdbr0 DHCP and are unique across all projects.

Yes, the MAC address is auto-assigned as well. I double-checked this using lxc list -f compact -c volatile.eth0.hwaddr --all-projects | uniq -c, which gives me all 1s. (All of our network devices are named eth0.)
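
One caveat with this check: uniq only collapses adjacent identical lines, so a duplicate MAC that is not next to its twin in the listing would still be counted as 1. A minimal Python sketch of an order-independent check follows; it assumes one MAC address per input line (for example, the output of the lxc list command above saved to a file), and the function name is hypothetical.

```python
from collections import Counter


def duplicate_macs(lines):
    """Return {mac: count} for every MAC seen more than once, in any order."""
    counts = Counter(
        line.strip().lower()  # normalize case so AA:BB and aa:bb match
        for line in lines
        if line.strip()       # ignore blank lines
    )
    return {mac: n for mac, n in counts.items() if n > 1}


# Illustrative input only; these addresses are made up.
example = [
    "00:16:3e:aa:bb:cc",
    "00:16:3e:11:22:33",
    "00:16:3e:AA:BB:CC",  # same MAC as the first line, but not adjacent
]
print(duplicate_macs(example))
```

A header line in the listing appears only once, so it would not be reported as a duplicate.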

I’ve been able to change this by manually editing the lease file /var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases. However, the duplicates seem to reappear after some lease renewals.

I also want to point out that this is not happening for all the VMs, only certain ones, and the entries may repeat for only a fraction of the cluster members (for example, 5 out of 20). However, I’ve been unable to tell what distinguishes these machines from the others.

BTW, could this be due to unsynchronized clocks across the cluster? There are indeed time differences between the machines on the order of seconds, sometimes more than 10 seconds.