LXD reporting stale IP address after setting static IP address

After setting an IP address on a VM and booting it back up, the API keeps returning the old IP address at the path network.enp5s0.addresses.0.address for 6+ seconds after the VM is running.

Is this a cache issue, or is it expected because of the way IP addresses are set for VMs?

Which API endpoint is that? (When talking about the API it’s always useful to provide the specific request you’re making, to make it easier to recreate for debugging.)

I am sending a GET /1.0/instances?recursion=2
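For anyone trying to recreate this, here is a minimal Python sketch of pulling the reported addresses out of a parsed GET /1.0/instances?recursion=2 reply. It assumes the reply is a dict with a "metadata" list whose entries carry "name" and a "state" object containing "network". The helper name addresses_by_instance and the sample data are made up for illustration, and transport to the LXD socket is left out.

```python
def addresses_by_instance(response):
    """Map instance name -> {interface: [address, ...]} from a recursion=2 reply."""
    result = {}
    for instance in response.get("metadata") or []:
        state = instance.get("state") or {}
        # "network" can be null for a stopped instance.
        network = state.get("network") or {}
        result[instance["name"]] = {
            iface: [a["address"] for a in (details.get("addresses") or [])]
            for iface, details in network.items()
        }
    return result


# Illustrative reply, trimmed to the keys the helper reads.
sample_reply = {
    "metadata": [
        {
            "name": "ubuntu",
            "state": {
                "network": {
                    "eth0": {
                        "addresses": [
                            {"family": "inet", "address": "10.0.0.50"},
                            {"family": "inet", "address": "10.0.0.34"},
                        ]
                    }
                }
            },
        },
        {"name": "stopped-vm", "state": {"network": None}},
    ]
}

print(addresses_by_instance(sample_reply))
```

Logging this on every poll right after a start makes it easy to see how long the extra addresses stay in the list.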

I have tried a different approach and see the same issue.

Steps to reproduce

  1. Change the static IP address of the VM
  2. Power On
  3. Get the info and state (I have tried just now making these two calls)
  • GET /instances/ubuntu
  • GET /instances/ubuntu/state

It will be hard to reproduce from the command line, since typing or pasting the commands takes about as long as the problem lasts.

This was not an issue with containers, hence my question.

Actually, this is starting to look like a bug: the more I change the static IP address on the VM, the more addresses are added to the list that LXD reports when the VM boots up. It then takes several seconds to clear up.

 [network] => Array
                        (
                            [eth0] => Array
                                (
                                    [addresses] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [family] => inet
                                                    [address] => 10.0.0.40
                                                    [netmask] => 24
                                                    [scope] => global
                                                )

                                            [1] => Array
                                                (
                                                    [family] => inet
                                                    [address] => 10.0.0.34
                                                    [netmask] => 24
                                                    [scope] => global
                                                )

                                            [2] => Array
                                                (
                                                    [family] => inet
                                                    [address] => 10.0.0.37
                                                    [netmask] => 24
                                                    [scope] => global
                                                )

                                            [3] => Array
                                                (
                                                    [family] => inet
                                                    [address] => 10.0.0.33
                                                    [netmask] => 24
                                                    [scope] => global
                                                )

                                            [4] => Array
                                                (
                                                    [family] => inet6
                                                    [address] => fe80::216:3eff:fe60:26ee
                                                    [netmask] => 64
                                                    [scope] => link
                                                )

                                        )

                                    [counters] => Array
                                        (
                                            [bytes_received] => 0
                                            [bytes_sent] => 86
                                            [packets_received] => 0
                                            [packets_sent] => 1
                                        )

                                    [hwaddr] => 00:16:3e:60:26:ee
                                    [host_name] => tap78e45180
                                    [mtu] => 1500
                                    [state] => up
                                    [type] => broadcast
                                )

                        )

Here it is again, but this time the list is larger. It is noticeable right after starting the VM, and then LXD fixes it later on.

{
    "eth0": {
        "addresses": [
            {
                "family": "inet",
                "address": "10.0.0.50",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet",
                "address": "10.0.0.34",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet",
                "address": "10.0.0.40",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet",
                "address": "10.0.0.37",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet",
                "address": "10.0.0.33",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet6",
                "address": "fe80::216:3eff:fe60:26ee",
                "netmask": "64",
                "scope": "link"
            }
        ],
        "counters": {
            "bytes_received": 0,
            "bytes_sent": 86,
            "packets_received": 0,
            "packets_sent": 1
        },
        "hwaddr": "00:16:3e:60:26:ee",
        "host_name": "tap0703e10a",
        "mtu": 1500,
        "state": "up",
        "type": "broadcast"
    }
}

I am setting the IP address by sending a PATCH request to /instances/ubuntu

{
    "architecture": "x86_64",
    "config": {
        "image.architecture": "amd64",
        "image.description": "Ubuntu hirsute amd64 (20210621_07:42)",
        "image.os": "Ubuntu",
        "image.release": "hirsute",
        "image.serial": "20210621_07:42",
        "image.type": "disk-kvm.img",
        "image.variant": "default",
        "limits.cpu": "1",
        "limits.memory": "1GB",
        "security.secureboot": "false",
        "volatile.base_image": "35f9bd69d25246be391ec43fee156fb064e77e9e0045d73686f851e5800731c9",
        "volatile.eth0.hwaddr": "00:16:3e:60:26:ee",
        "volatile.last_state.power": "STOPPED",
        "volatile.uuid": "f917733c-be29-4ae4-ae2a-0de6304bd259"
    },
    "devices": {
        "eth0": {
            "ipv4.address": "10.0.0.99",
            "name": "eth0",
            "nictype": "bridged",
            "parent": "vnet0",
            "type": "nic"
        },
        "root": {
            "path": "/",
            "pool": "default",
            "size": "10GB",
            "type": "disk"
        }
    },
    "ephemeral": false,
    "profiles": [],
    "stateful": false,
    "description": "",
    "created_at": "2021-06-22T13:56:42.735287458+02:00",
    "expanded_config": {
        "image.architecture": "amd64",
        "image.description": "Ubuntu hirsute amd64 (20210621_07:42)",
        "image.os": "Ubuntu",
        "image.release": "hirsute",
        "image.serial": "20210621_07:42",
        "image.type": "disk-kvm.img",
        "image.variant": "default",
        "limits.cpu": "1",
        "limits.memory": "1GB",
        "security.secureboot": "false",
        "volatile.base_image": "35f9bd69d25246be391ec43fee156fb064e77e9e0045d73686f851e5800731c9",
        "volatile.eth0.hwaddr": "00:16:3e:60:26:ee",
        "volatile.last_state.power": "STOPPED",
        "volatile.uuid": "f917733c-be29-4ae4-ae2a-0de6304bd259"
    },
    "expanded_devices": {
        "eth0": {
            "ipv4.address": "10.0.0.99",
            "name": "eth0",
            "nictype": "bridged",
            "parent": "vnet0",
            "type": "nic"
        },
        "root": {
            "path": "/",
            "pool": "default",
            "size": "10GB",
            "type": "disk"
        }
    },
    "name": "ubuntu",
    "status": "Stopped",
    "status_code": 102,
    "last_used_at": "2021-06-22T15:06:15.777953116+02:00",
    "location": "none",
    "type": "virtual-machine"
} 

Ah this might be the ARP neighbour cache from the OS that LXD uses when trying to ascertain the IPs reachable in the VM guest. I’ll see if I can reproduce and confirm.

Is the lxd-agent running inside the VM?

I have no idea. By the time I can get to a terminal (7 seconds + 4 seconds to boot) the problem has resolved itself.

The fact that the addresses are associated with the LXD device name rather than the actual interface name inside the VM (normally enp5s0) suggests the lxd-agent isn’t running, and that LXD is instead relying on the IP neighbour cache to guess the IPs inside the guest.

If you run: ip neigh show dev lxdbr0 do you see the same IPs associated to the MAC address of the VM? They will disappear a short time after not being in use.
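To make that check scriptable, here is a rough Python sketch that filters the text printed by ip neigh show for entries belonging to one MAC address. The function name neighbours_for_mac is hypothetical, and the sample text is illustrative rather than captured from a real host. It assumes each line starts with the IP, carries the MAC after the lladdr token, and ends with the entry state.

```python
def neighbours_for_mac(ip_neigh_output, mac):
    """Return [(ip, state), ...] for neighbour entries whose lladdr equals mac."""
    matches = []
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        if "lladdr" not in fields:
            continue  # e.g. FAILED entries carry no link-layer address
        idx = fields.index("lladdr")
        if idx + 1 >= len(fields):
            continue  # malformed line, nothing after "lladdr"
        if fields[idx + 1].lower() != mac.lower():
            continue
        # First field is the IP, last field is the neighbour state.
        matches.append((fields[0], fields[-1]))
    return matches


# Illustrative output only; real output depends on the host.
sample_output = """\
10.0.0.34 lladdr 00:16:3e:60:26:ee STALE
10.0.0.50 lladdr 00:16:3e:60:26:ee REACHABLE
10.0.0.7 lladdr 00:16:3e:aa:bb:cc STALE
10.0.0.9 FAILED
"""

for ip, state in neighbours_for_mac(sample_output, "00:16:3E:60:26:EE"):
    print(ip, state)
```

The MAC to pass in is the one from volatile.eth0.hwaddr in the instance config.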

How do you get into the terminal? (Always useful to have specifics on this forum.)

So with the lxd-agent I get this in my VM test:

			"network": {
				"enp5s0": {
					"addresses": [
						{
							"address": "10.98.30.3",
							"family": "inet",
							"netmask": "24",
							"scope": "global"
						},
						{
							"address": "fd42:f402:8623:5b6b:216:3eff:fe24:d2f",
							"family": "inet6",
							"netmask": "64",
							"scope": "global"
						},
						{
							"address": "fe80::216:3eff:fe24:d2f",
							"family": "inet6",
							"netmask": "64",
							"scope": "link"
						}
					],

If I then switch into the VM using lxc shell <instance> and run systemctl stop lxd-agent, I get thrown out of the shell (expected), and then I get from the API:

			"network": {
				"eth0": {
					"addresses": [
						{
							"address": "10.98.30.3",
							"family": "inet",
							"netmask": "24",
							"scope": "global"
						},
						{
							"address": "fd42:f402:8623:5b6b:216:3eff:fe24:d2f",
							"family": "inet6",
							"netmask": "64",
							"scope": "global"
						},
						{
							"address": "fe80::216:3eff:fe24:d2f",
							"family": "inet6",
							"netmask": "64",
							"scope": "link"
						}
					],

If I then set a new IP using lxc config device override va eth0 ipv4.address=10.98.30.4, then run lxc stop va followed by lxc start va, the old IP neighbour cache entries are used to show the possible guest IPs until the OS has booted and requested a new IP via DHCP. Even after that, the API still shows whatever IPs remain in the IP neighbour cache.

As soon as the old IPs expire from the IP neighbour cache, and either the lxd-agent starts (and reports the actual IPs inside the guest) or the guest OS makes some sort of activity on the bridge that the LXD host notices and adds to the local IP neighbour cache, the new IP is shown.

If there is a lot of fluctuation in IPs then it is likely you’ll see the old IPs until they expire or the lxd-agent starts.

This behaviour exists so that we can make an educated guess as to what IPs are inside the VM guest without using the lxd-agent (if the lxd-agent is running then this overrides those guesses). This is to support VM OSes (like Windows) that don’t have lxd-agent support.
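One way a client might spot the guess mode, going by the device-name observation earlier in the thread, is sketched below in Python. This is only a heuristic under the assumption that guessed addresses are keyed by the LXD NIC device name (eth0 here) while agent-reported ones use the guest interface name (enp5s0 here). A guest whose interface really is called eth0 would fool it. The function name and sample dicts are made up.

```python
def looks_like_host_side_guess(instance, state):
    """True if any reported interface is named after an LXD NIC device."""
    devices = instance.get("expanded_devices") or {}
    nic_names = {name for name, dev in devices.items() if dev.get("type") == "nic"}
    reported = set((state.get("network") or {}).keys())
    return bool(reported & nic_names)


# Trimmed, illustrative data modelled on the outputs in this thread.
instance = {
    "expanded_devices": {
        "eth0": {"type": "nic", "nictype": "bridged", "parent": "vnet0"},
        "root": {"type": "disk", "path": "/", "pool": "default"},
    }
}
guessed_state = {"network": {"eth0": {"addresses": []}}}
agent_state = {"network": {"lo": {"addresses": []}, "enp5s0": {"addresses": []}}}

print(looks_like_host_side_guess(instance, guessed_state))
print(looks_like_host_side_guess(instance, agent_state))
```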

I use the networking to determine if the instance is actually booted and ready, hence the delays I mentioned above.

The problem is that, for other users of the API, it is giving back useless information. Should guesses not be in a separate key?

The relevant sections of code are:

And

The IP information doesn’t indicate the OS is booted and ready (it could be that the network is up, or even that the lxd-agent is running, but that the rest of the OS hasn’t finished booting yet, so that is not reliable), so it shouldn’t be used for that. It’s informational only.

Without the lxd-agent there is no way to tell the OS state without actually calling into the VM somehow over the network via an external service (e.g. ssh perhaps).

With the lxd-agent running you could use the "processes": property in the instance state, if that is > 0 then the lxd-agent has at least started.
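As a small Python sketch of that check, assuming state is the parsed metadata of GET /1.0/instances/<name>/state (the helper name agent_has_started is made up):

```python
def agent_has_started(state):
    """True once the instance state reports a positive process count."""
    processes = state.get("processes")
    # Treat a missing or non-numeric value the same as "not started".
    return isinstance(processes, int) and processes > 0


print(agent_has_started({"processes": 23}))
print(agent_has_started({"processes": 0}))
```

Per the caveat above, a positive count only says the lxd-agent has started, not that the rest of the OS has finished booting.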

Regardless of how I use networking, the API has been adjusted to return fake data to fix a problem with Windows VMs which only LXD uses. I have to tell my users the IP address could be this, that, or that. Maybe possible IPs should be elsewhere, in a guesses or windows key. If I want to guess then I can look it up; right now I don’t want to guess, I need to determine what the IP address of the machine is.

{
    "eth0": {

        "addresses": [
            {
                "family": "inet",
                "address": "10.0.0.50",
                "netmask": "24",
                "scope": "global"
            }
        ],
        "archive": [
            {
                "family": "inet",
                "address": "10.0.0.1234",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet",
                "address": "10.0.0.34",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet",
                "address": "10.0.0.40",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet",
                "address": "10.0.0.37",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet",
                "address": "10.0.0.33",
                "netmask": "24",
                "scope": "global"
            },
            {
                "family": "inet6",
                "address": "fe80::216:3eff:fe60:26ee",
                "netmask": "64",
                "scope": "link"
            }
        ],
        "counters": {
            "bytes_received": 0,
            "bytes_sent": 86,
            "packets_received": 0,
            "packets_sent": 1
        },
        "hwaddr": "00:16:3e:60:26:ee",
        "host_name": "tap0703e10a",
        "mtu": 1500,
        "state": "up",
        "type": "broadcast"
    }
}
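In the meantime a client can do a similar split on its own side. The Python sketch below compares the reported global IPv4 addresses with the ipv4.address configured on the NIC device, and assumes the guest actually takes that address. The function name classify_ipv4 and the trimmed sample dicts are made up for illustration.

```python
def classify_ipv4(instance, state, device="eth0"):
    """Compare reported global IPv4 addresses with the device's ipv4.address."""
    devices = instance.get("expanded_devices") or {}
    configured = devices.get(device, {}).get("ipv4.address")
    reported = [
        addr["address"]
        for nic in (state.get("network") or {}).values()
        for addr in (nic.get("addresses") or [])
        if addr.get("family") == "inet" and addr.get("scope") == "global"
    ]
    return {
        "configured": configured,
        "seen": configured in reported,
        "unexpected": [ip for ip in reported if ip != configured],
    }


# Trimmed data modelled on the outputs posted above.
instance = {"expanded_devices": {"eth0": {"type": "nic", "ipv4.address": "10.0.0.99"}}}
state = {
    "network": {
        "eth0": {
            "addresses": [
                {"family": "inet", "address": "10.0.0.50", "scope": "global"},
                {"family": "inet", "address": "10.0.0.34", "scope": "global"},
                {"family": "inet6", "address": "fe80::216:3eff:fe60:26ee", "scope": "link"},
            ]
        }
    }
}

print(classify_ipv4(instance, state))
```

Anything in "unexpected" can then be shown to users as a possible leftover rather than as the machine's address.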

@stgraber we could add a config option to disable IP educated guess mode for an instance perhaps?

I have found it is better to wait for networking to be ready. People using the command line possibly don’t have this problem, because by the time they type in the next command the instance could already be running. Using the API it’s different: things are faster because of point and click.
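A generic polling helper for that kind of wait might look like the Python sketch below. The function name wait_until and its parameters are hypothetical. fetch_state stands in for whatever call retrieves the instance state, and is_ready is whichever condition the client trusts (a positive process count, the configured IP showing up, and so on).

```python
import time


def wait_until(fetch_state, is_ready, timeout=30.0, interval=0.5,
               sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_state() until is_ready(state) is true or the timeout passes."""
    deadline = clock() + timeout
    while True:
        state = fetch_state()
        if is_ready(state):
            return state
        if clock() >= deadline:
            raise TimeoutError("instance state did not become ready in time")
        sleep(interval)


# Stand-in for real API calls: three successive state replies.
replies = iter([{"processes": -1}, {"processes": 0}, {"processes": 12}])
final = wait_until(
    lambda: next(replies),
    lambda s: s.get("processes", 0) > 0,
    timeout=5,
    sleep=lambda _seconds: None,  # skip real sleeping in this demo
)
print(final)
```

Passing sleep and clock as parameters is only there to keep the helper easy to exercise without a running VM.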

The processes property is a good tip, thanks for that. I am going to look at how to use it when the image is not Windows.

What advice can you give for detecting whether it is Windows (I have not tried Windows in a VM yet)? The image.os key does not seem reliable; with Alpine Linux, at least, it can have different names.