External VM image: cloud-init not detecting data source

I’ve got a VM image that I built myself (using “packer” and scripts). If I run it under libvirt/kvm, and attach a cloud-init data disk, it configures itself.

I took the exact same image and imported it into incus using bin.linux.incus-migrate.x86_64. (I mean the pristine image, not the copy I booted in libvirt/kvm)

However, when I run it under incus, cloud-init doesn’t start, because no data source is detected. Fuller details are given below.

I think my questions boil down to:

  • can you think why a “generic” cloud-init image might not be able to use the metadata provided by incus?
  • is there a known problem with the “LXD” cloud-init data source?
  • if so, have the images on linuxcontainers.org been tweaked to make cloud-init work?
  • are the actual configs used to build images on linuxcontainers.org published anywhere? In distrobuilder I see examples/ubuntu.yaml, but is that the config actually used for the published images, or just an example?

Thanks,

Brian.


Before starting the imported VM I’ve set cloud-init.user-data and cloud-init.network-config in the settings, using the examples from the incus documentation:

config:
  cloud-init.network-config: |
    version: 1
    config:
      - type: physical
        name: eth1
        subnets:
          - type: static
            ipv4: true
            address: 10.10.101.20
            netmask: 255.255.255.0
            gateway: 10.10.101.1
            control: auto
      - type: nameserver
        address: 10.10.10.254
  cloud-init.user-data: |
    #cloud-config
    runcmd:
      - [touch, /run/cloud.init.ran]

It boots, but there’s no mention of cloud-init in the console output. Once it’s booted, thanks to the agent I can get in with incus shell. When I investigate, I find that cloud-init was disabled, very early in the boot, because no data source was found:

root@blahblah:~# cloud-init status
status: disabled

root@blahblah:~# ls /run/cloud-init/
cloud-init-generator.log  cloud.cfg  disabled  ds-identify.log

root@blahblah:~# cat /run/cloud-init/cloud-init-generator.log
/usr/lib/systemd/system-generators/cloud-init-generator normal=/run/systemd/generator early=/run/systemd/generator.early late=/run/systemd/generator.le
checking for datasource
ds-identify rc=1
cloud-init is enabled but no datasource found, disabling
already disabled: no change needed [no /run/systemd/generator.early/multi-user.target.wants/cloud-init.target]

root@blahblah:~# cat /run/cloud-init/ds-identify.log
[up 2.97s] ds-identify
policy loaded: mode=search report=false found=all maybe=all notfound=disabled
/etc/cloud/cloud.cfg.d/90_dpkg.cfg set datasource_list: [ NoCloud, ConfigDrive, OpenNebula, DigitalOcean, Azure, AltCloud, OVF, MAAS, GCE, OpenStack, CloudSigma, SmartOS, Bigstep, Scaleway, AliYun, Ec2, CloudStack, Hetzner, IBMCloud, Oracle, Exoscale, RbxCloud, UpCloud, VMware, Vultr, LXD, NWCS, Akamai, None ]
DMI_PRODUCT_NAME=Standard PC (Q35 + ICH9, 2009)
DMI_SYS_VENDOR=QEMU
DMI_PRODUCT_SERIAL=
DMI_PRODUCT_UUID=90f2dd4a-c3bb-4d10-a0bf-c662923f9eb2
PID_1_PRODUCT_NAME=unavailable
DMI_CHASSIS_ASSET_TAG=
DMI_BOARD_NAME=Incus
FS_LABELS=UEFI,UEFI,cloudimg-rootfs
ISO9660_DEVS=
KERNEL_CMDLINE=BOOT_IMAGE=/boot/vmlinuz-5.15.0-1053-kvm root=UUID=bd03690e-006e-4898-8233-e5896fd85cac ro console=tty1 console=ttyS0
VIRT=qemu
UNAME_KERNEL_NAME=Linux
UNAME_KERNEL_RELEASE=5.15.0-1053-kvm
UNAME_KERNEL_VERSION=#58-Ubuntu SMP Tue Mar 12 12:41:48 UTC 2024
UNAME_MACHINE=x86_64
UNAME_NODENAME=blahblah
UNAME_OPERATING_SYSTEM=GNU/Linux
DSNAME=
DSLIST=NoCloud ConfigDrive OpenNebula DigitalOcean Azure AltCloud OVF MAAS GCE OpenStack CloudSigma SmartOS Bigstep Scaleway AliYun Ec2 CloudStack Hetzner IBMCloud Oracle Exoscale RbxCloud UpCloud VMware Vultr LXD NWCS Akamai None
MODE=search
ON_FOUND=all
ON_MAYBE=all
ON_NOTFOUND=disabled
pid=263 ppid=240
is_container=false
is_ds_enabled(IBMCloud) = true.
is_ds_enabled(IBMCloud) = true.
ec2 platform is 'Unknown'.
No ds found [mode=search, notfound=disabled]. Disabled cloud-init [1]
[up 3.10s] returning 1

root@blahblah:~# ls -l /dev/lxd/sock /dev/incus/sock
srw------- 1 root root 0 Mar 27 17:29 /dev/incus/sock
srw------- 1 root root 0 Mar 27 17:29 /dev/lxd/sock
root@blahblah:~#

I can see that “LXD” is in the DSLIST, but it’s not found.
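To avoid reading the whole log each time, the relevant lines can be pulled out with a small helper. This is only a sketch: the function name ds_summary is made up for illustration, and it matches nothing beyond the line formats that appear in the logs quoted in this thread.

```shell
#!/bin/sh
# Summarise a ds-identify log: the configured datasource list and the verdict.
# ds_summary is an illustrative name, not part of cloud-init.
ds_summary() {
    log="${1:-/run/cloud-init/ds-identify.log}"
    # DSLIST= holds the datasources ds-identify was told to consider.
    grep '^DSLIST=' "$log" | head -n 1
    # The verdict lines, in the forms seen in the logs in this thread.
    grep -E '^(Found single datasource|No ds found|found=)' "$log"
}
```

Run inside the guest as `ds_summary`, or pass the path to a saved copy of the log as the first argument.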

However if I debug it as shown here, then LXD is detected:

root@blahblah:~# cloud-id
disabled
root@blahblah:~# DEBUG_LEVEL=2 DI_LOG=stderr /usr/lib/cloud-init/ds-identify --force
[up 481.51s] ds-identify --force
policy loaded: mode=search report=false found=all maybe=all notfound=disabled
/etc/cloud/cloud.cfg.d/90_dpkg.cfg set datasource_list: [ NoCloud, ConfigDrive, OpenNebula, DigitalOcean, Azure, AltCloud, OVF, MAAS, GCE, OpenStack, CloudSigma, SmartOS, Bigstep, Scaleway, AliYun, Ec2, CloudStack, Hetzner, IBMCloud, Oracle, Exoscale, RbxCloud, UpCloud, VMware, Vultr, LXD, NWCS, Akamai, None ]
DMI_PRODUCT_NAME=Standard PC (Q35 + ICH9, 2009)
DMI_SYS_VENDOR=QEMU
DMI_PRODUCT_SERIAL=
DMI_PRODUCT_UUID=90f2dd4a-c3bb-4d10-a0bf-c662923f9eb2
PID_1_PRODUCT_NAME=unavailable
DMI_CHASSIS_ASSET_TAG=
DMI_BOARD_NAME=Incus
FS_LABELS=UEFI,UEFI,cloudimg-rootfs
ISO9660_DEVS=
KERNEL_CMDLINE=BOOT_IMAGE=/boot/vmlinuz-5.15.0-1053-kvm root=UUID=bd03690e-006e-4898-8233-e5896fd85cac ro console=tty1 console=ttyS0
VIRT=qemu
UNAME_KERNEL_NAME=Linux
UNAME_KERNEL_RELEASE=5.15.0-1053-kvm
UNAME_KERNEL_VERSION=#58-Ubuntu SMP Tue Mar 12 12:41:48 UTC 2024
UNAME_MACHINE=x86_64
UNAME_NODENAME=blahblah
UNAME_OPERATING_SYSTEM=GNU/Linux
DSNAME=
DSLIST=NoCloud ConfigDrive OpenNebula DigitalOcean Azure AltCloud OVF MAAS GCE OpenStack CloudSigma SmartOS Bigstep Scaleway AliYun Ec2 CloudStack Hetzner IBMCloud Oracle Exoscale RbxCloud UpCloud VMware Vultr LXD NWCS Akamai None
MODE=search
ON_FOUND=all
ON_MAYBE=all
ON_NOTFOUND=disabled
pid=1328 ppid=1327
is_container=false
Checking for datasource 'NoCloud' via 'dscheck_NoCloud'
check for 'NoCloud' returned not-found[1]
Checking for datasource 'ConfigDrive' via 'dscheck_ConfigDrive'
is_ds_enabled(IBMCloud) = true.
check for 'ConfigDrive' returned not-found[1]
Checking for datasource 'OpenNebula' via 'dscheck_OpenNebula'
check for 'OpenNebula' returned not-found[1]
Checking for datasource 'DigitalOcean' via 'dscheck_DigitalOcean'
check for 'DigitalOcean' returned not-found[1]
Checking for datasource 'Azure' via 'dscheck_Azure'
check for 'Azure' returned not-found[1]
Checking for datasource 'AltCloud' via 'dscheck_AltCloud'
check for 'AltCloud' returned not-found[1]
Checking for datasource 'OVF' via 'dscheck_OVF'
check for 'OVF' returned not-found[1]
Checking for datasource 'MAAS' via 'dscheck_MAAS'
check for 'MAAS' returned not-found[1]
Checking for datasource 'GCE' via 'dscheck_GCE'
check for 'GCE' returned not-found[1]
Checking for datasource 'OpenStack' via 'dscheck_OpenStack'
is_ds_enabled(IBMCloud) = true.
check for 'OpenStack' returned not-found[1]
Checking for datasource 'CloudSigma' via 'dscheck_CloudSigma'
check for 'CloudSigma' returned not-found[1]
Checking for datasource 'SmartOS' via 'dscheck_SmartOS'
check for 'SmartOS' returned not-found[1]
Checking for datasource 'Bigstep' via 'dscheck_Bigstep'
check for 'Bigstep' returned not-found[1]
Checking for datasource 'Scaleway' via 'dscheck_Scaleway'
check for 'Scaleway' returned not-found[1]
Checking for datasource 'AliYun' via 'dscheck_AliYun'
check for 'AliYun' returned not-found[1]
Checking for datasource 'Ec2' via 'dscheck_Ec2'
ec2 platform is 'Unknown'.
check for 'Ec2' returned not-found[1]
Checking for datasource 'CloudStack' via 'dscheck_CloudStack'
check for 'CloudStack' returned not-found[1]
Checking for datasource 'Hetzner' via 'dscheck_Hetzner'
check for 'Hetzner' returned not-found[1]
Checking for datasource 'IBMCloud' via 'dscheck_IBMCloud'
ibm_provisioning=false: config '/root/provisioningConfiguration.cfg' did not exist.
check for 'IBMCloud' returned not-found[1]
Checking for datasource 'Oracle' via 'dscheck_Oracle'
check for 'Oracle' returned not-found[1]
Checking for datasource 'Exoscale' via 'dscheck_Exoscale'
check for 'Exoscale' returned not-found[1]
Checking for datasource 'RbxCloud' via 'dscheck_RbxCloud'
check for 'RbxCloud' returned not-found[1]
Checking for datasource 'UpCloud' via 'dscheck_UpCloud'
check for 'UpCloud' returned not-found[1]
Checking for datasource 'VMware' via 'dscheck_VMware'
check for 'VMware' returned not-found[1]
Checking for datasource 'Vultr' via 'dscheck_Vultr'
check for 'Vultr' returned not-found[1]
Checking for datasource 'LXD' via 'dscheck_LXD'
check for 'LXD' returned found
Checking for datasource 'NWCS' via 'dscheck_NWCS'
check for 'NWCS' returned not-found[1]
Checking for datasource 'Akamai' via 'dscheck_Akamai'
check for 'Akamai' returned not-found[1]
Checking for datasource 'None' via 'dscheck_None'
check for 'None' returned not-found[1]
found=LXD maybe=
Found single datasource: LXD
[up 481.60s] returning 0

systemctl list-dependencies doesn’t show any cloud-init components.

I found /lib/systemd/system/cloud-init-local.service. None of the conditions which might stop it running appear to be true:

root@blahblah:~# ls -l /etc/cloud/cloud-init.disabled
ls: cannot access '/etc/cloud/cloud-init.disabled': No such file or directory
root@blahblah:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.0-1053-kvm root=UUID=bd03690e-006e-4898-8233-e5896fd85cac ro console=tty1 console=ttyS0

However, the unit doesn’t show any information, and AFAICS isn’t triggered:

root@blahblah:~# systemctl status cloud-init-local.service
○ cloud-init-local.service - Initial cloud-init job (pre-networking)
     Loaded: loaded (/lib/systemd/system/cloud-init-local.service; enabled; vendor preset: enabled)
     Active: inactive (dead)

I can kick it by hand with systemctl start cloud-init-local and then cloud-init starts (e.g. /etc/netplan/50-cloud-init.yaml is created)

If I do cloud-init clean --machine-id -s -c all and then reboot, it still doesn’t detect the LXD data source.


For comparison, I tried one of the linuxcontainers.org images.

$ incus init images:ubuntu/22.04/cloud --vm
Creating the instance
Instance name is: legible-lemur

$ incus config edit legible-lemur
... set cloud-init.* as above

$ incus start --console legible-lemur
To detach from the console, press: <ctrl>+a q
BdsDxe: loading Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
BdsDxe: starting Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
[    3.137368] systemd-udevd[268]: id: Process 'dmi_memory_id' terminated by signal TERM.
[    3.138873] systemd-udevd[268]: id: Failed to wait for spawned command 'dmi_memory_id': Input/output error
[    3.140317] systemd-udevd[268]: id: /usr/lib/udev/rules.d/70-memory.rules:6 Failed to execute 'dmi_memory_id', ignoring: Input/output error
[    3.144342] virtio_gpu virtio9: [drm] drm_plane_enable_fb_damage_clips() not called
[    3.252762] reboot: Restarting system
Error: write unix:@->/run/incus/legible-lemur/qemu.console: broken pipe

Bleurgh! But actually it did boot, and I can log in to it, and cloud-init has run.

When I check the status, I see it used the “NoCloud” data source, not “LXD”:

root@legible-lemur:~# cat /run/cloud-init/ds-identify.log
[up 1.83s] ds-identify
policy loaded: mode=search report=false found=all maybe=all notfound=disabled
/etc/cloud/cloud.cfg.d/90_dpkg.cfg set datasource_list: [ NoCloud, ConfigDrive, OpenNebula, DigitalOcean, Azure, AltCloud, OVF, MAAS, GCE, OpenStack, ]
DMI_PRODUCT_NAME=Standard PC (Q35 + ICH9, 2009)
DMI_SYS_VENDOR=QEMU
DMI_PRODUCT_SERIAL=
DMI_PRODUCT_UUID=e3f8ba75-5893-4e1b-a842-0e943f209ae5
PID_1_PRODUCT_NAME=unavailable
DMI_CHASSIS_ASSET_TAG=
DMI_BOARD_NAME=Incus
FS_LABELS=rootfs,UEFI,UEFI
ISO9660_DEVS=
KERNEL_CMDLINE=BOOT_IMAGE=/boot/vmlinuz-5.15.0-101-generic root=/dev/sda2 ro quiet splash console=tty1 console=ttyS0
VIRT=qemu
UNAME_KERNEL_NAME=Linux
UNAME_KERNEL_RELEASE=5.15.0-101-generic
UNAME_KERNEL_VERSION=#111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024
UNAME_MACHINE=x86_64
UNAME_NODENAME=legible-lemur
UNAME_OPERATING_SYSTEM=GNU/Linux
DSNAME=
DSLIST=NoCloud ConfigDrive OpenNebula DigitalOcean Azure AltCloud OVF MAAS GCE OpenStack CloudSigma SmartOS Bigstep Scaleway AliYun Ec2 CloudStack Hete
MODE=search
ON_FOUND=all
ON_MAYBE=all
ON_NOTFOUND=disabled
pid=223 ppid=207
is_container=false
check for 'NoCloud' returned found
is_ds_enabled(IBMCloud) = true.
is_ds_enabled(IBMCloud) = true.
ec2 platform is 'Unknown'.
Found single datasource: NoCloud
[up 1.88s] returning 0

and logs show the configs were picked up from files under /var/lib/cloud/seed/nocloud-net:

root@legible-lemur:~# less /var/log/cloud-init.log
...
2024-03-27 18:05:39,908 - util.py[DEBUG]: Read 55 bytes from /var/lib/cloud/seed/nocloud-net/user-data
2024-03-27 18:05:39,908 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud-net/meta-data (quiet=False)
2024-03-27 18:05:39,908 - util.py[DEBUG]: Read 58 bytes from /var/lib/cloud/seed/nocloud-net/meta-data
2024-03-27 18:05:39,908 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud-net/vendor-data (quiet=False)
2024-03-27 18:05:39,909 - util.py[DEBUG]: Read 17 bytes from /var/lib/cloud/seed/nocloud-net/vendor-data
2024-03-27 18:05:39,909 - util.py[DEBUG]: Reading from /var/lib/cloud/seed/nocloud-net/network-config (quiet=False)
2024-03-27 18:05:39,910 - util.py[DEBUG]: Read 246 bytes from /var/lib/cloud/seed/nocloud-net/network-config

root@legible-lemur:~# ls /var/lib/cloud/seed/nocloud-net/
meta-data  network-config  user-data  vendor-data

How those files were made available to the VM by incus, I don’t know.

A thought: this particular VM happens to be using btrfs as its filesystem. Is it possible that incus is stuffing files into /var/lib/cloud/seed/nocloud-net/* by modifying the filesystem directly - and that it can’t do this with btrfs?

To get cloud-init to behave with Incus you need either:

  • incus-agent set up in the VM
  • a cloud-init CD-ROM drive added to the VM

In the agent case, the agent fetches the cloud-init data from the host system and writes it to the NoCloud seed directory on the filesystem. If there’s no agent, then a source=cloud-init:config disk attached to the VM should get detected by cloud-init and used for metadata.
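A rough way to tell from inside the guest which of the two paths is in effect is sketched below. The function name seed_source is hypothetical, the seed path is the one seen in the cloud-init log earlier in the thread, and treating a non-empty ISO9660_DEVS entry in ds-identify.log as a sign of the config disk is an assumption rather than documented Incus behaviour.

```shell
#!/bin/sh
# Report which delivery path appears to be in effect.
# seed_source is an illustrative name; the optional ROOT argument lets it be
# pointed at a mounted image instead of the running system.
seed_source() {
    root="${1:-}"
    seed="$root/var/lib/cloud/seed/nocloud-net"
    log="$root/run/cloud-init/ds-identify.log"
    if [ -e "$seed/meta-data" ]; then
        # Files present in the NoCloud seed directory (the agent path).
        echo "agent seed: $seed"
    elif grep -q '^ISO9660_DEVS=.' "$log" 2>/dev/null; then
        # ds-identify saw at least one ISO9660 device (assumed: the config disk).
        echo "iso9660 device present"
    else
        echo "no seed found"
    fi
}
```

On the VM described in the first post this would be expected to print "no seed found", since ds-identify logged an empty ISO9660_DEVS and found no NoCloud seed.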

Thank you! Maybe worth adding that info here?

The information is currently buried under Instances > Instance Configuration > Devices > Type: disk

The magic line:

incus config device add blahblah cloud-init disk source=cloud-init:config
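For repeat use, that line plus a restart can be wrapped in a small script. This is a sketch only: attach_cloud_init, run and the DRY_RUN switch are names invented here, and it assumes that a restart is enough for ds-identify to see the new disk on the next boot.

```shell
#!/bin/sh
# Attach the cloud-init config disk to a VM and restart it.
# attach_cloud_init, run and DRY_RUN are illustrative names.

# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

attach_cloud_init() {
    vm="$1"
    # Same command as above, with the instance name as a parameter.
    run incus config device add "$vm" cloud-init disk source=cloud-init:config
    # Restart so that detection runs again with the disk attached.
    run incus restart "$vm"
}
```

Setting DRY_RUN=1 before calling `attach_cloud_init blahblah` prints the two commands without touching the instance. Afterwards, `cloud-init status` inside the guest shows whether a data source was picked up.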

I guess the “lxd” data source is only used for containers, or if the agent is present?

Cheers,

Brian.

Correct, the LXD source requires /dev/lxd/sock, which is also something that you need the agent to provide.


Thanks for the doc update. And I found the image builder config at lxc-ci/images/ubuntu.yaml (main branch of lxc/lxc-ci on GitHub):

- name: incus-agent
  generator: incus-agent
  types:
  - vm