Failed detecting root disk device

Correct, it should have been:

profiles:
- default

I have a similar issue:

$ lxc launch images:ubuntu/focal test
Creating test
Error: Failed instance creation: Failed creating instance record: Failed initialising instance: Invalid devices: Failed detecting root disk device: No root device could be found

I have these images installed:

lxc image list
+-------+--------------+--------+---------------------------------------------+--------------+-----------+----------+-----------------------------+
| ALIAS | FINGERPRINT  | PUBLIC |                 DESCRIPTION                 | ARCHITECTURE |   TYPE    |   SIZE   |         UPLOAD DATE         |
+-------+--------------+--------+---------------------------------------------+--------------+-----------+----------+-----------------------------+
|       | 15c63eb43b36 | no     | Ubuntu focal amd64 (20210204_07:42)         | x86_64       | CONTAINER | 100.44MB | Feb 4, 2021 at 3:23pm (UTC) |
+-------+--------------+--------+---------------------------------------------+--------------+-----------+----------+-----------------------------+
|       | d1df9c150a9f | no     | ubuntu 20.04 LTS amd64 (release) (20210201) | x86_64       | CONTAINER | 358.84MB | Feb 4, 2021 at 3:19pm (UTC) |
+-------+--------------+--------+---------------------------------------------+--------------+-----------+----------+-----------------------------+

In my case the image has the default profile:

$ lxc image show 15c63eb43b36
auto_update: true
properties:
  architecture: amd64
  description: Ubuntu focal amd64 (20210204_07:42)
  os: Ubuntu
  release: focal
  serial: "20210204_07:42"
  type: squashfs
  variant: default
public: false
expires_at: 1970-01-01T01:00:00+01:00
profiles:
- default

What could be happening?

That suggests your default profile doesn’t have a root device.
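
For reference, a default profile created by lxd init normally contains a root disk entry along these lines (the pool name depends on what was chosen during init), which you can check with lxc profile show default:

devices:
  root:
    path: /
    pool: default
    type: disk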

Thanks! Because it was a new installation, I solved the issue by reinstalling lxd and running lxd init again.

I installed lxd on another machine and I am getting the same error again. In this case, there is no root device in the default profile. So, how can I create a root device?

lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
name: default
used_by: []

I also have a default storage pool:

lxc storage show default
config:
  source: /data/pool-storages/default
  volatile.initial_source: /data/pool-storages/default
description: ""
name: default
driver: btrfs
used_by: []
status: Created
locations:
- none

What is the command to create the root device in the default profile? I tried the command below, but I got a "not found" error, so I guess I am doing something wrong.

lxc config device add default root disk path=/ pool=default

I am not sure I understand what this command does. My guess is that it creates a new device named root, of type disk, in the default profile. The device will be mounted at / in new containers, and the backing volume for each new container will be created in the default pool.

lxc profile device add default root disk path=/ pool=default
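
(The earlier lxc config device add attempt presumably returned "not found" because that command targets an instance, not a profile; lxc profile device add is the profile-level form.) If the command above is right, lxc profile show default should afterwards list a root entry roughly like this, assuming the pool is named default:

devices:
  root:
    path: /
    pool: default
    type: disk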

I think that the documentation is a bit confusing:

lxc profile device add --help
Description:
  Add devices to containers or profiles

Usage:
  lxc profile device add [<remote>:]<container|profile> <device> <type> [key=value...] [flags]

Examples:
  lxc config device add [<remote>:]container1 <device-name> disk source=/share/c1 path=opt
      Will mount the host's /share/c1 onto /opt in the container.

Global Flags:
      --debug         Show all debug messages
      --force-local   Force using the local unix socket
  -h, --help          Print help
  -v, --verbose       Show all information messages
      --version       Print version number

The keyword profile appears in the Usage section, but not in the Examples section.

You’re using an old LXD version:

stgraber@castiana:~/data/code/lxc/lxd (lxc/master)$ lxc profile device add --help
Description:
  Add instance devices

Usage:
  lxc profile device add [<remote>:]<profile> <device> <type> [key=value...] [flags]

Examples:
  lxc profile device add [<remote>:]profile1 <device-name> disk source=/share/c1 path=opt
      Will mount the host's /share/c1 onto /opt in the instance.

Global Flags:
      --debug            Show all debug messages
      --force-local      Force using the local unix socket
  -h, --help             Print help
      --project string   Override the source project
  -q, --quiet            Don't show progress information
  -v, --verbose          Show all information messages
      --version          Print version number

Running

lxc image edit 34aefd9c3268

and setting

profiles:
- default

worked for me.
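
If you prefer not to use the interactive editor, the same change can presumably be made by round-tripping the YAML through a file (the fingerprint here is just the one from this post):

lxc image show 34aefd9c3268 > image.yaml
# edit image.yaml so that it contains:
#   profiles:
#   - default
lxc image edit 34aefd9c3268 < image.yaml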

Hi,

I had the same problem. I deleted the profile, copied the default profile under a new name, relaunched the image, and it worked for me.

lxc profile copy default profilename
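
If you go this route, the copied profile can be applied explicitly at launch time (profilename is just whatever you named the copy, and it still needs to contain a root disk device):

lxc launch images:ubuntu/focal test --profile profilename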

I have the same problem, which appeared after an lxd init failed because the attached storage had already been created with zpool and was therefore unmanaged by LXD.

I deleted the zpool volume and recreated it using lxd init.

storage output:

root@code:/# lxc storage list
+------------+-------------+--------+------------+---------+
|    NAME    | DESCRIPTION | DRIVER |   SOURCE   | USED BY |
+------------+-------------+--------+------------+---------+
| containers |             | zfs    | containers | 0       |
+------------+-------------+--------+------------+---------+

profile output:

root@code:/# lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
name: default
used_by: []

lxc storage show containers output:

root@code:/# lxc storage show containers
config:
  source: containers
  volatile.initial_source: /dev/disk/by-id/scsi-0DO_Volume_containers
  zfs.pool_name: containers
description: ""
name: containers
driver: zfs
used_by: []
status: Created
locations:
- none

Added containers to the default profile:
root@code:/# lxc profile device add default root disk source=/mnt/containers

new profile output:

root@code:/# lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    source: /mnt/containers
    type: disk
name: default
used_by: []

Attempted to create a new container:

root@code:/# lxc launch ubuntu:20.04 vscode
Creating vscode
Error: Failed instance creation: Failed creating instance record: Failed initialising instance: Invalid devices: Failed detecting root disk device: No root device could be found

I can’t express how frustrating it is to try and fix what should be dead simple, as in something as basic as:
lxc profile add default storage containers

LXD already knows about the storage and everything there is configured, including the device type, device location and mount point.

Why are the errors so vague, giving no sense of what the system already knows about the configuration and its dependencies? Also, it let me add a device to the profile without validating it first.

The other problem is that this scenario isn’t covered directly anywhere and the documentation snippets found everywhere are for different versions of LXD. It is just a nightmare.

Also, none of the documentation fragments present how these things go together and their dependencies.

LXD is awesome but, man, are there some severe holes in the documentation. Most documentation failures occur because of the absence of relationships between technical details. Good documentation presents the system well as a sum of connected parts.

I removed root from the profile and successfully re-added it using:

root@code:/# lxc profile device add default root disk path=/ pool=containers
Device root added to default

profile output:

root@code:/# lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: containers
    type: disk
name: default
used_by: []

A new container instance was successfully created.

So now I go looking for confirmation that it was created in the containers ZFS pool/volume, which should be mounted at /mnt/containers on the host server.

Nothing is showing inside:

root@code:/mnt/containers# ll
total 8
drwxr-xr-x 2 root root 4096 Sep 23 02:09 ./
drwxr-xr-x 4 root root 4096 Sep 23 02:09 ../

vscode container output:

root@code:/# lxc info vscode
Name: vscode
Location: none
Remote: unix://
Architecture: x86_64
Created: 2021/09/23 05:09 UTC
Status: Running
Type: container
Profiles: default
Pid: 31875
Ips:
  eth0: inet    10.77.109.216   vethddb3c657
  eth0: inet6   fd42:185c:bcfb:88c0:216:3eff:fea3:9e1f  vethddb3c657
  eth0: inet6   fe80::216:3eff:fea3:9e1f        vethddb3c657
  lo:   inet    127.0.0.1
  lo:   inet6   ::1
Resources:
  Processes: 42
  Disk usage:
    root: 8.80MB
  CPU usage:
    CPU usage (in seconds): 15
  Memory usage:
    Memory (current): 166.04MB
    Memory (peak): 215.13MB
  Network usage:
    eth0:
      Bytes received: 27.29kB
      Bytes sent: 12.12kB
      Packets received: 63
      Packets sent: 81
    lo:
      Bytes received: 1.48kB
      Bytes sent: 1.48kB
      Packets received: 16
      Packets sent: 16
root@code:/# lxc storage show containers
config:
  source: containers
  volatile.initial_source: /dev/disk/by-id/scsi-0DO_Volume_containers
  zfs.pool_name: containers
description: ""
name: containers
driver: zfs
used_by:
- /1.0/images/a068e8daef0f88c667fd0f201ed0de1c48693ee383eeafbee6a51b79b0d29fea
- /1.0/instances/vscode
- /1.0/profiles/default
status: Created
locations:
- none

file system output at volatile.initial_source:

root@code:/dev/disk/by-id# ll
total 0
drwxr-xr-x 2 root root 100 Sep 23 02:43 ./
drwxr-xr-x 8 root root 160 Sep 23 02:43 ../
lrwxrwxrwx 1 root root   9 Sep 23 02:43 scsi-0DO_Volume_containers -> ../../sda
lrwxrwxrwx 1 root root  10 Sep 23 02:43 scsi-0DO_Volume_containers-part1 -> ../../sda1
lrwxrwxrwx 1 root root  10 Sep 23 02:43 scsi-0DO_Volume_containers-part9 -> ../../sda9

So then I check for the ZFS volume and it isn’t listed:

root@code:/mnt# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            474M     0  474M   0% /dev
tmpfs            99M 1004K   98M   1% /run
/dev/vda1        25G  2.4G   22G  10% /
tmpfs           491M     0  491M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           491M     0  491M   0% /sys/fs/cgroup
/dev/loop0       56M   56M     0 100% /snap/core18/2066
/dev/loop1       33M   33M     0 100% /snap/snapd/11841
/dev/loop2       68M   68M     0 100% /snap/lxd/20326
/dev/vda15      105M  5.2M  100M   5% /boot/efi
tmpfs            99M     0   99M   0% /run/user/0
tmpfs           1.0M     0  1.0M   0% /var/snap/lxd/common/ns
/dev/loop3       33M   33M     0 100% /snap/snapd/13170
/dev/loop4       56M   56M     0 100% /snap/core18/2128
/dev/loop5       62M   62M     0 100% /snap/core20/1081
/dev/loop6       68M   68M     0 100% /snap/lxd/21545

I guess LXD unmounted it, but zpool still lists it, which is to be expected:

root@code:/# zpool list
NAME         SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
containers  19.5G   575M  18.9G        -         -     0%     2%  1.00x    ONLINE  -

Ok, so it looks like the ZFS volume was not mounted at /mnt/containers/ and was instead mounted here:

root@code:/# zpool history containers
History for 'containers':
...
2021-09-23.05:09:31 zfs create -o mountpoint=/var/snap/lxd/common/lxd/storage-pools/containers/

Directory output:

root@code:/var/snap/lxd/common/lxd/storage-pools/containers# ll
total 36
drwx--x--x 9 root root 4096 Sep 23 02:43 ./
drwx--x--x 3 root root 4096 Sep 23 02:43 ../
drwx--x--x 3 root root 4096 Sep 23 05:10 containers/
drwx--x--x 2 root root 4096 Sep 23 02:43 containers-snapshots/
drwx--x--x 2 root root 4096 Sep 23 02:43 custom/
drwx--x--x 2 root root 4096 Sep 23 02:43 custom-snapshots/
drwx--x--x 3 root root 4096 Sep 23 05:09 images/
drwx--x--x 2 root root 4096 Sep 23 02:43 virtual-machines/
drwx--x--x 2 root root 4096 Sep 23 02:43 virtual-machines-snapshots/

I had no control over where the ZFS volume was mounted, it seems. It was just mounted in some seemingly random place - in the Snap directory, of all places. I mean, WTF.

But, it’s still only guesswork because zpool can’t tell me the mountpoint for the containers ZFS volume (which is a 20GB Digital Ocean block volume attached to the droplet).

Yes, this is the correct approach:

lxc profile device add default root disk path=/ pool=containers

You can see the ZFS volumes/datasets LXD creates by running zfs list; this will show which pool they are on.
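
For example, something along these lines (using the containers pool name from this thread; actual output will differ per system):

zfs list -r containers
zfs get mountpoint,mounted containers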

As you have correctly found, the storage pool mount points are not customisable and are created inside the LXD directory (which is also inside the snap mount namespace if using the snap package).

It's not random; it is by design. The mount point is internal to LXD, so it creates it inside the LXD internal directory.

We also set canmount=noauto and volmode=none on the datasets/volumes created in the pool so that they don't appear in the host's filesystem or /dev directory (for block volumes) by default.
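
If you want to confirm that, those properties can be inspected across the whole pool (again using the containers pool name from this thread):

zfs get -r canmount,volmode containers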

This error message also appears if you snap install LXD and skip the lxd init (or lxd init --minimal) step before launching an image (lxc launch ubuntu:22.04 containername).
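
In other words, a minimal working sequence on a fresh snap install looks something like this (the image alias and container name are only placeholders):

sudo snap install lxd
lxd init --minimal
lxc launch ubuntu:22.04 c1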

lxd init solved this problem for me