Introducing MicroCloud

Nah, the I/O error in the devicehealth module can be ignored, that’s fine.
So Ceph cluster health looks good in your case.
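
For reference, the quick way to double-check overall cluster health from any node is something like this (a minimal sketch; adjust to your setup):

# overall status: health flags, mon/osd counts, pool usage
microceph.ceph status

# more detail on any warning, e.g. the devicehealth one mentioned above
microceph.ceph health detail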

I’m a bit confused about you having issues with lxd init, as lxd init isn’t needed when using MicroCloud. Are you using microceph standalone instead?

Is this the issue?

 --size 0B 

Shouldn’t be. That’s our normal empty volume that we use for tracking LXD’s use.
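
For reference, once pool creation works, that 0B tracking image is visible with plain rbd commands (pool and image names here just mirror the failing command further down in the thread; adjust to yours):

# list RBD images in the LXD pool; the lxd_<pool> tracking image should show up
rbd --cluster ceph --pool lxd ls

# and show its details
rbd --cluster ceph --pool lxd info lxd_lxd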

@stgraber sorry for the confusion.
This was not MicroCloud; this was ‘lxd init’ run against microceph, and it failed with the errors below.

root@ggpu01:~# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What IP address or DNS name should be used to reach this server? [default=10.128.13.221]: 
Are you joining an existing cluster? (yes/no) [default=no]: 
What member name should be used to identify this server in the cluster? [default=ggpu01]: 
Setup password authentication on the cluster? (yes/no) [default=no]: 
Do you want to configure a new local storage pool? (yes/no) [default=yes]: no
Do you want to configure a new remote storage pool? (yes/no) [default=no]: yes
Name of the storage backend to use (cephobject, ceph, cephfs) [default=ceph]: 
Create a new CEPH pool? (yes/no) [default=yes]: 
Name of the existing CEPH cluster [default=ceph]: 
Name of the OSD storage pool [default=lxd]: 
Number of placement groups [default=32]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: yes
Name of the existing bridge or host interface: br0
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]: 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
Error: Failed to create storage pool "remote": Failed to run: rbd --id admin --cluster ceph --pool lxd --image-feature layering --size 0B create lxd_lxd: exit status 95 (2022-12-23T09:09:30.736+0000 7fe9be7fc700 -1 librbd::image::CreateRequest: 0x5586e61cd480 handle_add_image_to_directory: error adding image to directory: (95) Operation not supported
rbd: create error: (95) Operation not supported)

@stgraber this issue also shows up after running lxc storage create remote ceph when using microceph and LXD separately (standalone).
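
For context, the standalone command was roughly the following (config values are illustrative, matching the lxd init answers above; exact values may have differed):

lxc storage create remote ceph \
    ceph.cluster_name=ceph \
    ceph.osd.pool_name=lxd \
    ceph.user.name=admin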

Anything ceph related in journalctl -n 300 on any of those systems?
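
If it helps narrow things down, filtering the journal for the relevant daemons should be enough (just a suggestion):

journalctl -n 300 --no-pager | grep -iE 'ceph|rbd|osd'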

Same error when setting up MicroCloud:

Error: Failed to run: rbd --id admin --cluster ceph --pool remote --image-feature layering --size 0B create lxd_remote: exit status 95 (2022-12-26T02:21:34.190+0000 7fc12d7fa700 -1 librbd::image::CreateRequest: 0x559be7987480 handle_add_image_to_directory: error adding image to directory: (95) Operation not supported
rbd: create error: (95) Operation not supported)

nothing at storage creation time.

journalctl -n 300 output:

This is from one of the other cluster members.

Hi,
Any update on this error?
I tested the release today and got the same results.

root@microcloud01:/home/nick# snap install lxd microceph microcloud
microcloud 0+git.d78a41a from Canonical✓ installed
lxd 5.9-76c110d from Canonical✓ installed
microceph 0+git.94a7cc7 from Canonical✓ installed
root@microcloud01:/home/nick# microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.0.41]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "microcloud03" at "192.168.0.40"
 Found "microcloud02" at "192.168.0.39"

Ending scan
Initializing a new cluster
 Local MicroCloud is ready
 Local MicroCeph is ready
 Local LXD is ready
Awaiting cluster formation...
 Peer "microcloud03" has joined the cluster
 Peer "microcloud02" has joined the cluster
Cluster initialization is complete
Would you like to add additional local disks to MicroCeph? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:

Select which disks to wipe:

Adding 3 disks to MicroCeph
Error: Failed to run: rbd --id admin --cluster ceph --pool remote --image-feature layering --size 0B create lxd_remote: exit status 95 (2023-01-04T23:11:59.366+0000 7f152cff9700 -1 librbd::image::CreateRequest: 0x5634aa3e9480 handle_add_image_to_directory: error adding image to directory: (95) Operation not supported
rbd: create error: (95) Operation not supported)
root@microcloud01:/home/nick# 

Some early issues with MicroCeph have been fixed today and both MicroCloud and MicroCeph have been updated.

If you’ve had any issues with either of the projects, please give them another try!

Specifically, this fixes issues with rbd create as well as the reported I/O error in microceph.ceph status, which was related. Basically, there was an issue with module loading within the OSD daemon that would prevent the creation of RBD images but would not otherwise prevent Ceph from starting up.
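
If you want to confirm the fix without re-running the whole init, a manual RBD create/remove against the pool should now succeed. This assumes the rbd client tool is available on the host, as in the failing command above; the image name is just an example:

# create and immediately remove a small test image
rbd --id admin --cluster ceph --pool lxd --size 128M create test-img
rbd --id admin --cluster ceph --pool lxd rm test-img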

Thanks for the update.
I just tried it again, but this time I get a different error.
I tried this twice.
In between each install I run:

snap stop microcloud microceph lxd && \
    snap disable microcloud && snap disable microceph && snap disable lxd && \
    snap remove --purge microcloud && snap remove --purge microceph && snap remove --purge lxd

Error message during install:

Awaiting cluster formation...
Timed out waiting for a response from all cluster members
Cluster initialization is complete
Error: LXD service cluster does not match MicroCloud

Hmm, I’ve never seen that error before, very odd.

@masnax any idea what that one means?

What’s the size of your cluster?
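
For reference, comparing the member lists on both sides is usually enough to spot the mismatch (a hedged sketch; microceph’s subcommand naming may differ by revision):

lxc cluster list
microceph cluster list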

Hi,

I just tried this on 4 KVM VMs with Ubuntu Server 20.04.

$ sudo apt update && sudo apt upgrade -y
$ sudo snap install lxd microceph microcloud

was executed successfully on all VMs.

Then I got the following error when initializing MicroCloud.

ahmad@node1:~$ sudo microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.0.201]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "node4" at "192.168.0.204"
 Found "node3" at "192.168.0.203"
 Found "node2" at "192.168.0.202"

Ending scan
Initializing a new cluster
Error: Failed to bootstrap local MicroCloud: Post "http://control.socket/cluster/control": dial unix /var/snap/microcloud/common/state/control.socket: connect: no such file or directory
ahmad@node1:~$ sudo ls -l /var/snap/microcloud/common/state/
total 28
-rw-r--r-- 1 root root  757 Jan  6 02:13 cluster.crt
-rw------- 1 root root  288 Jan  6 02:13 cluster.key
-rw-r--r-- 1 root root   40 Jan  6 02:16 daemon.yaml
drwx------ 2 root root 4096 Jan  6 02:17 database
-rw-r--r-- 1 root root  757 Jan  6 02:08 server.crt
-rw------- 1 root root  288 Jan  6 02:08 server.key
drwx------ 2 root root 4096 Jan  6 02:16 truststore
ahmad@node1:~$

This lab I am just doing on a local Hyper-V instance with 3 VMs.
Each VM is set up with:

  • 8 cores
  • 4 GB RAM
  • 20 GB OS disk
  • 30 GB Ceph disk

I’ll retest soon.

Can you show snap services?
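
For example, something like this (a hedged sketch; the exact service names depend on the snap revisions):

# are the daemons actually running?
snap services lxd microceph microcloud

# if microcloud's daemon isn't up, its recent log lines usually explain why
journalctl -u snap.microcloud.daemon -n 50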

I just reinstalled and it finally completed successfully!!
Thanks!

ahmad@node1:~$ snap list
Name        Version               Rev    Tracking       Publisher   Notes
core18      20221212              2667   latest/stable  canonical✓  base
core20      20221212              1778   latest/stable  canonical✓  base
core22      20221212              469    latest/stable  canonical✓  base
lxd         4.0.9-a29c6f1         24061  4.0/stable/…   canonical✓  -
microceph   0+git.00fe8d8         120    latest/stable  canonical✓  -
microcloud  0+git.d78a41a         70     latest/stable  canonical✓  -
snapd       2.58+git315.gee783cc  18180  latest/edge    canonical✓  snapd

Apologies for the very long replies. I am making them long and detailed in the hope that this helps with improvements or documentation.

To update the lxd snap, I removed it and installed it again, so now I have LXD 5.9. Then I tried to initialize MicroCloud again, but it failed because the cluster had already been created and node1 had already been added.
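
(Side note for anyone reading later: a refresh is normally enough to move to a newer LXD without removing it, which avoids the membership problem below. Something like:)

snap refresh lxd --channel=latest/stable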

ahmad@node1:~$ sudo snap list
Name        Version        Rev    Tracking       Publisher   Notes
core18      20210309       1997   latest/stable  canonical✓  base
core20      20221212       1778   latest/stable  canonical✓  base
core22      20221212       469    latest/stable  canonical✓  base
lxd         5.9-9879096    24175  latest/stable  canonical✓  -
microceph   0+git.00fe8d8  120    latest/stable  canonical✓  -
microcloud  0+git.d78a41a  70     latest/stable  canonical✓  -
snapd       2.49.2         11588  latest/stable  canonical✓  snapd
ahmad@node1:~$ sudo microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.0.201]: 
Scanning for eligible servers...
Press enter to end scanning for servers

Ending scan

Initializing a new cluster
Error: Failed to bootstrap local MicroCloud: Failed to initialize local remote entry: A remote with name "node1" already exists
ahmad@node1:~$

So I ended up re-creating the 4 nodes (VMs) from scratch.
Now I am having a new issue: the added disks are not showing up in /dev/disk/by-id/; in fact, the by-id directory is not there at all.

ahmad@node1:~$ sudo -i
root@node1:~# microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.0.201]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "node4" at "192.168.0.204"
 Found "node3" at "192.168.0.203"
 Found "node2" at "192.168.0.202"

Ending scan
Initializing a new cluster
 Local MicroCloud is ready
 Local MicroCeph is ready
 Local LXD is ready
Awaiting cluster formation...
 Peer "node3" has joined the cluster
 Peer "node2" has joined the cluster
 Peer "node4" has joined the cluster
Cluster initialization is complete
Would you like to add additional local disks to MicroCeph? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:
Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +----------+-------+----------+--------+------------------+
       | LOCATION | MODEL | CAPACITY |  TYPE  |       PATH       |
       +----------+-------+----------+--------+------------------+
> [ ]  | node1    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [ ]  | node2    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [ ]  | node3    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [ ]  | node4    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
       +----------+-------+----------+--------+------------------+

Select which disks to wipe:
Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +----------+-------+----------+--------+------------------+
       | LOCATION | MODEL | CAPACITY |  TYPE  |       PATH       |
       +----------+-------+----------+--------+------------------+
> [x]  | node1    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [x]  | node2    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [x]  | node3    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [x]  | node4    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
       +----------+-------+----------+--------+------------------+

Adding 4 disks to MicroCeph
Error: Failed adding new disk: Invalid disk path: /dev/disk/by-id/
root@node1:~# ls -l /dev/disk/
total 0
drwxr-xr-x 2 root root  80 Jan  7 01:57 by-label
drwxr-xr-x 2 root root 100 Jan  7 01:57 by-partuuid
drwxr-xr-x 2 root root 240 Jan  7 01:57 by-path
drwxr-xr-x 2 root root  80 Jan  7 01:57 by-uuid
root@node1:~#
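
For reference, a quick way to see why a given disk has no by-id symlink is to look at its udev properties (device name below is hypothetical):

# virtio-blk disks only get a /dev/disk/by-id entry when the hypervisor exposes a serial
udevadm info --query=property --name=/dev/vdb | grep -E '^ID_(BUS|SERIAL|MODEL)'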

The above issue was resolved by changing the added disks’ bus from virtio to SCSI.
But then I ended up with the error below:

Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +----------+---------------+----------+------+------------------------------------------------------------+
       | LOCATION |     MODEL     | CAPACITY | TYPE |                            PATH                            |
       +----------+---------------+----------+------+------------------------------------------------------------+
> [ ]  | node1    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-1 |
  [ ]  | node2    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
  [ ]  | node3    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
  [ ]  | node4    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
       +----------+---------------+----------+------+------------------------------------------------------------+

Select which disks to wipe:
Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +----------+---------------+----------+------+------------------------------------------------------------+
       | LOCATION |     MODEL     | CAPACITY | TYPE |                            PATH                            |
       +----------+---------------+----------+------+------------------------------------------------------------+
> [x]  | node1    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-1 |
  [x]  | node2    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
  [x]  | node3    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
  [x]  | node4    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
       +----------+---------------+----------+------+------------------------------------------------------------+
Adding 4 disks to MicroCeph
Error: Failed adding new disk: Failed to bootstrap OSD: Failed to run: ceph-osd --mkfs --no-mon-config -i 1: exit status 250 (2023-01-07T02:24:02.753+0000 7fa023e645c0 -1 bluefs _replay 0x0: stop: uuid 00000000-0000-0000-0000-000000000000 != super.uuid cc2634dd-5c40-449f-b608-0a31fdaf220a, block dump:                00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*                                                                                                         00000ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001000
2023-01-07T02:24:03.645+0000 7fa023e645c0 -1 rocksdb: verify_sharding unable to list column families: NotFound: 
2023-01-07T02:24:03.645+0000 7fa023e645c0 -1 bluestore(/var/lib/ceph/osd/ceph-1) _open_db erroring opening db: 
2023-01-07T02:24:04.169+0000 7fa023e645c0 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
2023-01-07T02:24:04.169+0000 7fa023e645c0 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-1: (5) Input/output error)
root@node1:~#
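
In case it helps with the bluefs/uuid mismatch above: purely as a guess, that kind of ceph-osd --mkfs failure is sometimes worth retrying after fully clearing the target disk, in case the built-in wipe left old signatures behind. Device path below is hypothetical:

# remove old filesystem/partition signatures from the ceph disk, then retry the init
wipefs -a /dev/sdb
dd if=/dev/zero of=/dev/sdb bs=1M count=100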