Introducing MicroCloud

stgraber · January 5, 2023, 12:43am

Some early issues with MicroCeph have been fixed today and both MicroCloud and MicroCeph have been updated.

If you’ve had any issues with either of the projects, please give them another try!

Specifically, this fixes issues with rbd create as well as the reported I/O error in microceph.ceph status which was related. Basically there was an issue with module loading within the OSD daemon which would prevent the creation of RBD images but would not otherwise prevent Ceph from starting up.

nkrapf · January 5, 2023, 12:59am

Thanks for the update.
Just tried it again but this time I get
I tried this twice.
in between each install i run

snap stop microcloud microceph lxd && snap disable microcloud && snap disable microceph && snap disable lxd && snap remove --purge microcloud && snap remove --purge microceph && snap remove --purge lxd

Error message during install:

Awaiting cluster formation...
Timed out waiting for a response from all cluster members
Cluster initialization is complete
Error: LXD service cluster does not match MicroCloud

stgraber · January 5, 2023, 1:42am

Hmm, I’ve never seen that error before, very odd.

@masnax any idea what that one means?

stgraber · January 5, 2023, 1:43am

What’s the size of your cluster?

ahmadqz · January 6, 2023, 2:23am

Hi,

I just tried this on 4 KVM VMs with ubuntu server 20.04 .

$ sudo apt update && sudo apt upgrade -y
$ sudo snap install lxd microceph microcloud

was executed successfully on all VMs.

Then I got the following error initializing microcloud.

ahmad@node1:~$ sudo microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.0.201]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "node4" at "192.168.0.204"
 Found "node3" at "192.168.0.203"
 Found "node2" at "192.168.0.202"

Ending scan
Initializing a new cluster
Error: Failed to bootstrap local MicroCloud: Post "http://control.socket/cluster/control": dial unix /var/snap/microcloud/common/state/control.socket: connect: no such file or directory
ahmad@node1:~$ sudo ls -l /var/snap/microcloud/common/state/
total 28
-rw-r--r-- 1 root root  757 Jan  6 02:13 cluster.crt
-rw------- 1 root root  288 Jan  6 02:13 cluster.key
-rw-r--r-- 1 root root   40 Jan  6 02:16 daemon.yaml
drwx------ 2 root root 4096 Jan  6 02:17 database
-rw-r--r-- 1 root root  757 Jan  6 02:08 server.crt
-rw------- 1 root root  288 Jan  6 02:08 server.key
drwx------ 2 root root 4096 Jan  6 02:16 truststore
ahmad@node1:~$

nkrapf · January 6, 2023, 2:34am

this lab I am just doing on a local hyper-v instance
3 VMs
Each VM is set up with:

8 cores
-4gb ram
OS disk 20gb
ceph disk 30gb

Ill retest soon

stgraber · January 6, 2023, 5:51am

Can you show snap services?

nkrapf · January 6, 2023, 6:04am

I just reinstalled and it finally successfully completed!!
Thanks!

ahmadqz · January 6, 2023, 11:01am

ahmad@node1:~$ snap list
Name        Version               Rev    Tracking       Publisher   Notes
core18      20221212              2667   latest/stable  canonical✓  base
core20      20221212              1778   latest/stable  canonical✓  base
core22      20221212              469    latest/stable  canonical✓  base
lxd         4.0.9-a29c6f1         24061  4.0/stable/…   canonical✓  -
microceph   0+git.00fe8d8         120    latest/stable  canonical✓  -
microcloud  0+git.d78a41a         70     latest/stable  canonical✓  -
snapd       2.58+git315.gee783cc  18180  latest/edge    canonical✓  snapd

ahmadqz · January 7, 2023, 1:02am

Apologies for the very long replies. I am making them long and detailed hoping this helps improving or for documentation

To update lxd snap, I removed it and installed it again, so now I have lxd 5.9. Then I tried to initialize microclod again but it failed as the cluster was created and node1 was added.

ahmad@node1:~$ sudo snap list
Name        Version        Rev    Tracking       Publisher   Notes
core18      20210309       1997   latest/stable  canonical✓  base
core20      20221212       1778   latest/stable  canonical✓  base
core22      20221212       469    latest/stable  canonical✓  base
lxd         5.9-9879096    24175  latest/stable  canonical✓  -
microceph   0+git.00fe8d8  120    latest/stable  canonical✓  -
microcloud  0+git.d78a41a  70     latest/stable  canonical✓  -
snapd       2.49.2         11588  latest/stable  canonical✓  snapd
ahmad@node1:~$ sudo microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.0.201]: 
Scanning for eligible servers...
Press enter to end scanning for servers

Ending scan

Initializing a new cluster
Error: Failed to bootstrap local MicroCloud: Failed to initialize local remote entry: A remote with name "node1" already exists
ahmad@node1:~$

So I ended up re-creating the 4 nodes (VMs) from scratch.
Now I am having a new issue, added disks are not showing in /dev/disk/by-id/ , actually by-id directory is not there.

ahmad@node1:~$ sudo -i
root@node1:~# microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.0.201]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "node4" at "192.168.0.204"
 Found "node3" at "192.168.0.203"
 Found "node2" at "192.168.0.202"

Ending scan
Initializing a new cluster
 Local MicroCloud is ready
 Local MicroCeph is ready
 Local LXD is ready
Awaiting cluster formation...
 Peer "node3" has joined the cluster
 Peer "node2" has joined the cluster
 Peer "node4" has joined the cluster
Cluster initialization is complete
Would you like to add additional local disks to MicroCeph? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:
Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +----------+-------+----------+--------+------------------+
       | LOCATION | MODEL | CAPACITY |  TYPE  |       PATH       |
       +----------+-------+----------+--------+------------------+
> [ ]  | node1    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [ ]  | node2    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [ ]  | node3    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [ ]  | node4    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
       +----------+-------+----------+--------+------------------+

Select which disks to wipe:
Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +----------+-------+----------+--------+------------------+
       | LOCATION | MODEL | CAPACITY |  TYPE  |       PATH       |
       +----------+-------+----------+--------+------------------+
> [x]  | node1    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [x]  | node2    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [x]  | node3    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
  [x]  | node4    |       | 50.00GiB | virtio | /dev/disk/by-id/ |
       +----------+-------+----------+--------+------------------+

Adding 4 disks to MicroCeph
Error: Failed adding new disk: Invalid disk path: /dev/disk/by-id/
root@node1:~# ls -l /dev/disk/
total 0
drwxr-xr-x 2 root root  80 Jan  7 01:57 by-label
drwxr-xr-x 2 root root 100 Jan  7 01:57 by-partuuid
drwxr-xr-x 2 root root 240 Jan  7 01:57 by-path
drwxr-xr-x 2 root root  80 Jan  7 01:57 by-uuid
root@node1:~#

And the above issue was resolved by changing the added disks bus from virtio to scsi.
And finally having the below error

Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +----------+---------------+----------+------+------------------------------------------------------------+
       | LOCATION |     MODEL     | CAPACITY | TYPE |                            PATH                            |
       +----------+---------------+----------+------+------------------------------------------------------------+
> [ ]  | node1    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-1 |
  [ ]  | node2    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
  [ ]  | node3    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
  [ ]  | node4    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
       +----------+---------------+----------+------+------------------------------------------------------------+

Select which disks to wipe:
Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +----------+---------------+----------+------+------------------------------------------------------------+
       | LOCATION |     MODEL     | CAPACITY | TYPE |                            PATH                            |
       +----------+---------------+----------+------+------------------------------------------------------------+
> [x]  | node1    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-1 |
  [x]  | node2    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
  [x]  | node3    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
  [x]  | node4    | QEMU HARDDISK | 50.00GiB | scsi | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0 |
       +----------+---------------+----------+------+------------------------------------------------------------+
Adding 4 disks to MicroCeph
Error: Failed adding new disk: Failed to bootstrap OSD: Failed to run: ceph-osd --mkfs --no-mon-config -i 1: exit status 250 (2023-01-07T02:24:02.753+0000 7fa023e645c0 -1 bluefs _replay 0x0: stop: uuid 00000000-0000-0000-0000-000000000000 != super.uuid cc2634dd-5c40-449f-b608-0a31fdaf220a, block dump:                00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*                                                                                                         00000ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001000
2023-01-07T02:24:03.645+0000 7fa023e645c0 -1 rocksdb: verify_sharding unable to list column families: NotFound: 
2023-01-07T02:24:03.645+0000 7fa023e645c0 -1 bluestore(/var/lib/ceph/osd/ceph-1) _open_db erroring opening db: 
2023-01-07T02:24:04.169+0000 7fa023e645c0 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
2023-01-07T02:24:04.169+0000 7fa023e645c0 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-1: (5) Input/output error)
root@node1:~#

nkrapf · January 7, 2023, 1:37am

run this on all nodes to remove everything
–if needed break apart the command into individual commands.

snap stop microcloud microceph lxd && snap disable microcloud && snap disable microceph && snap disable lxd && snap remove --purge microcloud && snap remove --purge microceph && snap remove --purge lxd

bolapara · January 14, 2023, 9:16pm

I’m running into the same issue as nkrapf with the timeout waiting for the cluster nodes to join. I’ve stopped/purged several time without change. I’ve blown away the VMs and recreated them without change.

Cluster is 3 nodes running as VMs under Proxmox VE 7.3. Nodes have 16GB RAM, 32GB OS disks, and 1x 50GB OSD disks.

After everything is setup and installed I run microcloud init:

bolapara@clustertest03:~$ sudo microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.86.177]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "clustertest04" at "192.168.86.178"
 Found "clustertest05" at "192.168.86.179"

Ending scan
Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroCeph is ready
Awaiting cluster formation...
Timed out waiting for a response from all cluster members
Cluster initialization is complete
Would you like to add additional local disks to MicroCeph? (yes/no) [default=yes]: no
MicroCloud is ready
bolapara@clustertest03:~$ sudo microcloud cluster list
+---------------+---------------------+---------+------------------------------------------------------------------+-------------+
|     NAME      |       ADDRESS       |  ROLE   |                           FINGERPRINT                            |   STATUS    |
+---------------+---------------------+---------+------------------------------------------------------------------+-------------+
| clustertest03 | 192.168.86.177:9443 | PENDING | <snip> | UNREACHABLE |
+---------------+---------------------+---------+------------------------------------------------------------------+-------------+
bolapara@clustertest03:~$

All the other nodes say they are not initialized:

bolapara@clustertest04:~$ sudo microcloud cluster list
Error: Daemon not yet initialized
bolapara@clustertest04:~$

Not sure if I should file an issue on the github project or report here?

stgraber · January 15, 2023, 4:08am

@masnax any idea what we’d need to further debug this?

bolapara · January 15, 2023, 4:45am

System logs show both microceph and microcloud daemons reporting “no available dqlite leader server found”.

Both microcloud verbose and debug options don’t provide any more information.

VMs are Jammy cloud-init images. I’m using ansible to set the VMs up so it’s very repeatable and quick to iterate if you have things you want me to try.

bolapara · January 15, 2023, 5:08am

Maybe useful log message from one of the nodes that were intended to join:

Jan 14 22:32:32 clustertest04 microcloud.daemon[2987]: time="2023-01-14T22:32:32-06:00" level=error msg="Failed to parse join token" error="Failed to parse token map: invalid character 'r' looking for beginning of value" name=clustertest04

masnax · January 16, 2023, 6:15pm

This is an odd one, looks like dqlite isn’t properly starting on clustertest03. It’s reporting as UNREACHABLE with pending status which means it never got fully set up.

The log in your second post indicates malformed data is being sent from clustertest03 to clustertest04, which would make sense given the above situation.

If you could provide the logs for the initial node, that would be a big help.

bolapara · January 16, 2023, 9:05pm

microcloud init was initiated on clustertest03. here are those logs.

Jan 16 14:32:33 clustertest03 python3[4304]: ansible-community.general.timezone Invoked with name=US/Central hwclock=None
Jan 16 14:32:33 clustertest03 dbus-daemon[622]: [system] Activating via systemd: service name='org.freedesktop.timedate1' unit='dbus-org.freedesktop.timedate1.service' requested by ':1.34' (uid=0 pid=4305 comm="/usr/bin/timedatectl " label="unconfined")
Jan 16 14:32:33 clustertest03 systemd[1]: Starting Time & Date Service...
Jan 16 14:32:33 clustertest03 dbus-daemon[622]: [system] Successfully activated service 'org.freedesktop.timedate1'
Jan 16 14:32:33 clustertest03 systemd[1]: Started Time & Date Service.
Jan 16 14:32:33 clustertest03 sudo[4301]: pam_unix(sudo:session): session closed for user root
Jan 16 14:32:33 clustertest03 sudo[4328]:     josh : TTY=pts/3 ; PWD=/home/josh ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-tuhjjbttnhxcdpqtfoolssxiyzuhxpqb ; /usr/bin/python3 /home/josh/.ansible/tmp/ansible-tmp-1673901153.3981993-7797-34717088270063/AnsiballZ_systemd.py
Jan 16 14:32:33 clustertest03 sudo[4328]: pam_unix(sudo:session): session opened for user root(uid=0) by josh(uid=1000)
Jan 16 14:32:34 clustertest03 python3[4331]: ansible-ansible.legacy.systemd Invoked with name=apparmor state=stopped enabled=False daemon_reload=False daemon_reexec=False scope=system no_block=False force=None masked=None
Jan 16 14:32:34 clustertest03 sudo[4328]: pam_unix(sudo:session): session closed for user root
Jan 16 14:33:02 clustertest03 systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Jan 16 14:33:03 clustertest03 systemd[1]: systemd-timedated.service: Deactivated successfully.
Jan 16 14:33:09 clustertest03 sudo[4347]:     josh : TTY=pts/2 ; PWD=/home/josh ; USER=root ; COMMAND=/snap/bin/microcloud init
Jan 16 14:33:09 clustertest03 sudo[4347]: pam_unix(sudo:session): session opened for user root(uid=0) by josh(uid=1000)
Jan 16 14:33:09 clustertest03 systemd[1]: Started snap.microcloud.microcloud.ba172086-72f6-48d9-b0d4-f47eeaf5783a.scope.
Jan 16 14:33:17 clustertest03 systemd[1]: Started Service for snap application lxd.daemon.
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: => Preparing the system (24175)
Jan 16 14:33:17 clustertest03 systemd[1]: Started /bin/true.
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Loading snap configuration
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Creating /var/snap/lxd/common/lxd/logs
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Creating /var/snap/lxd/common/global-conf
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Setting up mntns symlink (mnt:[4026532233])
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Setting up mount propagation on /var/snap/lxd/common/lxd/storage-pools
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Setting up mount propagation on /var/snap/lxd/common/lxd/devices
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Setting up persistent shmounts path
Jan 16 14:33:17 clustertest03 systemd[1]: var-snap-lxd-common-shmounts.mount: Deactivated successfully.
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ====> Making LXD shmounts use the persistent path
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ====> Making LXCFS use the persistent path
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Setting up kmod wrapper
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Preparing /boot
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Preparing a clean copy of /run
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Preparing /run/bin
Jan 16 14:33:17 clustertest03 lxd.daemon[4382]: ==> Preparing a clean copy of /etc
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Preparing a clean copy of /usr/share/misc
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Setting up ceph configuration
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Setting up LVM configuration
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Setting up OVN configuration
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Rotating logs
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Setting up ZFS (2.1)
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Escaping the systemd cgroups
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ====> Detected cgroup V2
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Escaping the systemd process resource limits
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Increasing the number of inotify user instances
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Increasing the number of keys for a nonroot user
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Increasing the number of bytes for a nonroot user
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: ==> Disabling shiftfs on this kernel (auto)
Jan 16 14:33:18 clustertest03 lxd.daemon[4382]: => Starting LXCFS
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: Running constructor lxcfs_init to reload liblxcfs
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: mount namespace: 5
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: hierarchies:
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]:   0: fd:   6: cpuset,cpu,io,memory,hugetlb,pids,rdma,misc
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: Kernel supports pidfds
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: Kernel does not support swap accounting
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: api_extensions:
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - cgroups
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - sys_cpu_online
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - proc_cpuinfo
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - proc_diskstats
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - proc_loadavg
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - proc_meminfo
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - proc_stat
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - proc_swaps
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - proc_uptime
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - proc_slabinfo
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - shared_pidns
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - cpuview_daemon
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - loadavg_daemon
Jan 16 14:33:18 clustertest03 lxd.daemon[4552]: - pidfds
Jan 16 14:33:19 clustertest03 lxd.daemon[4382]: => Starting LXD
Jan 16 14:33:21 clustertest03 systemd[1]: Reloading.
Jan 16 14:33:22 clustertest03 systemd[1]: Started Service for snap application microceph.mon.
Jan 16 14:33:22 clustertest03 kernel: NET: Registered PF_VSOCK protocol family
Jan 16 14:33:22 clustertest03 lxd.daemon[4597]: time="2023-01-16T14:33:22-06:00" level=warning msg=" - Couldn't find the CGroup network priority controller, network priority will be ignored"
Jan 16 14:33:23 clustertest03 microcloud.daemon[4184]: time="2023-01-16T14:33:23-06:00" level=error msg="Failed to initiate heartbeat round" address="192.168.86.177:9443" error="no available dqlite leader server found"
Jan 16 14:33:23 clustertest03 microceph.daemon[3565]: time="2023-01-16T14:33:23-06:00" level=error msg="Failed to initiate heartbeat round" address="192.168.86.177:7443" error="Service Unavailable"
Jan 16 14:33:25 clustertest03 systemd[1]: Reloading.
Jan 16 14:33:25 clustertest03 systemd[1]: Started Service for snap application microceph.mgr.
Jan 16 14:33:25 clustertest03 kernel: bpfilter: Loaded bpfilter_umh pid 4834
Jan 16 14:33:25 clustertest03 unknown: Started bpfilter
Jan 16 14:33:25 clustertest03 kernel: spl: loading out-of-tree module taints kernel.
Jan 16 14:33:26 clustertest03 kernel: znvpair: module license 'CDDL' taints kernel.
Jan 16 14:33:26 clustertest03 kernel: Disabling lock debugging due to kernel taint
Jan 16 14:33:26 clustertest03 kernel: ZFS: Loaded module v2.1.4-0ubuntu0.1, ZFS pool version 5000, ZFS filesystem version 5
Jan 16 14:33:26 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:26.621-0600 7f900d96ddc0 -1 mgr[py] Module alerts has missing NOTIFY_TYPES member
Jan 16 14:33:26 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:26.761-0600 7f900d96ddc0 -1 mgr[py] Module balancer has missing NOTIFY_TYPES member
Jan 16 14:33:27 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:26.993-0600 7f900d96ddc0 -1 mgr[py] Module crash has missing NOTIFY_TYPES member
Jan 16 14:33:27 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:27.097-0600 7f900d96ddc0 -1 mgr[py] Module devicehealth has missing NOTIFY_TYPES member
Jan 16 14:33:27 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:27.201-0600 7f900d96ddc0 -1 mgr[py] Module influx has missing NOTIFY_TYPES member
Jan 16 14:33:27 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:27.401-0600 7f900d96ddc0 -1 mgr[py] Module iostat has missing NOTIFY_TYPES member
Jan 16 14:33:27 clustertest03 lxd.daemon[4382]: => First LXD execution on this system
Jan 16 14:33:27 clustertest03 lxd.daemon[4382]: => LXD is ready
Jan 16 14:33:27 clustertest03 systemd[1]: Reloading.
Jan 16 14:33:27 clustertest03 kernel: bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
Jan 16 14:33:27 clustertest03 networkd-dispatcher[630]: WARNING:Unknown index 3 seen, reloading interface list
Jan 16 14:33:27 clustertest03 systemd-udevd[4808]: Using default interface naming scheme 'v249'.
Jan 16 14:33:27 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered blocking state
Jan 16 14:33:27 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered disabled state
Jan 16 14:33:27 clustertest03 kernel: device lxdfan0-mtu entered promiscuous mode
Jan 16 14:33:27 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered blocking state
Jan 16 14:33:27 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered forwarding state
Jan 16 14:33:27 clustertest03 systemd-networkd[583]: lxdfan0-mtu: Link UP
Jan 16 14:33:27 clustertest03 systemd-networkd[583]: lxdfan0-mtu: Gained carrier
Jan 16 14:33:27 clustertest03 systemd-networkd[583]: lxdfan0-mtu: Gained IPv6LL
Jan 16 14:33:27 clustertest03 systemd-networkd[583]: lxdfan0: Link UP
Jan 16 14:33:27 clustertest03 systemd-udevd[4944]: Using default interface naming scheme 'v249'.
Jan 16 14:33:27 clustertest03 avahi-daemon[620]: Joining mDNS multicast group on interface lxdfan0.IPv4 with address 240.177.0.1.
Jan 16 14:33:27 clustertest03 avahi-daemon[620]: New relevant interface lxdfan0.IPv4 for mDNS.
Jan 16 14:33:27 clustertest03 avahi-daemon[620]: Registering new address record for 240.177.0.1 on lxdfan0.IPv4.
Jan 16 14:33:27 clustertest03 networkd-dispatcher[630]: WARNING:Unknown index 5 seen, reloading interface list
Jan 16 14:33:27 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered blocking state
Jan 16 14:33:27 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered disabled state
Jan 16 14:33:27 clustertest03 kernel: device lxdfan0-fan entered promiscuous mode
Jan 16 14:33:27 clustertest03 systemd-networkd[583]: lxdfan0-fan: Link UP
Jan 16 14:33:27 clustertest03 systemd-networkd[583]: lxdfan0-fan: Gained carrier
Jan 16 14:33:27 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered blocking state
Jan 16 14:33:27 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered forwarding state
Jan 16 14:33:27 clustertest03 systemd-udevd[4997]: Using default interface naming scheme 'v249'.
Jan 16 14:33:28 clustertest03 audit[5006]: AVC apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd_dnsmasq-lxdfan0_</var/snap/lxd/common/lxd>" pid=5006 comm="apparmor_parser"
Jan 16 14:33:28 clustertest03 kernel: kauditd_printk_skb: 18 callbacks suppressed
Jan 16 14:33:28 clustertest03 kernel: audit: type=1400 audit(1673901208.178:89): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd_dnsmasq-lxdfan0_</var/snap/lxd/common/lxd>" pid=5006 comm="apparmor_parser"
Jan 16 14:33:28 clustertest03 systemd[1]: Started Service for snap application microceph.mds.
Jan 16 14:33:28 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:28.222-0600 7f900d96ddc0 -1 mgr[py] Module orchestrator has missing NOTIFY_TYPES member
Jan 16 14:33:28 clustertest03 audit[5009]: AVC apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" pid=5009 comm="apparmor_parser"
Jan 16 14:33:28 clustertest03 kernel: audit: type=1400 audit(1673901208.234:90): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" pid=5009 comm="apparmor_parser"
Jan 16 14:33:28 clustertest03 microceph.mds[5010]: starting mds.clustertest03 at
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: started, version 2.80 cachesize 150
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth nettlehash DNSSEC loop-detect inotify dumpfile
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: LOUD WARNING: listening on 240.177.0.1 may accept requests via interfaces other than lxdfan0
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: LOUD WARNING: use --bind-dynamic rather than --bind-interfaces to avoid DNS amplification attacks via these interface(s)
Jan 16 14:33:28 clustertest03 dnsmasq-dhcp[5019]: DHCP, IP range 240.177.0.2 -- 240.177.0.254, lease time 1h
Jan 16 14:33:28 clustertest03 dnsmasq-dhcp[5019]: DHCP, sockets bound exclusively to interface lxdfan0
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: using nameserver 240.177.0.1#1053 for domain 240.in-addr.arpa
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: using nameserver 240.177.0.1#1053 for domain lxd
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: reading /etc/resolv.conf
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: using nameserver 240.177.0.1#1053 for domain 240.in-addr.arpa
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: using nameserver 240.177.0.1#1053 for domain lxd
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: using nameserver 127.0.0.53#53
Jan 16 14:33:28 clustertest03 dnsmasq[5019]: read /etc/hosts - 5 addresses
Jan 16 14:33:28 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:28.710-0600 7f900d96ddc0 -1 mgr[py] Module osd_perf_query has missing NOTIFY_TYPES member
Jan 16 14:33:28 clustertest03 audit[5054]: AVC apparmor="DENIED" operation="open" profile="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" name="/var/lib/snapd/hostfs/run/systemd/resolve/stub-resolv.conf" pid=5054 comm="lxd" requested_mask="r" denied_mask="r" fsuid=999 ouid=101
Jan 16 14:33:28 clustertest03 kernel: audit: type=1400 audit(1673901208.754:91): apparmor="DENIED" operation="open" profile="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" name="/var/lib/snapd/hostfs/run/systemd/resolve/stub-resolv.conf" pid=5054 comm="lxd" requested_mask="r" denied_mask="r" fsuid=999 ouid=101
Jan 16 14:33:28 clustertest03 systemd-networkd[583]: lxdfan0: Gained carrier
Jan 16 14:33:28 clustertest03 audit[5054]: AVC apparmor="DENIED" operation="open" profile="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" name="/proc/5054/cpuset" pid=5054 comm="lxd" requested_mask="r" denied_mask="r" fsuid=999 ouid=999
Jan 16 14:33:28 clustertest03 kernel: audit: type=1400 audit(1673901208.774:92): apparmor="DENIED" operation="open" profile="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" name="/proc/5054/cpuset" pid=5054 comm="lxd" requested_mask="r" denied_mask="r" fsuid=999 ouid=999
Jan 16 14:33:28 clustertest03 lxd-forkdns[5054]: time="2023-01-16T14:33:28-06:00" level=info msg=Started
Jan 16 14:33:28 clustertest03 lxd-forkdns[5054]: time="2023-01-16T14:33:28-06:00" level=info msg="Server list loaded: []"
Jan 16 14:33:28 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:28.862-0600 7f900d96ddc0 -1 mgr[py] Module osd_support has missing NOTIFY_TYPES member
Jan 16 14:33:28 clustertest03 lxd.daemon[4597]: time="2023-01-16T14:33:28-06:00" level=warning msg="Failed adding member event listener client" err="dial tcp 0.0.0.0:443: connect: connection refused" local="192.168.86.177:8443" remote=0.0.0.0
Jan 16 14:33:29 clustertest03 microceph.mon[4659]: 2023-01-16T14:33:29.078-0600 7faee8ff9640 -1 mon.clustertest03@0(leader) e2  stashing newest monmap 2 for next startup
Jan 16 14:33:29 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:29.106-0600 7f900d96ddc0 -1 mgr[py] Module pg_autoscaler has missing NOTIFY_TYPES member
Jan 16 14:33:29 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:29.218-0600 7f900d96ddc0 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
Jan 16 14:33:29 clustertest03 microcloud.daemon[4184]: time="2023-01-16T14:33:29-06:00" level=error msg="Failed to get dqlite leader" address="https://192.168.86.177:9443" error="no available dqlite leader server found"
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-mtu: Link DOWN
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-mtu: Lost carrier
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered disabled state
Jan 16 14:33:29 clustertest03 kernel: device lxdfan0-mtu left promiscuous mode
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered disabled state
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-fan: Link DOWN
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-fan: Lost carrier
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered disabled state
Jan 16 14:33:29 clustertest03 kernel: device lxdfan0-fan left promiscuous mode
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered disabled state
Jan 16 14:33:29 clustertest03 networkd-dispatcher[630]: WARNING:Unknown index 6 seen, reloading interface list
Jan 16 14:33:29 clustertest03 systemd-udevd[5002]: Using default interface naming scheme 'v249'.
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered blocking state
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered disabled state
Jan 16 14:33:29 clustertest03 kernel: device lxdfan0-mtu entered promiscuous mode
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered blocking state
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 1(lxdfan0-mtu) entered forwarding state
Jan 16 14:33:29 clustertest03 kernel: device lxdfan0-mtu left promiscuous mode
Jan 16 14:33:29 clustertest03 kernel: device lxdfan0-mtu entered promiscuous mode
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-mtu: Link UP
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-mtu: Gained carrier
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-mtu: Gained IPv6LL
Jan 16 14:33:29 clustertest03 avahi-daemon[620]: Withdrawing address record for 240.177.0.1 on lxdfan0.
Jan 16 14:33:29 clustertest03 avahi-daemon[620]: Leaving mDNS multicast group on interface lxdfan0.IPv4 with address 240.177.0.1.
Jan 16 14:33:29 clustertest03 avahi-daemon[620]: Interface lxdfan0.IPv4 no longer relevant for mDNS.
Jan 16 14:33:29 clustertest03 avahi-daemon[620]: Joining mDNS multicast group on interface lxdfan0.IPv4 with address 240.177.0.1.
Jan 16 14:33:29 clustertest03 avahi-daemon[620]: New relevant interface lxdfan0.IPv4 for mDNS.
Jan 16 14:33:29 clustertest03 avahi-daemon[620]: Registering new address record for 240.177.0.1 on lxdfan0.IPv4.
Jan 16 14:33:29 clustertest03 networkd-dispatcher[630]: WARNING:Unknown index 7 seen, reloading interface list
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered blocking state
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered disabled state
Jan 16 14:33:29 clustertest03 kernel: device lxdfan0-fan entered promiscuous mode
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-fan: Link UP
Jan 16 14:33:29 clustertest03 systemd-networkd[583]: lxdfan0-fan: Gained carrier
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered blocking state
Jan 16 14:33:29 clustertest03 kernel: lxdfan0: port 2(lxdfan0-fan) entered forwarding state
Jan 16 14:33:29 clustertest03 kernel: audit: type=1400 audit(1673901209.790:93): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxd_dnsmasq-lxdfan0_</var/snap/lxd/common/lxd>" pid=5156 comm="apparmor_parser"
Jan 16 14:33:29 clustertest03 kernel: audit: type=1400 audit(1673901209.798:94): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" pid=5158 comm="apparmor_parser"
Jan 16 14:33:29 clustertest03 audit[5156]: AVC apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxd_dnsmasq-lxdfan0_</var/snap/lxd/common/lxd>" pid=5156 comm="apparmor_parser"
Jan 16 14:33:29 clustertest03 audit[5158]: AVC apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" pid=5158 comm="apparmor_parser"
Jan 16 14:33:30 clustertest03 microceph.daemon[3565]: time="2023-01-16T14:33:30-06:00" level=error msg="Failed to get dqlite leader" address="https://192.168.86.177:7443" error="no available dqlite leader server found"
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: started, version 2.80 cachesize 150
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth nettlehash DNSSEC loop-detect inotify dumpfile
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: LOUD WARNING: listening on 240.177.0.1 may accept requests via interfaces other than lxdfan0
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: LOUD WARNING: use --bind-dynamic rather than --bind-interfaces to avoid DNS amplification attacks via these interface(s)
Jan 16 14:33:30 clustertest03 dnsmasq-dhcp[5176]: DHCP, IP range 240.177.0.2 -- 240.177.0.254, lease time 1h
Jan 16 14:33:30 clustertest03 dnsmasq-dhcp[5176]: DHCP, sockets bound exclusively to interface lxdfan0
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: using nameserver 240.177.0.1#1053 for domain 240.in-addr.arpa
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: using nameserver 240.177.0.1#1053 for domain lxd
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: reading /etc/resolv.conf
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: using nameserver 240.177.0.1#1053 for domain 240.in-addr.arpa
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: using nameserver 240.177.0.1#1053 for domain lxd
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: using nameserver 127.0.0.53#53
Jan 16 14:33:30 clustertest03 dnsmasq[5176]: read /etc/hosts - 5 addresses
Jan 16 14:33:30 clustertest03 audit[5178]: AVC apparmor="DENIED" operation="open" profile="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" name="/var/lib/snapd/hostfs/run/systemd/resolve/stub-resolv.conf" pid=5178 comm="lxd" requested_mask="r" denied_mask="r" fsuid=999 ouid=101
Jan 16 14:33:30 clustertest03 kernel: audit: type=1400 audit(1673901210.422:95): apparmor="DENIED" operation="open" profile="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" name="/var/lib/snapd/hostfs/run/systemd/resolve/stub-resolv.conf" pid=5178 comm="lxd" requested_mask="r" denied_mask="r" fsuid=999 ouid=101
Jan 16 14:33:30 clustertest03 systemd[1]: Reloading.
Jan 16 14:33:30 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:30.438-0600 7f900d96ddc0 -1 mgr[py] Module prometheus has missing NOTIFY_TYPES member
Jan 16 14:33:30 clustertest03 audit[5178]: AVC apparmor="DENIED" operation="open" profile="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" name="/proc/5178/cpuset" pid=5178 comm="lxd" requested_mask="r" denied_mask="r" fsuid=999 ouid=999
Jan 16 14:33:30 clustertest03 kernel: audit: type=1400 audit(1673901210.454:96): apparmor="DENIED" operation="open" profile="lxd_forkdns-lxdfan0_</var/snap/lxd/common/lxd>" name="/proc/5178/cpuset" pid=5178 comm="lxd" requested_mask="r" denied_mask="r" fsuid=999 ouid=999
Jan 16 14:33:30 clustertest03 lxd-forkdns[5178]: time="2023-01-16T14:33:30-06:00" level=info msg=Started
Jan 16 14:33:30 clustertest03 lxd-forkdns[5178]: time="2023-01-16T14:33:30-06:00" level=info msg="Server list loaded: []"
Jan 16 14:33:30 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:30.690-0600 7f900d96ddc0 -1 mgr[py] Module rbd_support has missing NOTIFY_TYPES member
Jan 16 14:33:30 clustertest03 systemd[1]: Started Service for snap application microceph.osd.
Jan 16 14:33:31 clustertest03 avahi-daemon[620]: Joining mDNS multicast group on interface lxdfan0-fan.IPv6 with address fe80::b0cd:cfff:fe0d:628a.
Jan 16 14:33:31 clustertest03 avahi-daemon[620]: New relevant interface lxdfan0-fan.IPv6 for mDNS.
Jan 16 14:33:31 clustertest03 systemd-networkd[583]: lxdfan0-fan: Gained IPv6LL
Jan 16 14:33:31 clustertest03 avahi-daemon[620]: Registering new address record for fe80::b0cd:cfff:fe0d:628a on lxdfan0-fan.*.
Jan 16 14:33:31 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:31.578-0600 7f900d96ddc0 -1 mgr[py] Module selftest has missing NOTIFY_TYPES member
Jan 16 14:33:31 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:31.694-0600 7f900d96ddc0 -1 mgr[py] Module snap_schedule has missing NOTIFY_TYPES member
Jan 16 14:33:32 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:32.002-0600 7f900d96ddc0 -1 mgr[py] Module status has missing NOTIFY_TYPES member
Jan 16 14:33:32 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:32.102-0600 7f900d96ddc0 -1 mgr[py] Module telegraf has missing NOTIFY_TYPES member
Jan 16 14:33:32 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:32.450-0600 7f900d96ddc0 -1 mgr[py] Module telemetry has missing NOTIFY_TYPES member
Jan 16 14:33:32 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:32.722-0600 7f900d96ddc0 -1 mgr[py] Module test_orchestrator has missing NOTIFY_TYPES member
Jan 16 14:33:33 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:33.074-0600 7f900d96ddc0 -1 mgr[py] Module volumes has missing NOTIFY_TYPES member
Jan 16 14:33:33 clustertest03 microceph.mgr[4800]: 2023-01-16T14:33:33.178-0600 7f900d96ddc0 -1 mgr[py] Module zabbix has missing NOTIFY_TYPES member
Jan 16 14:33:34 clustertest03 sshd[1032]: Received disconnect from 192.168.86.57 port 56880:11: disconnected by user
Jan 16 14:33:34 clustertest03 sshd[1032]: Disconnected from user josh 192.168.86.57 port 56880
Jan 16 14:33:34 clustertest03 sshd[977]: pam_unix(sshd:session): session closed for user josh
Jan 16 14:33:34 clustertest03 systemd[1]: session-4.scope: Deactivated successfully.
Jan 16 14:33:34 clustertest03 systemd[1]: session-4.scope: Consumed 16.377s CPU time.
Jan 16 14:33:34 clustertest03 systemd-logind[635]: Session 4 logged out. Waiting for processes to exit.
Jan 16 14:33:34 clustertest03 systemd-logind[635]: Removed session 4.
Jan 16 14:33:36 clustertest03 microcloud.daemon[4184]: time="2023-01-16T14:33:36-06:00" level=error msg="Failed to get dqlite leader" address="https://192.168.86.177:9443" error="no available dqlite leader server found"
Jan 16 14:33:36 clustertest03 microceph.daemon[3565]: time="2023-01-16T14:33:36-06:00" level=error msg="Failed to get dqlite leader" address="https://192.168.86.177:7443" error="no available dqlite leader server found"
Jan 16 14:33:42 clustertest03 microcloud.daemon[4184]: time="2023-01-16T14:33:42-06:00" level=error msg="Failed to get dqlite leader" address="https://192.168.86.177:9443" error="no available dqlite leader server found"
Jan 16 14:33:43 clustertest03 microceph.daemon[3565]: time="2023-01-16T14:33:43-06:00" level=error msg="Failed to get dqlite leader" address="https://192.168.86.177:7443" error="no available dqlite leader server found"
***last two log entries repeated forever***

chenchen · January 17, 2023, 7:15am

Great! Our own work team has 3 environment，all of them based on LXD cluster，OVN cluster，CEPH cluster。We often need to re-deploy a set of environments，the steps are too complicated ! It would be great if microceph & microcloud could integrate ovn ! that will save half of my working time !

masnax · January 18, 2023, 12:26am

So it does look like dqlite isn’t properly getting set up here. Just to confirm, microcloud cluster list on the initial node always shows UNREACHABLE after the cluster setup?

Things to tick off first would be ensuring the machines are using the right addresses for microcloud and ensuring there’s no firewall rules that might be interfering. Microcloud will also need ports 7443,8443, and 9443 for the services.

Is there anything else you’re running on these machines or are they otherwise clean images?

bolapara · January 18, 2023, 12:56am

This last time I tried about 15 minutes ago and initiating host (clustertest06 in this case) stills shows as unreachable.

The IP addresses are correct and they are addresses on my local LAN. The images are either base ubuntu server 22.04.1 or jammy cloud-images from 1/10/23. These machines are VMs that are freshly installed (many times) specifically for this test. ufw is disabled and iptables -L shows no rules. The VM server hosting these VMs host a bunch of other infrastructure for me so I know it’s good there.

port scan from my laptop shows the ports as closed, not filtered:

0 230117 18:48:09 (master) fw:cluster josh $ nmap -p 7443,8443,9443 clustertest0{6..8}
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-17 18:49 CST
Nmap scan report for clustertest06 (192.168.86.181)
Host is up (0.0031s latency).
rDNS record for 192.168.86.181: clustertest06.lan

PORT     STATE SERVICE
7443/tcp open  oracleas-https
8443/tcp open  https-alt
9443/tcp open  tungsten-https

Nmap scan report for clustertest07 (192.168.86.182)
Host is up (0.0032s latency).
rDNS record for 192.168.86.182: clustertest07.lan

PORT     STATE  SERVICE
7443/tcp closed oracleas-https
8443/tcp closed https-alt
9443/tcp closed tungsten-https

Nmap scan report for clustertest08 (192.168.86.183)
Host is up (0.0031s latency).
rDNS record for 192.168.86.183: clustertest08.lan

PORT     STATE  SERVICE
7443/tcp closed oracleas-https
8443/tcp closed https-alt
9443/tcp closed tungsten-https

Nmap done: 3 IP addresses (3 hosts up) scanned in 0.15 seconds

One thing that stands out to me are the apparmor denials but those remain even when I disable apparmor on the hosts, I’m assuming somehow apparmor runs in the snap containers?

I also tried only setting up microceph and it also fails with dqlite errors.