Introducing MicroCeph

If you are running MicroCeph with partitions (rather than whole disks), AppArmor needs to be put into complain mode for OSD creation to succeed. You will see errors like this in dmesg:

[ 1549.859092] audit: type=1400 audit(1699830490.855:88): apparmor="DENIED" operation="open" profile="snap.microceph.daemon" name="/dev/sr0" pid=28283 comm="microcephd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[ 1550.766299] audit: type=1400 audit(1699830491.755:89): apparmor="DENIED" operation="open" profile="snap.microceph.daemon" name="/dev/vda3" pid=29327 comm="ceph-osd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[ 1550.767410] audit: type=1400 audit(1699830491.755:90): apparmor="DENIED" operation="open" profile="snap.microceph.daemon" name="/dev/vda3" pid=29327 comm="ceph-osd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[ 1550.767420] audit: type=1400 audit(1699830491.755:91): apparmor="DENIED" operation="open" profile="snap.microceph.daemon" name="/dev/vda3" pid=29327 comm="ceph-osd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[ 1550.773590] audit: type=1400 audit(1699830491.763:92): apparmor="DENIED" operation="open" profile="snap.microceph.daemon" name="/dev/vda3" pid=29327 comm="ceph-osd" requested_mask="wrc" denied_mask="wrc" fsuid=0 ouid=0

To temporarily enable complain mode:

echo -n complain > /sys/module/apparmor/parameters/mode
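
Note this switches AppArmor into complain mode globally, so once the OSDs are created you will probably want to switch it back - as far as I know the parameter accepts enforce and complain:

echo -n enforce > /sys/module/apparmor/parameters/mode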
  • This allowed microceph init to succeed.

  • The following outbound firewall destination ports need opening: tcp 3300, 6789 and 7443; additionally, for OSDs, opening destination ports tcp 6800-6810 is recommended

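As a sketch of the outbound rules (assuming plain iptables and that the other nodes sit on 10.0.0.0/24 - both are assumptions, adjust for your own firewall and addressing):

# mon / cluster ports to the other nodes
iptables -A OUTPUT -p tcp -d 10.0.0.0/24 -m multiport --dports 3300,6789,7443 -j ACCEPT
# OSD ports
iptables -A OUTPUT -p tcp -d 10.0.0.0/24 --dport 6800:6810 -j ACCEPT
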
  • Having the ceph daemons listening on a WireGuard interface works fine

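To confirm the daemons really are bound on the WireGuard address rather than a physical interface, something like this is enough:

ss -tlnp | grep -E '3300|6789'
microceph.ceph mon dump
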
  • microceph.ceph health detail is a useful command

  • It took about 20 minutes for Ceph to become healthy - this would probably have been quicker if I had restarted snap.microceph.osd.service after setting the correct firewall rules.

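For reference, the restart is just the snap-generated unit, and health detail shows recovery progressing:

systemctl restart snap.microceph.osd.service
microceph.ceph health detail
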
  • Adding partitions from /dev/disk/by-path works:

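For example (the by-path name below is made up - list /dev/disk/by-path/ to find the right one for your partition):

ls -l /dev/disk/by-path/
microceph disk add /dev/disk/by-path/pci-0000:00:04.0-nvme-1-part3
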
[root@host4 ~]# microceph.ceph status

  cluster:
    id:     61ee0596-5913-48c2-92dd-7d24d74bd979
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum host1,host3,host4 (age 90m)
    mgr: host1(active, since 2h), standbys: host3, host4
    osd: 4 osds: 4 up (since 53m), 4 in (since 54m)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 449 KiB
    usage:   84 MiB used, 152 GiB / 152 GiB avail
    pgs:     1 active+clean

  • The above healthy cluster is running on 4 x 38 GB NVMe disk partitions and 1 Gbps ports, connected with WireGuard using wg-meshconf (forked to add preshared keys)

  • Enabling wg-quick@interface_name services was less problematic than configuring WireGuard with systemd-networkd (which worked, and then stopped setting the routes and giving wg0 an IP address after a reboot) - see the wg-quick sketch below

  • If you are running Ceph from a partition on a single disk, copy line 619 in /var/lib/snapd/apparmor/profiles/snap.microceph.osd to allow your partition (e.g. add a line of /dev/vda3 rwk, so Ceph still works after a reboot), and perhaps make the file immutable with chattr +i until this is fixed
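
With a /dev/vda3 OSD partition, for example, that workaround looks roughly like this (the editor step is where you copy the existing device line and add your partition):

# duplicate the existing /dev/... line, adding: /dev/vda3 rwk,
nano /var/lib/snapd/apparmor/profiles/snap.microceph.osd
# reload the modified profile
apparmor_parser -r /var/lib/snapd/apparmor/profiles/snap.microceph.osd
# make the file immutable so the change survives until this is fixed upstream
chattr +i /var/lib/snapd/apparmor/profiles/snap.microceph.osd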
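
On the wg-quick point above - with the config generated by wg-meshconf dropped into /etc/wireguard/wg0.conf, bringing the tunnel up at boot is just:

systemctl enable --now wg-quick@wg0
# confirm the interface has its address and peers
ip addr show wg0
wg show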