Introducing MicroCeph

As you may have seen in some of our tutorials and videos, building up a Ceph cluster can be a bit tricky and time-consuming, especially if it’s just for testing or a small home lab.

To make this much easier, we’ve been spending a bit of time creating something called microceph.

It’s available as a snap package, currently only tested on Ubuntu but likely to work on other distros too. The snap uses a small management daemon that shares much of its clustering logic with LXD. This allows for very easy clustering of multiple systems, which, combined with an easy bootstrap process, makes it possible to set up a Ceph cluster in just a few minutes.

If you’d like to try it out, you can just run snap install microceph followed by microceph init.
This will take you through the setup process interactively. On the first system it will create a new Ceph cluster, then let you add additional systems and disks.
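For reference, the bootstrap on the first machine looks roughly like this (a sketch; the exact interactive prompts may vary between versions):

```shell
# Install the snap on the first machine in the cluster
snap install microceph

# Interactive setup: creates a new Ceph cluster on this system,
# then offers to add more systems and local disks
microceph init
```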

You can use this on a single system with at least 3 disks or partitions though you’ll need to tweak the replication a little bit on your pools for this to work properly.
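The replication tweak mentioned above is needed because the default CRUSH rule replicates across hosts, which a single host can never satisfy. One way to handle it (an assumed approach; the pool name here is just an example) is a rule that replicates across OSDs instead:

```shell
# Create a replication rule whose failure domain is "osd" rather than "host"
microceph.ceph osd crush rule create-replicated single-node default osd

# Point an existing pool at it (replace "mypool" with your pool's name)
microceph.ceph osd pool set mypool crush_rule single-node
```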

For a more standard setup, you’d want 3 systems each with at least 1 disk or partition.
All my development and testing was done by running microceph inside of 3 LXD virtual machines, each with an additional disk attached for use by Ceph.
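A rough sketch of such a test environment, assuming a default LXD storage pool (volume names and sizes are just examples):

```shell
# Three Ubuntu VMs under LXD, each with an extra block device for Ceph
for i in 1 2 3; do
    lxc launch images:ubuntu/22.04 "cephnode${i}" --vm
    lxc storage volume create default "osd${i}" --type=block size=20GiB
    lxc storage volume attach default "osd${i}" "cephnode${i}"
done
```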

Once everything is configured, you can run microceph.ceph status to make sure it’s all good.
The resulting Ceph configuration and keyring can be found at /var/snap/microceph/current/conf/ and can be copied over to LXD or any other application supporting Ceph.
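For example, to point a regular Ceph client at the cluster (the exact keyring filename may vary, so check the directory first):

```shell
# Copy the MicroCeph-generated config and keyring to the usual client location
mkdir -p /etc/ceph
cp /var/snap/microceph/current/conf/ceph.conf /etc/ceph/
cp /var/snap/microceph/current/conf/*keyring* /etc/ceph/
```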

Demo video:


Hi @stgraber, this looks like amazing news. Can you give some more detail on microceph configuration?
Regards.

There’s not a ton of configuration to it really at this point.
microceph init will get it up and running, at which point you’ve got Ceph running with mon, mds, mgr and osd services, and those can be configured as normal through microceph.ceph.

These days it’s not recommended to edit the ceph.conf file directly; instead, you can use the config command, which lets you set configuration on specific daemons, machines, locations, …
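A few examples of that centralized config interface (the option and values here are illustrative, not recommendations):

```shell
# Set an option for all OSDs
microceph.ceph config set osd osd_memory_target 2147483648

# Set the same option for one specific daemon only
microceph.ceph config set osd.0 osd_memory_target 1073741824

# Review everything currently set in the cluster configuration database
microceph.ceph config dump
```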


Fine, nice to hear that. Thanks @stgraber.

What are the differences between this deployment and one you would expect to see in a production environment?

It’s a lot easier to deploy MicroCeph than a traditional production deployment.
MicroCeph takes care of the initial service placement for HA, so you don’t really have to think about that.

For differences, one internal difference is that MicroCeph doesn’t use LVM to label the disks.
Instead the disks are recorded in the MicroCeph database and have the OSD spawned directly on them. This saves us from having to also drive LVM and makes things a bit tidier.

In general, the goal is for MicroCeph to be usable in production as a way to very easily set up Ceph across any number of machines. The versions of the various Ceph daemons are identical to what you’d get through Ubuntu 22.04 LTS, as we’re actually consuming those packages.

So, would you consider this ready for a small production cluster right now? If not, what should I be watching to be completed before that?

I think we’ll want to wait for more users to play with this in homelabs and report back any obvious issues we didn’t see in our own use and testing before we feel confident telling folks to use this for small production sites.

So I’d probably want to give it another month at this point.

Hi,
I’m not sure whether I should post this here or create a new topic. Anyway, I set up a simple microceph environment but I can’t figure out this error; maybe someone can explain how to resolve it.
Regards.

root@cephnode1:~# ceph -s
Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')

root@cephnode1:~# microceph status
MicroCeph deployment summary:
- cephnode1 (10.193.206.20)
  Services: mds, mgr, mon, osd
  Disks: 1
- cephnode2 (10.193.206.21)
  Services: mds, mgr, mon, osd
  Disks: 1
- cephnode3 (10.193.206.22)
  Services: mds, mgr, mon, osd
  Disks: 1

Hmm, I think I found the exact problem: on cephnode1, snap.microceph.daemon shows active (running) but the logs print errors.

root@cephnode1:/var/log# systemctl status snap.microceph.daemon
● snap.microceph.daemon.service - Service for snap application microceph.daemon
     Loaded: loaded (/etc/systemd/system/snap.microceph.daemon.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-10-28 15:37:10 +03; 1s ago
   Main PID: 2176 (microcephd)
      Tasks: 8 (limit: 1117)
     Memory: 12.9M
        CPU: 595ms
     CGroup: /system.slice/snap.microceph.daemon.service
             └─2176 microcephd --state-dir /var/snap/microceph/common/state

Oct 28 15:37:10 cephnode1 systemd[1]: Started Service for snap application microceph.daemon.
root@cephnode1:/var/log# journalctl -f -u snap.microceph.daemon
Oct 28 15:40:20 cephnode1 systemd[1]: snap.microceph.daemon.service: Failed with result 'exit-code'.
Oct 28 15:40:20 cephnode1 systemd[1]: snap.microceph.daemon.service: Scheduled restart job, restart counter is at 32.
Oct 28 15:40:20 cephnode1 systemd[1]: Stopped Service for snap application microceph.daemon.
Oct 28 15:40:20 cephnode1 systemd[1]: Started Service for snap application microceph.daemon.
Oct 28 15:40:30 cephnode1 microceph.daemon[2648]: Error: Unable to start daemon: Daemon failed to start: Failed to re-establish cluster connection: context deadline exceeded
Oct 28 15:40:30 cephnode1 systemd[1]: snap.microceph.daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 28 15:40:30 cephnode1 systemd[1]: snap.microceph.daemon.service: Failed with result 'exit-code'.
Oct 28 15:40:31 cephnode1 systemd[1]: snap.microceph.daemon.service: Scheduled restart job, restart counter is at 33.
Oct 28 15:40:31 cephnode1 systemd[1]: Stopped Service for snap application microceph.daemon.
Oct 28 15:40:31 cephnode1 systemd[1]: Started Service for snap application microceph.daemon.

Are 3 machines OK with microceph? What happens if one of the machines goes down?

3 machines works just fine; Ceph can run degraded on 2 machines without trouble, you just don’t want to go any lower than that :slight_smile:

I run a production LXD+Ceph cluster in a datacenter on 3 servers where I do weekly rolling reboots for security updates, all services running on the remaining two servers run without any issue during the reboot of the 3rd server.
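For rolling reboots like that, one common Ceph pattern (an assumption on my part, not necessarily what’s used on this particular cluster) is to suppress rebalancing while a node is briefly down:

```shell
# Tell Ceph not to mark OSDs "out" (and start rebalancing) during the reboot
microceph.ceph osd set noout

reboot

# Once the node is back and its OSDs have rejoined the cluster:
microceph.ceph osd unset noout
```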


I see, I may try it then. For now I’m figuring out whether a 25G network with 2x2TB SAS drives is OK for it.

I encountered some difficulties but completed the task, and I’m impressed with the result. I installed 3 LXD VMs and added external disks to them. Here are some outputs from my setup.
Thanks for the effort, regards.

indiana@lxdserver:~$ lxc ls
+-----------+---------+------------------------+-------------------------------------------------+-----------------+-----------+
|   NAME    |  STATE  |          IPV4          |                      IPV6                       |      TYPE       | SNAPSHOTS |
+-----------+---------+------------------------+-------------------------------------------------+-----------------+-----------+
| cephnode1 | RUNNING | 10.193.206.20 (enp5s0) | fd42:b852:d5da:febc:216:3eff:fe50:9d14 (enp5s0) | VIRTUAL-MACHINE | 0         |
+-----------+---------+------------------------+-------------------------------------------------+-----------------+-----------+
| cephnode2 | RUNNING | 10.193.206.21 (enp5s0) | fd42:b852:d5da:febc:216:3eff:fe02:cd4e (enp5s0) | VIRTUAL-MACHINE | 0         |
+-----------+---------+------------------------+-------------------------------------------------+-----------------+-----------+
| cephnode3 | RUNNING | 10.193.206.22 (enp5s0) | fd42:b852:d5da:febc:216:3eff:fe23:f558 (enp5s0) | VIRTUAL-MACHINE | 0         |
+-----------+---------+------------------------+-------------------------------------------------+-----------------+-----------+
indiana@lxdserver:~$ lxc config show cephnode1
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu jammy amd64 (20221029_07:42)
  image.os: Ubuntu
  image.release: jammy
  image.serial: "20221029_07:42"
  image.type: disk-kvm.img
  image.variant: cloud
  volatile.base_image: 0f4ae1685fc659ce7b792f94dbef3d8e150254b4e7e8d150c5bfae80b252556c
  volatile.cloud-init.instance-id: 6de42753-9d50-4677-8b07-23bb6a951792
  volatile.eth0.host_name: tapabd00e33
  volatile.eth0.hwaddr: 00:16:3e:50:9d:14
  volatile.last_state.power: RUNNING
  volatile.uuid: a0b6947d-a3fb-4406-8072-a6a61b784314
  volatile.vsock_id: "37"
devices:
  data1:
    source: /dev/nvme0n1p1
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""
root@cephnode1:~# microceph cluster list
+-----------+--------------------+-------+------------------------------------------------------------------+--------+
|   NAME    |      ADDRESS       | ROLE  |                           FINGERPRINT                            | STATUS |
+-----------+--------------------+-------+------------------------------------------------------------------+--------+
| cephnode1 | 10.193.206.20:7000 | voter | b74f07ff41b40eaee6aff8341e61e582282346b07bf70dd73b2a8e3ff4ecec7f | ONLINE |
+-----------+--------------------+-------+------------------------------------------------------------------+--------+
| cephnode2 | 10.193.206.21:7000 | voter | ce5b121c8348d6d4481d3ec9c30998dec59c50a9a8ed49e0d49b4dff1bdfaace | ONLINE |
+-----------+--------------------+-------+------------------------------------------------------------------+--------+
| cephnode3 | 10.193.206.22:7000 | voter | 2e7a0500dd132023f561964380e97fca68322038fcea1dafe447fc05fa1ed90f | ONLINE |
+-----------+--------------------+-------+------------------------------------------------------------------+--------+
root@cephnode1:~# microceph disk list
Disks configured in MicroCeph:
+-----+-----------+----------------------------------------------------+
| OSD | LOCATION  |                        PATH                        |
+-----+-----------+----------------------------------------------------+
| 0   | cephnode2 | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_data1 |
+-----+-----------+----------------------------------------------------+
| 1   | cephnode3 | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_data1 |
+-----+-----------+----------------------------------------------------+
| 2   | cephnode1 | /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_data1 |
+-----+-----------+----------------------------------------------------+

Available unpartitioned disks on this system:
+-------+----------+------+------+
| MODEL | CAPACITY | TYPE | PATH |
+-------+----------+------+------+

And from the client point of view: I installed the ceph-common package on the client and, after the installation, copied ceph.conf and ceph.client.admin.keyring from one of the ceph nodes to the /etc/ceph directory. Here is the ceph status.

indiana@lxdserver:~$ ceph -s
  cluster:
    id:     61bfdca6-3de5-429b-a73a-fae9e912b8d9
    health: HEALTH_WARN
            3 osds down
            3 hosts (3 osds) down
            1 root (3 osds) down
            Reduced data availability: 1 pg inactive
 
  services:
    mon: 3 daemons, quorum cephnode1,cephnode2,cephnode3 (age 46m)
    mgr: cephnode2(active, since 47m), standbys: cephnode1, cephnode3
    osd: 3 osds: 0 up (since 16m), 3 in (since 2h)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             1 unknown

I installed the microceph snap and encountered the following warnings. Is this expected?
I am doing this on Ubuntu 22.04.1 with LXD 5.7/stable.

ubuntu@lxd-host-01:~$ sudo snap install microceph
2022-11-02T11:04:32+05:00 INFO snap "microceph" has bad plugs or slots: microceph (unknown interface "microceph")
2022-11-02T11:04:34+05:00 INFO snap "microceph" has bad plugs or slots: microceph (unknown interface "microceph")
microceph 0+git.499c15f from Canonical✓ installed
WARNING: There is 1 new warning. See 'snap warnings'.
ubuntu@lxd-host-01:~$ snap warnings
last-occurrence: today at 11:04 PKT
warning: |
  snap "microceph" has bad plugs or slots: microceph (unknown interface "microceph")

Those will go away pretty soon; we’re effectively waiting for a new snapd release that supports this interface.

It’s possible to avoid it by running the edge version of snapd with snap refresh snapd --edge

We will start testing this pretty much immediately, @stgraber.

I would hope to see the Juju charms follow along with this somehow for LXD, as they add the operational side of things for us. We are currently halfway through setting up a full-scale LXD cluster with Juju and have hit some bugs which we would like to work with you on.

Is this a plain Ceph installation? Nothing with orchestrators like Juju, cephadm, etc., right?