My Two Cents: LXD cluster with CEPH storage backend

Thanks to input from others on the Internet, especially the Linux Containers community, I have gathered enough information to finish my project. It is time to do my part and share my experience here. Hopefully, it will be helpful to someone.

I have been playing around with LXD for a couple of years. Recently, I tried to set up a cluster of LXD nodes using remote storage, and I chose Ceph as the storage backend.

First of all, this post is based on a video by Dimzrio, “Dimzrio Tutorial” on YouTube. As I am not a video person, I have transcribed its content and tested it here.

The system that I have set up consists of 4 nodes, each with a similar configuration:

  • Ubuntu 20.04
  • One RAID 1 drive for the OS
  • Two 2 TB hard drives for Ceph
  • Around 64 GB of RAM

(I used cssh to run the same commands on all nodes at once; you need to configure sshd to allow root access for that to work.)
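For reference, a cssh invocation is just a list of hosts; the names below are placeholders for your own nodes:

cssh root@node1 root@node2 root@node3 root@node4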

First things first: install the latest LXD
In Ubuntu 20.04, the LXD snap installed by default is 4.0. To upgrade it to the latest stable version, which is 4.8 at the time of writing:

sudo snap refresh lxd --channel=4.8/stable
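To confirm which version ended up installed, an optional quick check is:

snap list lxd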

Secondly, install Ceph on all nodes

sudo apt install ceph -y

In Node1, generate a UUID

uuidgen

Make a note of the UUID just generated.

In all nodes
Export the generated UUID using:

export cephuid=ABCDEFG <----- replace "ABCDEFG" with the UUID generated.

sudo vi /etc/ceph/ceph.conf

Insert the following into /etc/ceph/ceph.conf

[global]
fsid = ABCDEFG <---- replace "ABCDEFG" with the UUID generated
mon_initial_members = node1, node2, node3, node4
mon_host = node1_ip, node2_ip, node3_ip, node4_ip
public_network = 192.168.1.0/24 <---- replace this with your network
auth_cluster_required = none
auth_service_required = none
auth_client_required = none
osd_journal_size = 1024
osd_pool_default_size = 3
osd_pool_default_min_size = 2
osd_pool_default_pg_num = 333
osd_pool_default_pgp_num = 333
osd_crush_chooseleaf_type = 1
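Since the UUID is already exported as $cephuid, an optional shortcut (my own addition, not from the video) is to fill in the fsid line with sed and then double-check the result:

sudo sed -i "s/^fsid.*/fsid = $cephuid/" /etc/ceph/ceph.conf
grep fsid /etc/ceph/ceph.conf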

In Node1
Create the ceph monitor secret keys (on this node only)

ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon "allow *"
ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --cap mon "allow *" --cap mgr "allow *" --cap osd "allow *" --cap mds "allow *"
ceph-authtool --create-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring --gen-key -n client.bootstrap-osd --cap mon "profile bootstrap-osd" --cap mgr "allow r"
ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
ceph-authtool /tmp/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
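One small step the upstream Ceph manual-deployment guide also suggests here: make sure the monitor keyring is owned by the ceph user (worth repeating on every node once the file has been copied over), since ceph-mon --mkfs is run as that user later on:

sudo chown ceph:ceph /tmp/ceph.mon.keyring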

Generate ceph monitor map

monmaptool --create --add node1 node1_ip --add node2 node2_ip --add node3 node3_ip --add node4 node4_ip --fsid $cephuid /tmp/monmap
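You can optionally inspect the generated map before copying it around:

monmaptool --print /tmp/monmap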

Copy the monmap to all other nodes:

scp /tmp/monmap node2_ip:/tmp
scp /tmp/monmap node3_ip:/tmp
scp /tmp/monmap node4_ip:/tmp

Copy the ceph.client.admin.keyring to all other nodes:

scp /etc/ceph/ceph.client.admin.keyring node2_ip:/etc/ceph
scp /etc/ceph/ceph.client.admin.keyring node3_ip:/etc/ceph
scp /etc/ceph/ceph.client.admin.keyring node4_ip:/etc/ceph

Copy ceph.mon.keyring to all other nodes:

scp /tmp/ceph.mon.keyring node2_ip:/tmp
scp /tmp/ceph.mon.keyring node3_ip:/tmp
scp /tmp/ceph.mon.keyring node4_ip:/tmp
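If you prefer, the three copy blocks above can be collapsed into one small loop (node2_ip and friends are placeholders for your own addresses):

for ip in node2_ip node3_ip node4_ip; do
    scp /tmp/monmap /tmp/ceph.mon.keyring ${ip}:/tmp
    scp /etc/ceph/ceph.client.admin.keyring ${ip}:/etc/ceph
done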

Create data directory for monitor

In Node1
sudo -u ceph mkdir /var/lib/ceph/mon/ceph-node1
sudo -u ceph ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
ls /var/lib/ceph/mon/ceph-node1
systemctl restart ceph-mon@node1

In Node2
sudo -u ceph mkdir /var/lib/ceph/mon/ceph-node2
sudo -u ceph ceph-mon --mkfs -i node2 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
systemctl restart ceph-mon@node2

In Node3
sudo -u ceph mkdir /var/lib/ceph/mon/ceph-node3
sudo -u ceph ceph-mon --mkfs -i node3 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
systemctl restart ceph-mon@node3

In Node4
sudo -u ceph mkdir /var/lib/ceph/mon/ceph-node4
sudo -u ceph ceph-mon --mkfs -i node4 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
systemctl restart ceph-mon@node4
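With all four monitors restarted, it is worth checking that they have formed a quorum before moving on (this check is my own addition, not from the video):

sudo ceph mon stat

If this hangs or errors out, re-check /etc/ceph/ceph.conf and the ceph-mon services before continuing.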

Set up ceph manager

In Node1
ceph auth get-or-create mgr.node1 mon 'allow profile mgr' osd 'allow *' mds 'allow *'

Make a note of the displayed key

 sudo -u ceph mkdir /var/lib/ceph/mgr/ceph-node1
 sudo -u ceph nano /var/lib/ceph/mgr/ceph-node1/keyring

Insert the key into the file

[mgr.node1]
key =

ceph mon enable-msgr2
systemctl restart ceph-mgr@node1

In Node2
ceph auth get-or-create mgr.node2 mon 'allow profile mgr' osd 'allow *' mds 'allow *'

Make a note of the displayed key

sudo -u ceph mkdir /var/lib/ceph/mgr/ceph-node2
sudo -u ceph nano /var/lib/ceph/mgr/ceph-node2/keyring

Insert the key into the file

[mgr.node2]
key =

systemctl restart ceph-mgr@node2

In Node3
ceph auth get-or-create mgr.node3 mon 'allow profile mgr' osd 'allow *' mds 'allow *'

Make a note of the displayed key

sudo -u ceph mkdir /var/lib/ceph/mgr/ceph-node3
sudo -u ceph nano /var/lib/ceph/mgr/ceph-node3/keyring

Insert the key into the file

[mgr.node3]
key =

systemctl restart ceph-mgr@node3

In Node4
ceph auth get-or-create mgr.node4 mon 'allow profile mgr' osd 'allow *' mds 'allow *'

Make a note of the displayed key

sudo -u ceph mkdir /var/lib/ceph/mgr/ceph-node4
sudo -u ceph nano /var/lib/ceph/mgr/ceph-node4/keyring

Insert the key into the file

[mgr.node4]
key =

systemctl restart ceph-mgr@node4
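After all four managers are restarted, you can optionally confirm that one of them has become active; the mgr: line in the status output should show one active manager and the rest as standbys:

sudo ceph -s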

Set up NTP client

In Ubuntu 20.04, this can be done with timedatectl. On all nodes,

sudo timedatectl set-timezone Asia/Hong_Kong <---- change to your time zone

and make sure the clock is configured to sync with an NTP server.
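On a stock Ubuntu 20.04 install, systemd-timesyncd normally handles this already; a minimal check (assuming you stay with timesyncd rather than chrony) looks like this:

sudo timedatectl set-ntp true
timedatectl status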

Set up ceph osd

In all nodes:

sudo ceph-volume lvm create --data /dev/sdb (or the specific drive/LV; run once per data drive)
sudo systemctl restart ceph-osd@#

(# is the OSD ID, starting from 0; each OSD gets its own unique number.) In my case, I have

node1: ceph-osd@0, ceph-osd@1
node2: ceph-osd@2, ceph-osd@3
node3: ceph-osd@4, ceph-osd@5
node4: ceph-osd@6, ceph-osd@7
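If you lose track of which OSD ID ended up on which drive, ceph-volume can list the mapping on each node:

sudo ceph-volume lvm list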

If you did something wrong, you can remove the OSD with:

ceph osd out osd.#
ceph osd purge osd.# --force
ceph-volume lvm zap --destroy /dev/sd#

Set up ceph mds

In Node1
sudo -u ceph mkdir /var/lib/ceph/mds/ceph-node1 -p
ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-node1/keyring --gen-key -n mds.node1
ceph auth add mds.node1 osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-node1/keyring
systemctl restart ceph-mds@node1

In Node2
sudo -u ceph mkdir /var/lib/ceph/mds/ceph-node2 -p
ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-node2/keyring --gen-key -n mds.node2
ceph auth add mds.node2 osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-node2/keyring
systemctl restart ceph-mds@node2

In Node3
sudo -u ceph mkdir /var/lib/ceph/mds/ceph-node3 -p
ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-node3/keyring --gen-key -n mds.node3
ceph auth add mds.node3 osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-node3/keyring
systemctl restart ceph-mds@node3

In Node4
sudo -u ceph mkdir /var/lib/ceph/mds/ceph-node4 -p
ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-node4/keyring --gen-key -n mds.node4
ceph auth add mds.node4 osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-node4/keyring
systemctl restart ceph-mds@node4
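At this point the MDS daemons should have registered with the cluster. A quick check is below; note that they will sit in standby until a CephFS filesystem is created, which is outside the scope of this post:

ceph mds stat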

Add the following to /etc/ceph/ceph.conf on all nodes:

  [mds.node1]
  host = node1
  
  [mds.node2]
  host = node2
  
  [mds.node3]
  host = node3
  
  [mds.node4]
  host = node4

Restart all ceph services

In Node1,
systemctl restart ceph-mon@node1
systemctl restart ceph-mgr@node1
systemctl restart ceph-mds@node1
systemctl restart ceph-osd@0
systemctl restart ceph-osd@1

In Node2,
systemctl restart ceph-mon@node2
systemctl restart ceph-mgr@node2
systemctl restart ceph-mds@node2
systemctl restart ceph-osd@2
systemctl restart ceph-osd@3

In Node3,
systemctl restart ceph-mon@node3
systemctl restart ceph-mgr@node3
systemctl restart ceph-mds@node3
systemctl restart ceph-osd@4
systemctl restart ceph-osd@5

In Node4,
systemctl restart ceph-mon@node4
systemctl restart ceph-mgr@node4
systemctl restart ceph-mds@node4
systemctl restart ceph-osd@6
systemctl restart ceph-osd@7

Now, you can check the status of the newly configured Ceph cluster:

ceph -s

To check the osd tree,

ceph osd tree

Create a new pool,

sudo ceph osd pool create lxd-ceph 250

250 is the placement group count calculated for my setup; you may use a different value.
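If you want to sanity-check the placement group count for your own cluster, a commonly quoted rule of thumb (my own note, not from the video) is roughly (number of OSDs × 100) / replica count. With 8 OSDs and a pool size of 3:

echo $(( 8 * 100 / 3 ))   # prints 266

which is in the same ballpark as the 250 used here.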

If Ceph is running fine, it is time to initialise LXD.

Initialise LXD

In Node1,

sudo lxd init

Answer the questions as follows:

Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=node1]:
What IP address or DNS name should be used to reach this node? [default=node1_ip]:
Are you joining an existing cluster? (yes/no) [default=no]: no
Setup password authentication on the cluster? (yes/no) [default=yes]: yes
Do you want to configure a new local storage pool? (yes/no) [default=yes]: no
Do you want to configure a new remote storage pool? (yes/no) [default=no]: yes
Name of the storage backend to use (ceph, cephfs) [default=ceph]: ceph
Create a new CEPH pool? (yes/no) [default=yes]:
Name of the existing CEPH cluster [default=ceph]:
Name of the OSD storage pool [default=lxd]: lxd-ceph
Number of placement groups [default=32]: 250
Would you like to connect to a MAAS server? (yes/no) [default=no]:
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]:
Would you like to create a new Fan overlay network? (yes/no) [default=yes]:
What subnet should be used as the Fan underlay? [default=auto]:
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]:
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:

LXD should now be up and running. Initialise LXD on the other nodes, answering YES to “Are you joining an existing cluster?”.
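Once every node has joined, you can confirm from any member that the cluster and the Ceph-backed storage pool are visible:

lxc cluster list
lxc storage list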

Well, that is it! I hope that you will enjoy this.

Regards,
Terry Ng.


I've been looking to provide keep-alive in my app for Ceph-based systems, but I've been too lazy to put all the Ceph stuff together. This looks like what I need. Awesome work, thanks a lot 😊

You are welcome. Please let me know if there are any mistakes or possible improvements.

Just tried this guide and it works fine.

My environment is composed of three virtual machines (as I didn't have free devices on the host to pass through), all running Ubuntu Server 22.04, with LXD 5.0.2.

To test, I successfully created a container and a volume, and also mounted a CephFS filesystem on top of the cluster.

Thanks for sharing this @terryng !