LXD cluster on Raspberry Pi 4

stgraber · October 2, 2020, 4:56pm

Introduction

Would you like a very compact, silent, yet surprisingly powerful home lab capable of running containers or virtual machines, accessible from any system on your network?

That’s what’s possible these days using very cheap ARM boards like the Raspberry Pi.
Those boards have become more and more powerful over the years and are now even capable of running full virtual machines. Combined with LXD’s ability to cluster systems together, it’s now easier than ever to set up a lab which can be easily grown in the future.

Ubuntu has now released a dedicated LXD appliance targeting both the Raspberry Pi 4 and traditional Intel systems.

Hardware

In this setup, I’ll be using 3 of the newest Raspberry Pi 4 in their 8GB configuration.
The same will work fine on the 2GB or 4GB models, but if you intend to run virtual machines, try to stick to 4GB or 8GB models.

All 3 boards are connected to the same network and you’ll need an HDMI capable display and a USB keyboard for the initial setup. Once that’s done, everything can be done remotely over SSH and the LXD API.

It’s certainly possible to get this going on a single board or on far more than 3 boards, but 3 is the minimum number for the LXD database to be highly available.
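To see why three is the threshold, here is a small sketch of the arithmetic. It assumes the database members need a strict majority to agree, as in Raft-style consensus, which LXD's distributed database builds on. The function name is made up for illustration.

```shell
# Number of database members that can fail while a strict majority
# of the original members is still reachable.
tolerated_failures() {
    members="$1"
    echo $(( (members - 1) / 2 ))
}

tolerated_failures 1   # a single database member cannot lose anything
tolerated_failures 3   # three members can lose one and keep going
```

With two database members the result is still zero, which is why going from one board to two does not buy any availability for the database.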

Also worth noting: in my setup, I didn’t have fast USB 3.1 external drives to use with this cluster, so I’m just using a loop file on the microSD card.
This makes for rather slow and small storage. If you have access to fast external storage, plug that in and specify it as an “existing empty disk or partition” below.

Installation

First, you’ll need to download the LXD appliance image from the “Install LXD on a Raspberry Pi | Ubuntu” page and follow the instructions there to write it to a microSD card and load it on your Raspberry Pi boards.

Once booted, each of the boards will ask you for your Ubuntu account. It will then create your user and import your SSH key. At the end, you’re presented with the IP address of the board and you can SSH into it.

Configuration

The appliance image is set up for standalone use, but we want to cluster the boards together, so we need to undo a bit of the configuration that was automatically applied.

SSH into each of your boards and run:

sudo lxc profile device remove default root
sudo lxc profile device remove default eth0
sudo lxc storage delete local
sudo lxc config unset core.https_address
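If you would rather not log in to each board by hand, a loop along these lines could apply the same four commands over SSH. This is only a sketch: the addresses in BOARDS are the ones from this post and need replacing with your own (a user@host form works too), the reset_board helper and DRY_RUN switch are made up for illustration, and it assumes your user can run sudo without a password prompt on the boards.

```shell
#!/bin/sh
# Addresses of the three boards (taken from this post; replace with yours).
BOARDS="10.166.11.235 10.166.11.92 10.166.11.200"

# Print (dry run) or execute the four reset commands on one board over SSH.
# DRY_RUN defaults to 1, so nothing is changed until you set DRY_RUN=0.
reset_board() {
    host="$1"
    for cmd in \
        "lxc profile device remove default root" \
        "lxc profile device remove default eth0" \
        "lxc storage delete local" \
        "lxc config unset core.https_address"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $host sudo $cmd"
        else
            ssh "$host" "sudo $cmd" || return 1
        fi
    done
}

for board in $BOARDS; do
    reset_board "$board"
done
```

Run it once as-is to review the commands it would send, then again with DRY_RUN=0 in the environment to actually apply them.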

Then on the first board, run sudo lxd init and go through the steps as shown below:

Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=localhost]: rpi01
What IP address or DNS name should be used to reach this node? [default=10.166.11.235]: 
Are you joining an existing cluster? (yes/no) [default=no]: 
Setup password authentication on the cluster? (yes/no) [default=yes]: 
Trust password for new clients: 
Again: 
Do you want to configure a new local storage pool? (yes/no) [default=yes]: 
Name of the storage backend to use (btrfs, dir, lvm) [default=btrfs]:
Create a new BTRFS pool? (yes/no) [default=yes]:
Would you like to use an existing empty disk or partition? (yes/no) [default=no]: 
Size in GB of the new loop device (1GB minimum) [default=5GB]: 20GB
Do you want to configure a new remote storage pool? (yes/no) [default=no]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: yes
Name of the existing bridge or host interface: eth0
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:

That’s it, you have a cluster (of one system) with networking and storage configured. Now, let’s join the other two boards by running sudo lxd init on them too:

stgraber@localhost:~$ sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=localhost]: rpi02
What IP address or DNS name should be used to reach this node? [default=10.166.11.92]: 
Are you joining an existing cluster? (yes/no) [default=no]: yes
IP address or FQDN of an existing cluster node: 10.166.11.235
Cluster fingerprint: b9d2523a4935474c4a52f16ceb8a44e80907143e219a3248fbb9f5ac5d53d926
You can validate this fingerprint by running "lxc info" locally on an existing node.
Is this the correct fingerprint? (yes/no) [default=no]: yes
Cluster trust password: 
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "source" property for storage pool "local": 
Choose "size" property for storage pool "local": 20GB
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
stgraber@localhost:~$

And validate that everything looks good by running sudo lxc cluster list on any of them:

stgraber@localhost:~$ sudo lxc cluster list
+-------+----------------------------+----------+--------+-------------------+--------------+
| NAME  |            URL             | DATABASE | STATE  |      MESSAGE      | ARCHITECTURE |
+-------+----------------------------+----------+--------+-------------------+--------------+
| rpi01 | https://10.166.11.235:8443 | YES      | ONLINE | fully operational | aarch64      |
+-------+----------------------------+----------+--------+-------------------+--------------+
| rpi02 | https://10.166.11.92:8443  | YES      | ONLINE | fully operational | aarch64      |
+-------+----------------------------+----------+--------+-------------------+--------------+
| rpi03 | https://10.166.11.200:8443 | YES      | ONLINE | fully operational | aarch64      |
+-------+----------------------------+----------+--------+-------------------+--------------+
stgraber@localhost:~$

And you’re done. Chances are you won’t need to SSH into those boards ever again; everything else can now be done remotely through the LXD command line client or API from any Linux, Windows or macOS system.

Operation

Now, on any system from which you want to use this newly set up cluster, run the following (updating the IP to match that of any of your boards):

stgraber@castiana:~$ lxc remote add my-cluster 10.166.11.235
Certificate fingerprint: b9d2523a4935474c4a52f16ceb8a44e80907143e219a3248fbb9f5ac5d53d926
ok (y/n)? y
Admin password for my-cluster: 
Client certificate stored at server:  my-cluster
stgraber@castiana:~$ lxc remote switch my-cluster
stgraber@castiana:~$ lxc cluster list
+-------+----------------------------+----------+--------+-------------------+--------------+----------------+
| NAME  |            URL             | DATABASE | STATE  |      MESSAGE      | ARCHITECTURE | FAILURE DOMAIN |
+-------+----------------------------+----------+--------+-------------------+--------------+----------------+
| rpi01 | https://10.166.11.235:8443 | YES      | ONLINE | fully operational | aarch64      |                |
+-------+----------------------------+----------+--------+-------------------+--------------+----------------+
| rpi02 | https://10.166.11.92:8443  | YES      | ONLINE | fully operational | aarch64      |                |
+-------+----------------------------+----------+--------+-------------------+--------------+----------------+
| rpi03 | https://10.166.11.200:8443 | YES      | ONLINE | fully operational | aarch64      |                |
+-------+----------------------------+----------+--------+-------------------+--------------+----------------+

Whenever you want to interact with your local system rather than the remote cluster, just run:

lxc remote switch local

Because no 64-bit Arm VM images are currently set up for secureboot, let’s disable it altogether with:

lxc profile set default security.secureboot false

And now let’s start some containers and virtual-machines:

lxc launch images:alpine/edge c1
lxc launch images:archlinux c2
lxc launch images:ubuntu/18.04 c3
lxc launch images:ubuntu/20.04/cloud v1 --vm
lxc launch images:fedora/32/cloud v2 --vm
lxc launch images:debian/11/cloud v3 --vm

The initial launch operations will be rather slow, especially if using the main microSD card for storage. However, creating more of those instances should then be quite quick, since the image is then already available.

stgraber@castiana:~$ lxc list
+------+---------+------------------------+---------------------------------------------------+-----------------+-----------+----------+
| NAME |  STATE  |          IPV4          |                       IPV6                        |      TYPE       | SNAPSHOTS | LOCATION |
+------+---------+------------------------+---------------------------------------------------+-----------------+-----------+----------+
| c1   | RUNNING | 10.166.11.113 (eth0)   | fd42:4c81:5770:1eaf:216:3eff:fed7:35c0 (eth0)     | CONTAINER       | 0         | rpi01    |
+------+---------+------------------------+---------------------------------------------------+-----------------+-----------+----------+
| c2   | RUNNING | 10.166.11.88 (eth0)    | fd42:4c81:5770:1eaf:216:3eff:feaa:81ac (eth0)     | CONTAINER       | 0         | rpi02    |
+------+---------+------------------------+---------------------------------------------------+-----------------+-----------+----------+
| c3   | RUNNING | 10.166.11.146 (eth0)   | fd42:4c81:5770:1eaf:216:3eff:fe79:75a5 (eth0)     | CONTAINER       | 0         | rpi03    |
+------+---------+------------------------+---------------------------------------------------+-----------------+-----------+----------+
| v1   | RUNNING | 10.166.11.200 (enp5s0) | fd42:4c81:5770:1eaf:216:3eff:fe61:1ad6 (enp5s0)   | VIRTUAL-MACHINE | 0         | rpi01    |
+------+---------+------------------------+---------------------------------------------------+-----------------+-----------+----------+
| v2   | RUNNING | 10.166.11.238 (enp5s0) | fd42:4c81:5770:1eaf:216:3eff:fe8d:e6ae (enp5s0)   | VIRTUAL-MACHINE | 0         | rpi02    |
+------+---------+------------------------+---------------------------------------------------+-----------------+-----------+----------+
| v3   | RUNNING | 10.166.11.33 (enp5s0)  | fd42:4c81:5770:1eaf:216:3eff:fe86:526a (enp5s0)   | VIRTUAL-MACHINE | 0         | rpi03    |
+------+---------+------------------------+---------------------------------------------------+-----------------+-----------+----------+

From there on, everything should feel pretty normal. You can use lxc exec to directly run commands inside those instances, lxc console to access the text or VGA console, …

Conclusion

LXD makes for a very easy and flexible solution to set up a lab, be it at home with a few Raspberry Pi boards, in the cloud using public cloud instances or on any spare hardware you may have around.

Adding additional servers to a cluster is fast and simple and LXD clusters even support mixed architectures, allowing you to mix Raspberry Pi 4 and Intel NUCs into a single cluster capable of running both Intel and Arm workloads.

Everything behaves mostly the same way as running on your laptop or desktop computer, but it can now be accessed by multiple users from any system with network access.

From this you could grow to a much larger cluster setup, using projects to handle multiple distinct uses of the cluster, attaching remote storage, using virtual networking, integrating with MAAS or with Canonical RBAC, … There are a lot of options which can be progressively added to a setup like this as you feel the need for it.

Bill_Knaffl · October 3, 2020, 2:32pm

This is a great concept. The only area in which I was concerned was that of the ubuntu cloud account. Is that needed or are there ways to do this without the ubuntu cloud account tie in?

I wanted to do something like this inside the network, but the idea of the cloud connection seems counter to the “inside” concept.

stgraber · October 3, 2020, 2:48pm

Yeah, that’s a limitation of Ubuntu Core I believe.

You can do the exact same as above using a traditional Ubuntu Server image instead:

This will not require any account and should come with the LXD snap preinstalled but unconfigured (so you won’t need the initial remove/delete/unset commands in this post). The rest will behave the same.

Bill_Knaffl · October 3, 2020, 2:53pm

I was looking at the concept. I had been playing around with different methods of clustering four Raspberry Pi 4 (4GB) boards together for “a home lab” and a friend pointed me here. GREAT write-up.

I will probably try both methods, but if I go with the Ubuntu side, I will likely drop it on the IoT LAN so it’s not on the internal LAN (likely a good place to play anyway)!

Update: Downloaded the Ubuntu Server image you pointed me to. I will update you as to how it goes.

Christhepow · October 3, 2020, 8:41pm

Was looking into doing this with proxmox instead. No external account required. They have a write-up on how as well. Like having multiple options though. Great write-up.

Bill_Knaffl · October 5, 2020, 12:56am

So I had a few issues with “Error: Failed to join cluster: Failed request to add member: The joining server version doesn’t (expected 4.0.2 with API count 189)”
But I didn’t find an awful lot on that specifically. So, I reloaded. With one of the two nodes, a reload seemed to work on the first shot; the second took two shots.

Bill_Knaffl · October 5, 2020, 1:18am

BTW - this was with the UBUNTU server

stgraber · October 5, 2020, 2:13am

An LXD cluster has either one database server (non-HA mode) or three database servers providing HA for the database.

So your output is quite normal for a 4-node cluster.

The version mismatch error you got is what I would expect if the existing and the joining servers were running different versions. For clustering to work, all servers must be on the same version of LXD, down to the bugfix release.

One way to make sure of that would be to run snap refresh lxd prior to attempting a join.
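As a rough pre-flight check before a join, the version strings reported by each server can be compared. The helper below is a sketch with a made-up name; it only compares strings for exact equality and does not look at the API count mentioned in the error message.

```shell
# Succeeds only when every argument is the same non-empty version string.
all_same_version() {
    first="$1"
    [ -n "$first" ] || return 1
    for v in "$@"; do
        [ "$v" = "$first" ] || return 1
    done
}

# Usage sketch, left commented out. rpi01 and rpi02 are placeholder host names.
# v1=$(ssh rpi01 lxd --version)
# v2=$(ssh rpi02 lxd --version)
# all_same_version "$v1" "$v2" || echo "mismatch: run 'sudo snap refresh lxd' on both"
```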

Bill_Knaffl · October 5, 2020, 2:34am

Thanks! So being a TOTAL n00b (if not already abundantly clear), what is the fastest learning path?

I started down the clustering path aiming to learn if there was a faster way to build a password cracking rig in the Raspberry Pi world. Now that I have a working cluster, it’s time to expand into that arena.

stgraber · October 5, 2020, 2:50am

Well, now you can run anything you want on there, in containers or virtual-machines.
For something like password cracking, the most efficient would probably be a single container per cluster member. You’d then want to split the dictionary or brute force space so that they each get a quarter of the total, but that’s really up to you and what you want to run on it.
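Splitting a dictionary into equal shares can be done with standard tools before pushing each share into its container. The sketch below assumes a plain text wordlist with one candidate per line and a trailing newline; the function name, file names and instance names are placeholders.

```shell
# Split a wordlist into at most $2 chunks of whole lines, one per cluster member.
# Output files are named <prefix>aa, <prefix>ab, ... (standard split naming).
split_wordlist() {
    wordlist="$1"; parts="$2"; prefix="$3"
    total=$(( $(wc -l < "$wordlist") ))
    per_chunk=$(( (total + parts - 1) / parts ))   # round up so no line is dropped
    [ "$per_chunk" -gt 0 ] || return 1
    split -l "$per_chunk" "$wordlist" "$prefix"
}

# Usage sketch, left commented out (c1 is a placeholder instance name):
# split_wordlist wordlist.txt 4 chunk-
# lxc file push chunk-aa c1/root/wordlist.txt
```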

LRP · November 2, 2020, 6:46pm

I’m considering System76 NUCs for my LXD appliance cluster. I can provision each with 64GB RAM, one 500GB (or larger) NVMe SSD OS drive and one 500GB (or larger) SSD.

Two questions:

What formula or metrics can I use to determine how many LXC containers I can support?
What would be the best way to configure reliable storage?

Thanks,

LRP

stgraber · November 2, 2020, 7:33pm

It’s going to be very dependent on what runs in the containers as that’s really what will eat your CPU and memory.

You can stuff 1000-2000 Ubuntu containers that do nothing with minimal CPU usage and only a few GBs of RAM, but that’s not really all that useful.

The kernel does start getting a bit slow when reaching several thousand containers but again, that’s not really a problem many people have.

For storage, I’d recommend ZFS with RAID-1 and then configure LXD to store everything on it. At that point you need very little extra space for the host OS itself.

You could probably get two 1TB NVMe drives and then set up everything to be redundant:

EFI partition (needs manual sync between the two drives) - 500MB
Root filesystem on software RAID (mdadm) - 25GB
ZPOOL with native RAIDZ1 - 970GB

That way you should be tolerant to either drive dying at any point in time.

papa1980 · November 7, 2020, 9:45am

Dear all,

I created my cluster not by following this guide but by following the standard LXD cluster configuration guide. However, I am on WiFi at the moment. When I try to stop an instance with the command lxc stop name_of_machine, it hangs.
Any ideas? I am using Ubuntu Desktop 20.10.

Best regards,
Nini

papa1980 · November 7, 2020, 6:08pm

Hello all,

I followed your guide and it works, but the containers don’t have internet access. How do they access the internet?

root@localhost:~# lxc list
+--------+---------+------+------+-----------+-----------+-----------+
|  NAME  |  STATE  | IPV4 | IPV6 |   TYPE    | SNAPSHOTS | LOCATION  |
+--------+---------+------+------+-----------+-----------+-----------+
| proxis | RUNNING |      |      | CONTAINER | 0         | localhost |
+--------+---------+------+------+-----------+-----------+-----------+
| web1   | RUNNING |      |      | CONTAINER | 0         | pi2       |
+--------+---------+------+------+-----------+-----------+-----------+
| web2   | RUNNING |      |      | CONTAINER | 0         | localhost |
+--------+---------+------+------+-----------+-----------+-----------+
root@localhost:~# lxc cluster list
+-----------+--------------------------+----------+--------+-------------------+--------------+
|   NAME    |           URL            | DATABASE | STATE  |      MESSAGE      | ARCHITECTURE |
+-----------+--------------------------+----------+--------+-------------------+--------------+
| localhost | https://192.168.0.5:8443 | YES      | ONLINE | fully operational | aarch64      |
+-----------+--------------------------+----------+--------+-------------------+--------------+
| pi2       | https://192.168.0.2:8443 | NO       | ONLINE | fully operational | aarch64      |
+-----------+--------------------------+----------+--------+-------------------+--------------+
root@localhost:~#

root@localhost:~# lxc config show --expanded proxis
architecture: aarch64
config:
  image.architecture: arm64
  image.description: Ubuntu focal arm64 (20201107_07:42)
  image.os: Ubuntu
  image.release: focal
  image.serial: "20201107_07:42"
  image.type: squashfs
  image.variant: default
  security.secureboot: "false"
  volatile.base_image: eebf1b0bbf608bfa5754e33b7310c02422d56d775e18c1494dccaf71068038b9
  volatile.eth0.hwaddr: 00:16:3e:f1:b9:af
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: STOPPED
devices:
  eth0:
    name: eth0
    nictype: macvlan
    parent: wlan0
    type: nic
  root:
    path: /
    pool: local
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

–

root@localhost:~# lxc exec proxis bash
root@proxis:~# ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
11: eth0@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:f1:b9:af brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:fef1:b9af/64 scope link
       valid_lft forever preferred_lft forever
root@proxis:~#

Best regards
Nini

stgraber · November 7, 2020, 7:28pm

You can’t use macvlan with wifi devices.
This tutorial was meant for wired networking.

If wifi is your only option, ipvlan may work but will usually need a more involved configuration in the instances.

(The short version of why macvlan can’t work is that WiFi ties access to the MAC address of the connected device. macvlan gives every instance a separate MAC on your network, so your WiFi card or your access point will drop all traffic from the instances.)
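A quick way to catch this before picking a NIC type is to check whether the parent interface is a wireless device. The sketch below relies on an assumption about how Linux commonly exposes wireless interfaces in sysfs (a wireless directory or a phy80211 entry under the interface); the function names and the SYSFS_NET override are made up for illustration.

```shell
# Succeeds when the interface looks like a WiFi device.
# SYSFS_NET can be overridden for testing; the two markers checked are an
# assumption about the usual sysfs layout, not something LXD itself defines.
is_wireless() {
    dir="${SYSFS_NET:-/sys/class/net}/$1"
    [ -d "$dir/wireless" ] || [ -e "$dir/phy80211" ]
}

# Suggest a NIC type for a given parent interface.
suggest_nictype() {
    if is_wireless "$1"; then
        echo "ipvlan"     # macvlan traffic would be dropped on WiFi
    else
        echo "macvlan"
    fi
}

# Usage sketch, left commented out:
# suggest_nictype wlan0
```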

stgraber · November 7, 2020, 7:28pm

Worth noting that ipvlan doesn’t work with virtual machines.

If you can somehow use wired, that’s really your best option here.

papa1980 · November 7, 2020, 7:40pm

Hello Stephane,

First of all, thanks for the information and the fast response. Great tutorial. This is sad news; I will have network cables next week.

I tried the following:
I added a bridge, but I have no idea why its state is pending:
root@localhost:~# lxc network list
+--------+----------+---------+-------------+---------+---------+
|  NAME  |   TYPE   | MANAGED | DESCRIPTION | USED BY |  STATE  |
+--------+----------+---------+-------------+---------+---------+
| br1-c1 | bridge   | YES     |             | 1       | PENDING |
+--------+----------+---------+-------------+---------+---------+
| eth0   | physical | NO      |             | 0       |         |
+--------+----------+---------+-------------+---------+---------+
| wlan0  | physical | NO      |             | 3       |         |
+--------+----------+---------+-------------+---------+---------+
root@localhost:~#

Then I tried to add a device in order to get an IP address:

lxc config device add proxis eth0 nic nictype=bridged parent=br1-c1 name=eth0

root@localhost:~# lxc config device list proxis
eth0
root@localhost:~#

When I try to start the container, it gives me:

root@localhost:~# lxc start proxis
Error: Common start logic: Failed to start device "eth0": Parent device "br1-c1" doesn't exist
Try `lxc info --show-log proxis` for more info

I have no idea why the bridge device doesn’t exist; maybe it’s because the network is pending.
Any idea how to change its status?

I spent several evenings trying to make a cluster work with Ubuntu workstation, but after the cluster was configured, stopping or starting an LXC container was not possible; they just hung.
Now with your method it is working, but I don’t want to wait for cables.

Best regards,
nini

stgraber · November 7, 2020, 8:06pm

So the problem with a normal local bridge is that you won’t be able to access the instances from the outside nor will you be able to access the instances from another one of the clustered systems.

Instead you could do:

lxc network delete br1-c1
lxc network create lxdfan0 --target pi2
lxc network create lxdfan0 --target localhost
lxc network create lxdfan0 bridge.mode=fan
lxc profile device remove proxis eth0
lxc profile device add proxis eth0 nic network=lxdfan0 name=eth0

This will let all your instances get IP and DNS. They’ll be able to reach each other, and both the localhost and pi2 systems will be able to reach them.

However, they won’t have an address on your LAN (192.168.0.x), so they won’t be reachable from other systems. For that part, you really want wired networking and macvlan.

papa1980 · November 7, 2020, 8:16pm

Hello Stephane,

Thanks for your fast reply. LXD has great support.
I executed the commands you gave me, but now the network is in state CREATED and I can’t add it as a device to the container:

234 lxc config remove proxis eth0
235 lxc config remove
236 lxc config device
237 lxc config device remove proxis eth0
238 lxc network delete br1-c1
239 lxc network create lxdfan0 --target pi2
240 lxc network create lxdfan0 --target localhost
241 lxc network create lxdfan0 bridge.mode=fan
242 lxc profile device remove proxis eth0
243 lxc profile device add proxis eth0 nic network=lxdfan0 name=eth0
244 ip address show
245 lxc list
246 lxc profile device add proxis eth0 nic network=lxdfan0 name=eth0
247 lxc network list
248 history -10
249 history 10
250 history 20
root@localhost:~# lxc profile device add proxis eth0 nic network=lxdfan0 name=eth0
Error: Fetch profile: No such object
root@localhost:~# lxc network list
+---------+----------+---------+-------------+---------+---------+
|  NAME   |   TYPE   | MANAGED | DESCRIPTION | USED BY |  STATE  |
+---------+----------+---------+-------------+---------+---------+
| eth0    | physical | NO      |             | 0       |         |
+---------+----------+---------+-------------+---------+---------+
| lxdfan0 | bridge   | YES     |             | 0       | CREATED |
+---------+----------+---------+-------------+---------+---------+
| wlan0   | physical | NO      |             | 4       |         |
+---------+----------+---------+-------------+---------+---------+
root@localhost:~#
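
The “Fetch profile: No such object” error is what one would expect when the name passed to lxc profile device is an instance rather than a profile: proxis is a container here, and no profile of that name exists. A likely fix, sketched below under that assumption, is to attach the NIC through lxc config device instead. The attach_nic helper and its DRY_RUN switch are made up for illustration.

```shell
# proxis is an instance, not a profile, so the device is managed with
# "lxc config device" rather than "lxc profile device".
attach_nic() {
    instance="$1"; network="$2"
    # DRY_RUN=1 (the default here) only prints the command.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "lxc config device add $instance eth0 nic network=$network name=eth0"
    else
        lxc config device add "$instance" eth0 nic network="$network" name=eth0
    fi
}

attach_nic proxis lxdfan0
```

The CREATED state shown for lxdfan0 suggests the network itself was set up successfully on the cluster, so only the device attachment should need redoing.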