I understand that Incus needs at least 3 hosts in a cluster so that a majority can still be found if one host is lost and comes back later, but maybe it doesn’t need more than 2 hosts to run its workload. Would it be possible to run Incus on 2 “heavy duty” control + worker nodes, plus an additional control host (e.g. a small “entry level” server without huge storage and CPU power) whose only job is to make sure a majority can be found?
In a cluster setup of three nodes, they all share the distributed database and it’s up to the admin to specify where to launch the containers/VMs.
It should be possible to just avoid launching containers/VMs on the lightweight server, but keep it for the quorum of the cluster.
Someone more knowledgeable can verify whether the processing needs for the distributed database can be satisfied with a lightweight server.
In fact if you can put the lightweight server on a small UPS, it should be good for the resilience of the cluster.
From a purely technical point of view, I don’t see any reason why this should not work. But there are a number of reasons why it is not a good idea in my opinion:
- One main purpose of running a cluster instead of a bunch of individual servers is redundancy: You can shut down one of the hosts (for maintenance, or if there is an error) without disrupting service (or at least without disrupting service for longer than it takes to move the hosted VMs/containers away from that host). But that means you need enough reserves on the other hosts to take up the slack. Which means with a two-host-cluster, you can only have a standard load of 50% on each host, whereas with a three-host-cluster, you can load each node to 66%.
- When you run an Incus cluster with “standard” storage (BTRFS, ZFS, LVM), you need to move the stored data whenever you move a VM/container. This will take quite a while (several minutes up to hours, depending on how much data has to be transferred). To avoid that, you’ll probably want to use Ceph as the storage backend. Ceph also requires at least three nodes (specifically, three “mon” nodes), and strongly encourages you to have at least three storage nodes (“osd” nodes) on different servers as well. That means your “entry level” server needs to have at least enough oomph to handle Ceph.
- And finally, having identical (or almost identical) configuration on all of the hosts makes them easier to administer.
All in all, I’d recommend going with a 3-node cluster of identical nodes.
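The 50% and 66% figures in the first point follow from simple arithmetic. Here is a minimal sketch, assuming identical nodes and a workload that can be spread evenly over the surviving hosts (the function name `max_safe_load` is made up for illustration):

```python
def max_safe_load(nodes: int, failures: int = 1) -> float:
    """Highest per-node load (as a fraction of capacity) that still fits
    on the remaining nodes after `failures` nodes go offline.

    Assumes identical nodes and an evenly redistributable workload.
    """
    if nodes <= failures:
        raise ValueError("need more nodes than tolerated failures")
    # The total load (nodes * load) must fit on (nodes - failures) hosts.
    return (nodes - failures) / nodes


for n in (2, 3, 4, 5):
    print(f"{n} nodes: up to {max_safe_load(n):.0%} load per node")
```

With asymmetric hardware, such as the small third host from the original question, the calculation would have to be done per host instead.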
Edit: Fixed typos
Just some thoughts. I have been running clusters since day one in LXD and Incus is no different, though infinitely more reliable.
For mission-critical applications with paying customers, you need 4,
because if one goes down you still need three working so you can access your cluster and containers properly. So does that mean you need 4 servers? Sort of.
You can pick up something like this for a few hundred dollars
I have 3 of these at two different customers. If I need to do maintenance, I move the containers over to another unit and service the node. If a node’s CPU, RAM or motherboard is bad, I swap its disks into a hot spare.
The only problem is keeping the software on all the machines at the same revision, and sometimes getting stuck while doing so.
In a cluster, you had better have lots of extra disk space and RAM, because you may need to move containers over in order to service a machine. BTW, sometimes instead of moving a whole machine with terabytes of data, it is quicker to swap disks than to do a cluster move. I also learned to keep my container drive separate from my boot drive.
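To get a feel for why swapping disks can beat a move over the network, here is a rough back-of-the-envelope sketch. The `efficiency` factor of 0.8 is an assumption to account for protocol and storage overhead, not a measured value, and the function name is made up:

```python
def transfer_seconds(data_bytes: float, link_bits_per_s: float,
                     efficiency: float = 0.8) -> float:
    """Estimate how long copying `data_bytes` over a network link takes.

    `efficiency` is an assumed fraction of the nominal link speed that is
    actually usable; real numbers depend on storage and protocol overhead.
    """
    return data_bytes * 8 / (link_bits_per_s * efficiency)


one_terabyte = 1e12  # bytes
for name, speed in (("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)):
    hours = transfer_seconds(one_terabyte, speed) / 3600
    print(f"1 TB over {name}: roughly {hours:.1f} h")
```

Under these assumptions a terabyte takes a few hours on a 1 Gbit/s link and well under an hour on 10 Gbit/s, so for several terabytes a disk swap can easily be the faster option.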
Not strictly true. I have a three node setup running here (also for years, and admittedly still running LXD until I finally get the time to migrate it to Incus) which works fine with one host down. It will crash and burn though when two hosts go down at the same time - so I’d recommend against leaving it in a degraded state for long…
Yep, that is a problem. Especially when the database version changes, and the newly upgraded nodes decide they need to wait until all the other nodes have been upgraded as well before resuming operations.
From my experience, the order of importance is
- Network connection between the nodes. Have at least 10Gbit between each of the nodes (might be specific to Ceph though, I don’t have experience with clusters with other storage backends)
- RAM. Can’t have enough.
- Storage speed (also might be Ceph specific: Ceph does have quite a bit of overhead, especially on writes).
- CPU.
Note that a 4-node cluster also crashes and burns if two nodes are down (because you need greater than 50% active to form a quorum). So if you want to allow for two nodes being down, you need to build a 5-node cluster.
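The majority rule described here can be checked with a few lines. This is a minimal sketch of the generic "more than 50%" rule only; it does not model how Incus assigns database roles to cluster members, and the function names are made up:

```python
def quorum(voters: int) -> int:
    """Smallest number of voters that is strictly more than half."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can be down while a majority is still reachable."""
    return voters - quorum(voters)


for n in (3, 4, 5):
    print(f"{n} voters: quorum {quorum(n)}, "
          f"tolerates {tolerated_failures(n)} down")
```

Going from 3 to 4 voters raises the quorum but not the number of tolerated failures, which is why the next useful step up is 5.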
Personally, I don’t use Incus clustering. For me, the benefits of having a single view of all containers are outweighed by the various cluster error scenarios which are difficult to recover from; problems during node upgrades are just one example. From my point of view, clustering gives lower overall availability and a higher risk of catastrophic failure.
Tools like “incus copy” and “incus move” (“lxc copy” and “lxc move” in LXD) work between clusters, and hence also between a set of independent machines (which are in effect 1-node standalone clusters).
Also using a front end like LXConsole allows you to see all the individual nodes in one interface without needing them to be clustered.
Given power costs in the UK I’ve been consolidating down my workloads as much as possible so I can turn things off when not in use. My minimum running environment now is one (primarily) storage server, one general purpose server, and one low power quorum/watch node on an N100 box which can restore the two main nodes if required. Then the three other servers (which run a k8s cluster) can be off unless I’m actually using them. With this setup Incus clustering doesn’t make sense - the 24/7 running nodes are too asymmetric in size and function.