Simple clustering question: why is the second node in the cluster not a database node?

I am experimenting with LXD on Ubuntu 20.04 (in VirtualBox VMs). I have tested this with LXD 4.0.3 (which is apparently what comes installed in 20.04) and 4.6 (which is what I get if I do snap remove lxd and snap install lxd). On node1, I do:

sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=lxd-node1]: 
What IP address or DNS name should be used to reach this node? [default=192.168.0.26]: 
Are you joining an existing cluster? (yes/no) [default=no]: 
Setup password authentication on the cluster? (yes/no) [default=yes]: 
Trust password for new clients: 
Again: 
Do you want to configure a new local storage pool? (yes/no) [default=yes]: 
Name of the storage backend to use (dir, lvm, zfs, btrfs) [default=zfs]: dir
Do you want to configure a new remote storage pool? (yes/no) [default=no]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: 
Would you like to create a new Fan overlay network? (yes/no) [default=yes]: 
What subnet should be used as the Fan underlay? [default=auto]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 

On the second node, I do:

lxd-node3:~$ sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=lxd-node3]: 
What IP address or DNS name should be used to reach this node? [default=192.168.0.28]: 
Are you joining an existing cluster? (yes/no) [default=no]: yes
IP address or FQDN of an existing cluster node: 192.168.0.26
Cluster fingerprint: d153628c8839398b9aeb77735ea0bbd86050917e9c924379ff7cf0ca8a09a459
You can validate this fingerprint by running "lxc info" locally on an existing node.
Is this the correct fingerprint? (yes/no) [default=no]: yes
Cluster trust password: 
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "source" property for storage pool "local": 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 

After this, lxc cluster list shows:

+-----------+---------------------------+----------+--------+-------------------+--------------+----------------+
|   NAME    |            URL            | DATABASE | STATE  |      MESSAGE      | ARCHITECTURE | FAILURE DOMAIN |
+-----------+---------------------------+----------+--------+-------------------+--------------+----------------+
| lxd-node1 | https://192.168.0.26:8443 | YES      | ONLINE | fully operational | x86_64       | default        |
+-----------+---------------------------+----------+--------+-------------------+--------------+----------------+
| lxd-node3 | https://192.168.0.28:8443 | NO       | ONLINE | fully operational | x86_64       | default        |
+-----------+---------------------------+----------+--------+-------------------+--------------+----------------+

Why does the second node (lxd-node3 in my case) show NO in the DATABASE column? Should the database not be replicated to it?

My understanding is that, to avoid the risk of a two-node raft cluster (which isn't supported, because an even number of nodes cannot reliably form a quorum and therefore gives you no high availability), LXD avoids starting a 'true' database cluster until there are at least 3 nodes, and from that point on it selects an odd number of nodes to act as database nodes.
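The quorum arithmetic behind this can be sketched as follows. This is an illustrative calculation only (not LXD or dqlite source code): a raft majority is `n // 2 + 1` voters, so two voters tolerate exactly as many failures as one, which is why a second database node on its own adds nothing.

```python
# Illustrative sketch: why a two-voter raft cluster gives no fault tolerance.

def quorum(voters: int) -> int:
    """Minimum number of voters that must agree (a strict majority)."""
    return voters // 2 + 1

def tolerated_failures(voters: int) -> int:
    """How many voters can fail while a quorum can still be formed."""
    return voters - quorum(voters)

for n in (1, 2, 3, 4, 5):
    print(f"{n} voter(s): quorum={quorum(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
# 2 voters tolerate 0 failures (same as 1 voter);
# 3 voters tolerate 1 failure, which is the first useful configuration.
```

So a cluster only gains database redundancy at 3 voters, and adding a 4th voter would not improve on 3, hence the preference for an odd number.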

You can read more about this here:


What @tomp said is accurate. Note that this is a "full" cluster in every respect except the internal dqlite database: for example, if you run lxc list you will see all containers on both nodes.
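For illustration, once a third member joins (here assuming a hypothetical lxd-node2 at 192.168.0.27), all three nodes typically become voting database members, so lxc cluster list would look something like:

```
+-----------+---------------------------+----------+--------+-------------------+
|   NAME    |            URL            | DATABASE | STATE  |      MESSAGE      |
+-----------+---------------------------+----------+--------+-------------------+
| lxd-node1 | https://192.168.0.26:8443 | YES      | ONLINE | fully operational |
| lxd-node2 | https://192.168.0.27:8443 | YES      | ONLINE | fully operational |
| lxd-node3 | https://192.168.0.28:8443 | YES      | ONLINE | fully operational |
+-----------+---------------------------+----------+--------+-------------------+
```

(Node name and IP are made up for the example; the point is only that DATABASE flips to YES once a quorum of three is possible.)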


Thank you both!