Simple clustering question: why is the second node in the cluster not a database node?

I am experimenting with LXD on Ubuntu 20.04 (in VirtualBox VMs). I have tested this with LXD 4.0.3 (which is apparently what ships with 20.04) and 4.6 (which is what I get after snap remove lxd and snap install lxd). On node1, I do:

sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=lxd-node1]: 
What IP address or DNS name should be used to reach this node? [default=192.168.0.26]: 
Are you joining an existing cluster? (yes/no) [default=no]: 
Setup password authentication on the cluster? (yes/no) [default=yes]: 
Trust password for new clients: 
Again: 
Do you want to configure a new local storage pool? (yes/no) [default=yes]: 
Name of the storage backend to use (dir, lvm, zfs, btrfs) [default=zfs]: dir
Do you want to configure a new remote storage pool? (yes/no) [default=no]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: 
Would you like to create a new Fan overlay network? (yes/no) [default=yes]: 
What subnet should be used as the Fan underlay? [default=auto]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
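
As a side note, the same bootstrap can be done non-interactively by feeding a preseed file to lxd init --preseed. A minimal sketch, assuming the answers given above (the trust password is a placeholder, and the Fan subnet is left at its auto default):

config:
  core.https_address: 192.168.0.26:8443
  core.trust_password: <your-password>
networks:
- name: lxdfan0
  type: bridge
  config:
    bridge.mode: fan
storage_pools:
- name: local
  driver: dir
profiles:
- name: default
  devices:
    root:
      path: /
      pool: local
      type: disk
cluster:
  server_name: lxd-node1
  enabled: true

Feed it in with: cat preseed.yaml | sudo lxd init --preseed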

On the second node, I do:

lxd-node3:~$ sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=lxd-node3]: 
What IP address or DNS name should be used to reach this node? [default=192.168.0.28]: 
Are you joining an existing cluster? (yes/no) [default=no]: yes
IP address or FQDN of an existing cluster node: 192.168.0.26
Cluster fingerprint: d153628c8839398b9aeb77735ea0bbd86050917e9c924379ff7cf0ca8a09a459
You can validate this fingerprint by running "lxc info" locally on an existing node.
Is this the correct fingerprint? (yes/no) [default=no]: yes
Cluster trust password: 
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "source" property for storage pool "local": 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
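
Joining can be scripted the same way. A sketch of the join preseed, assuming the session above (the certificate and password are placeholders you must fill in with the real cluster certificate and trust password):

cluster:
  enabled: true
  server_name: lxd-node3
  server_address: 192.168.0.28:8443
  cluster_address: 192.168.0.26:8443
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
    <certificate of 192.168.0.26>
    -----END CERTIFICATE-----
  cluster_password: <trust-password>
  member_config:
  - entity: storage-pool
    name: local
    key: source
    value: ""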

After this, lxc cluster list shows:

+-----------+---------------------------+----------+--------+-------------------+--------------+----------------+
|   NAME    |            URL            | DATABASE | STATE  |      MESSAGE      | ARCHITECTURE | FAILURE DOMAIN |
+-----------+---------------------------+----------+--------+-------------------+--------------+----------------+
| lxd-node1 | https://192.168.0.26:8443 | YES      | ONLINE | fully operational | x86_64       | default        |
+-----------+---------------------------+----------+--------+-------------------+--------------+----------------+
| lxd-node3 | https://192.168.0.28:8443 | NO       | ONLINE | fully operational | x86_64       | default        |
+-----------+---------------------------+----------+--------+-------------------+--------------+----------------+

Why does the second node (lxd-node3 in my case) show NO in the DATABASE column? Should the database not be replicated to it?

My understanding is that, to avoid the risk of a two-node raft cluster (which isn't supported: with an even number of nodes you cannot establish a quorum, so you get no high availability), LXD avoids starting a 'true' cluster until there are >= 3 nodes, and from there selects an odd number of nodes to be database nodes.
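
To see this in practice, join a third node (a hypothetical lxd-node2, set up exactly like lxd-node3 above) and list the members again:

# once the third member has joined, on any member:
lxc cluster list
# the DATABASE column (shown as ROLES in newer releases) should now
# report an odd number of members holding the database role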

You can read more about this in the LXD clustering documentation.

What @tomp said is accurate. Note that the cluster is not "full" only in terms of the internal dqlite database; it is a perfectly "full" cluster in terms of LXD itself (e.g. if you run lxc list you will see all containers on both nodes).
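
For example, instances can be placed on a specific member and listed cluster-wide (the image alias and instance name here are just placeholders):

# launch an instance pinned to a particular cluster member:
lxc launch ubuntu:20.04 c1 --target lxd-node3
# listing from any member shows instances across the whole cluster,
# with a LOCATION column saying where each one runs:
lxc list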

Thank you both!