LXD cluster setup problem: How should I set storage pools for slave nodes?

Park_Kyung_Won · April 13, 2018, 1:31pm

I’ve created a master node for an LXD cluster.

Then I tried to initialize LXD on the slave nodes to add them to the existing cluster.

But when I am prompted with these questions, I don’t know what to enter:

Choose the local disk or dataset for storage pool "local" (empty for loop disk): 
Choose the local disk or dataset for storage pool "remote" (empty for loop disk):

Should I enter a device name ("/dev/sda*") there, or should I give a ZFS volume name?
I’ve tried many times, but every attempt gave me this message:

Error: Failed to update storage pool 'local': node-specific config key source can't be changed

Could anyone explain the concept of “local” and “remote” pools in an LXD cluster in LXD 3.0?

freeekanayaka · April 13, 2018, 1:45pm

Hello,

please can you paste the output of:

lxc storage list
lxc storage show local --target <node name>
lxc storage show remote --target <node name>

You have to run these commands on the master node that you have already created, and <node name> should be the name you gave to the master node.

Cheers

Park_Kyung_Won · April 13, 2018, 1:50pm

Output of “lxc storage list”

+--------+-------------+--------+---------+---------+
|  NAME  | DESCRIPTION | DRIVER |  STATE  | USED BY |
+--------+-------------+--------+---------+---------+
| local  |             | zfs    | CREATED | 2       |
+--------+-------------+--------+---------+---------+
| remote |             | lvm    | CREATED | 0       |
+--------+-------------+--------+---------+---------+

lxc storage show local --target node01

config:
  size: 15GB
  source: /var/snap/lxd/common/lxd/disks/local.img
  zfs.pool_name: local
description: ""
name: local
driver: zfs
used_by:
- /1.0/containers/test2
- /1.0/profiles/default
status: Created
locations:
- node00
- node01

lxc storage show remote --target node01

config:
  lvm.thinpool_name: LXDThinpool
  lvm.vg_name: remote
description: ""
name: remote
driver: lvm
used_by: []
status: Created
locations:
- node00
- node01

These are the outputs of each command.
I did not set up “remote” storage on the master node. Is that OK?

I was wondering about the meanings of “local” and “remote”.
I thought “remote” storage meant adding a storage pool which isn’t on this machine.

freeekanayaka · April 13, 2018, 1:53pm

If you didn’t set up “remote” storage on the master node, it’s weird that you still got a remote storage pool created. Do you still have the console output of the “lxd init” command you ran on the master node? I’d like to see exactly how you answered the various questions.

Park_Kyung_Won · April 13, 2018, 1:55pm

You mean the lxd init YAML output?

This is still a test setup, so I’ll wipe the existing LXD cluster and make a new one.

But could you explain what “remote” storage is for?

And is there a way to uninstall LXD cleanly, or will running “lxd init” again take care of everything?

freeekanayaka · April 13, 2018, 1:59pm

The YAML output would be fine too, if you opted for printing it. I meant the exact log of the questions and answers (so cut and paste from the terminal your “lxd init” session).

The remote storage only makes sense if you have a remote Ceph storage that you want to use (@stgraber can you confirm this? These questions changed a bit after the cobra merge and I’m not entirely sure either).

If you installed LXD via snap, you can just “snap remove lxd” and “snap install lxd” again to start from scratch.

If you used the deb, “apt purge lxd” and “apt install lxd” should do the same.
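
As a rough sketch of the two reinstall paths above, something like the following could be used on a test machine. The `DRY_RUN` switch and the `run` and `reinstall_lxd` helpers are hypothetical names introduced here for illustration; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: wipe and reinstall LXD to start from scratch (test machines only).
# Real runs typically need root (e.g. via sudo).
set -eu

DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Remove and reinstall LXD using the packaging it was installed from.
reinstall_lxd() {
    case "$1" in
        snap) run snap remove lxd; run snap install lxd ;;
        deb)  run apt purge lxd;   run apt install lxd ;;
        *)    echo "usage: reinstall_lxd snap|deb" >&2; return 1 ;;
    esac
}

reinstall_lxd "${1:-snap}"
```

Setting `DRY_RUN=0` makes the script actually execute the commands instead of printing them.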

freeekanayaka · April 13, 2018, 2:01pm

If you start fresh, you most probably don’t want to create a remote storage pool. If you can, use ZFS as “local” (there will be one ZFS dataset on each node). After you have built the cluster you can add more storage pools (e.g. LVM if you wish).

Park_Kyung_Won · April 13, 2018, 2:04pm

This is the output of the new lxd init on the master node:

config:
  core.https_address: 192.168.0.1:8443
  core.trust_password: CENSORED
cluster:
  server_name: node00
  enabled: true
  cluster_address: ""
  cluster_certificate: ""
  cluster_password: ""
networks: []
storage_pools:
- config:
    source: /dev/sda6
  description: ""
  name: local
  driver: zfs
profiles:
- config: {}
  description: ""
  devices:
    root:
      path: /
      pool: local
      type: disk
  name: default

freeekanayaka · April 13, 2018, 2:05pm

Ok, so now when you join a node you should be able to select “/dev/sda6” (or whatever other device the node has) when you get asked about the local storage.

Park_Kyung_Won · April 13, 2018, 2:07pm

So, when I run “lxd init” on a new slave node,

I should type “/dev/sda6” at the prompt

Choose the local disk or dataset for storage pool “local” (empty for loop disk):

?

How about “remote” storage?

I just pressed Enter because I don’t want “remote” storage to be set up, but the output was like above
(“remote” was created anyway)

freeekanayaka · April 13, 2018, 2:11pm

Yes, you should type /dev/sda6 for the local pool question.

Regarding the remote storage pool, it feels like a recently introduced bug, since you answered NO to the lxd init question on the master and it still asks about it on the joining node. @stgraber does it ring a bell?

Park_Kyung_Won · April 13, 2018, 2:13pm

Yes, that’s what is happening.

I answered “NO” to the “remote” storage pool prompt during lxd init on the master node.

But the cluster join on the new slave node asks me about the “remote” storage pool again.

So how should I set up the “remote” pool? Do I specifically need Ceph?

freeekanayaka · April 13, 2018, 2:28pm

As said, my impression is that this is a regression, so perhaps a fix is needed in the source code. Stephane, who knows a bit more about this, is currently on holiday and I’m about to end my day, but we’ll follow up as soon as we can.

To work around the problem, you can try to prepare a preseed.yaml file and use “lxd init --preseed < preseed.yaml” instead of “lxd init”, for both the master and the slaves. That will run in non-interactive mode and skip the questions.

There is some documentation about how to do that here:

github.com/lxc/lxd/blob/master/doc/clustering.md

# Clustering

LXD can be run in clustering mode, where any number of LXD instances
share the same distributed database and can be managed uniformly using
the lxc client or the REST API.

Note that this feature was introduced as part of the API extension 
"clustering".

## Forming a cluster

First you need to choose a bootstrap LXD node. It can be an existing
LXD instance or a brand new one. Then you need to initialize the
bootstrap node and join further nodes to the cluster. This can be done
interactively or with a preseed file.

Note that all further nodes joining the cluster must have identical
configuration to the bootstrap node, in terms of storage pools and
networks. The only configuration that can be node-specific are the
`source` and `size` keys for storage pools and the

This file has been truncated.

You can use the YAML output that you already have as a starting point for the master node (that should work perfectly if you start over and create the master node with that). The YAML for the slave nodes will be similar, but you’ll need to add a few more details like the TLS certificate, master IP and password (see the document I linked).
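
For illustration, a joining node's preseed could be prepared roughly like this. It is only a sketch: the key names are the ones lxd init prints for a joining node, while the addresses, node name, password, disk and the preseed.yaml path are placeholders, and the certificate body must be replaced with the master's real certificate.

```shell
#!/bin/sh
# Sketch: write a preseed file for a node that joins an existing cluster.
# Every value below is a placeholder; substitute your own before using it.
set -eu

PRESEED_FILE="${PRESEED_FILE:-preseed.yaml}"   # hypothetical output path

cat > "$PRESEED_FILE" <<'EOF'
config:
  core.https_address: 192.168.0.2:8443   # this node's own address (placeholder)
cluster:
  server_name: node01
  enabled: true
  cluster_address: 192.168.0.1:8443      # address of the master node
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
    ...paste the master's certificate here...
    -----END CERTIFICATE-----
  cluster_password: CHANGE_ME            # trust password set on the master
networks: []
storage_pools:
- name: local
  driver: zfs
  config:
    source: /dev/sda6                    # node-specific: this node's disk
EOF

echo "Wrote $PRESEED_FILE; review it, then run:"
echo "  lxd init --preseed < $PRESEED_FILE"
```

The script deliberately stops short of calling lxd init, so the file can be checked by hand first.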

Hope that helps.

Park_Kyung_Won · April 13, 2018, 2:29pm

I see

I’ll try with preseed

Thank you for your time !

freeekanayaka · April 13, 2018, 3:25pm

Where did you install LXD from, snap or deb? And if you installed it with snap, which channel did you use? (If you don’t specify a channel, the default is the stable channel.)

I just tried to reproduce your issue with the latest lxd deb in Ubuntu 18.04, with the snap from the stable channel, and with the snap from the edge channel. In all cases I didn’t get the issue of the remote pool being created when answering “no”, and joining nodes worked just fine (I was asked only for the local pool when joining).

I tried to match your config; the YAML output from lxd init on the master node in all three cases was:

config:
  core.https_address: OMITTED
  core.trust_password: OMITTED
cluster:
  server_name: node1
  enabled: true
  cluster_address: ""
  cluster_certificate: ""
  cluster_password: ""
networks: []
storage_pools:
- config:
    source: /dev/vdc
  description: ""
  name: local
  driver: zfs
profiles:
- config: {}
  description: ""
  devices:
    root:
      path: /
      pool: local
      type: disk
  name: default

And when joining a node the YAML output from lxd init was:

config:
  core.https_address: 10.55.60.34:8443
cluster:
  server_name: node2
  enabled: true
  cluster_address: 10.55.60.66:8443
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
    MIIFpzCCA4+gAwIBAgIRAJy2Mthh4UC/gfAXqVLCjYswDQYJKoZIhvcNAQELBQAw
    VjEcMBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzE2MDQGA1UEAwwtcm9vdEBs
    eGQtNGZhOGZkZGMtOGRkOC00NDUzLTk3NzctNDFkNjA3Njk4YzZlMB4XDTE4MDQx
    MzE1MDg1NloXDTI4MDQxMDE1MDg1NlowVjEcMBoGA1UEChMTbGludXhjb250YWlu
    ZXJzLm9yZzE2MDQGA1UEAwwtcm9vdEBseGQtNGZhOGZkZGMtOGRkOC00NDUzLTk3
    NzctNDFkNjA3Njk4YzZlMIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEA
    w9e0loMQL7h0YgF3nRTbmQcmh+m/5O3gY8pMvsfbdoX148mfNpNj42EFv3a6FryT
    CVZ/VRsgy20NNu5MFpF2O5JTGt727h+taYgB7ul3E4CU0Lag280MMn4fvi0+z9Np
    e6Nqm4XdkL4smMTRuID9KcP4jIxfblglJlITg1t8A8tLWHdsXTUploYvgq2ZAunr
    au+LEGOJMNqU8++WwPtlzpaS/XeGi4Wb1L2dmBkzVwQQn96TWHXahFr8973SbtCB
    4WO3r/izqGKpRtW5zOpln2FiwrdLE5pCieSkmfBBgj7pK3nT5zA2C9aUX1PgbmMl
    e7O3NgmFHp24m6E+46+CVVk7uNXJdEQsYyJeIEiL4UEephDwzEiVt2g0wXcifOqV
    r5mFgXKLjbZVfwR+cG9eBo/fjL+ERFvV40NL1jzN/ySLXwjOK43mN4o5kpNhzxrQ
    VhizA5GLbk/bZTxD1oXnRHhDHQVLl2miFDNUjXf7LzVm4OLlTYpOKnJ5cWPpfxsQ
    AANlVlVjIzwB7Eav586WMX9VGd1HxoloGfIojMdWfBDKGLK8dtuZd9UC0kvrDOKK
    zwWVaBKBNLt2GZ5wyBhjsUcJ3oaEUPfbPfpnTMNGe6LV/gADCZ5K18vkRsX57bo3
    DzjGykF2xCSrq5N4wnEqQgZv7K4tSCeMOEVf/8ZvV6cCAwEAAaNwMG4wDgYDVR0P
    AQH/BAQDAgWgMBMGA1UdJQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwOQYD
    VR0RBDIwMIIobHhkLTRmYThmZGRjLThkZDgtNDQ1My05Nzc3LTQxZDYwNzY5OGM2
    ZYcECjc8QjANBgkqhkiG9w0BAQsFAAOCAgEAToDB8pTS1AbpxBgnyvWJJiCyKWeD
    chG4DBGwxeR24aoAHWIr2htO//MNkj3x1mtyLlITZMBo2sm3o7Zn1MHC5bTkujHX
    pGn7+EpRvp8ccmE9E0zSoG22sAqISa1K36euxy6bXbDIgMaQRxZ1x8bJUjsmJWcR
    X5KuaDNaZEEro7I/Q3fiCBhRRsg7W+N8oT73gJxx4MM95OqzbXPs1s/Bp/Ap4aAM
    LgF7OpIzVNzFnSpd7QOnloELWNgNkvOtkfa766hZBFk1KEvs/TSPGPetLY+NOx6F
    zsGb54FheunIiXkAYHNtxsYRLNsMwNC4OHo9lQEgcA/NNlI2VDVxPw8q8Ne0FwAo
    voAAx//LAFgF+BXwVpsS4x6xcrTLzzifF46pHqLqGiNoAqI51aH0qk2cZdXi/CfM
    qdcCkaMh2FVLvFbayUYH/DXh3SJO5zL/CW4+QwHqdWDvg8AgQvCa8++V+Of/hSYw
    CzA2cVgDMQXZXIm7bQurPRv+ExtivEvCgwY0VjKq+kyimxHu6x7JFvq17XYfm/1L
    fH8xd5NNhP/MlV1CiFDX2eUbsbAzkSDJiaVzfNMppd8MeW+THvys2v4VCGxj2m/2
    ddLWiL6kI7DKGRPkqlcRWKZCFmfRWRpI68buQl45NiueZb7Bc49Ux7IIJ8kThnFz
    utoYUHGQbcm2lpk=
    -----END CERTIFICATE-----
  cluster_password: foo
networks: []
storage_pools:
- config:
    source: /dev/vdc
  description: ""
  name: local
  driver: zfs
profiles:
- config: {}
  description: ""
  devices: {}
  name: default

So I retract what I said before: there doesn’t seem to be a regression. Perhaps there’s something specific in your system (some LVM setup?) that triggers this bug.

Park_Kyung_Won · April 13, 2018, 3:32pm

I installed LXD from snap without changing the channel (so it must be stable),

on Ubuntu 17.10, which was upgraded from Ubuntu 16.04.

Park_Kyung_Won · April 13, 2018, 3:35pm

I’ve just tried again with a new install of LXD, and now it’s working well.

This time I installed LXD fresh with snap remove and snap install.

There must have been some leftovers from the previous install.