LXD cluster - launch command creates container on different node - is it possible to disable this by default?

I have an LXD cluster. When I create a new container with lxc launch, it is created automatically on a seemingly random node. I think LXD tries to load balance by default, or uses round robin or some other method.

I have two problems.

  1. My infrastructure is small, and I want to control manually which host each container starts on, e.g. my API server should not share a node with the redis or static-web containers. I can do this with the --target parameter.
  2. But most of the time I will be creating containers with Ansible, and the Ansible plugin does not support --target (at least, to the best of my knowledge).
    That means that in a cluster with 2 nodes, the nodes will take turns hosting each new container, so my Ansible run will fail every second time.

Is it possible to configure LXD cluster to disable this behaviour?

Thanks for your time everyone.

Additional issue https://github.com/ansible/ansible/issues/40479

I don’t think there’s much that can be done on the LXD side. If you want explicit placement, --target is the option to use (or ?target= if you are using the REST API). Without explicit placement I don’t see a way in which LXD can guess the right placement for all possible workflows it might be used for.
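
For API clients, explicit placement means adding the target query parameter to the create request. A minimal Python sketch of building that URL follows; the helper name instances_url and the host and member names are illustrative assumptions, not part of LXD or Ansible.

```python
from urllib.parse import urlencode


def instances_url(base_url, target=None):
    """Build the instance-creation URL, optionally pinned to a cluster member.

    Hypothetical helper: with target=None the URL carries no placement hint
    and LXD picks the member itself.
    """
    url = base_url.rstrip("/") + "/1.0/instances"
    if target:
        url += "?" + urlencode({"target": target})
    return url


print(instances_url("https://host-01:8443", target="host-02"))
```

A client that always passes its preferred member as target gets deterministic placement without any change on the LXD side.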

Instead of guessing, just allow that behaviour to be disabled, so that the node receiving the request always creates the container on itself.

Okay that might be a reasonable knob to have I guess. @stgraber thoughts? That’d be basically a way to change the default placement algorithm.

Thanks freekanayaka, it would be nice to have that feature, not that it must or should be present.

After having a long conversation with someone on IRC, I think it would be more reasonable for the lxd module in Ansible to support the --target parameter. LXD has it and Ansible is missing it.

If you guys do decide to do this, would you consider adding “reaching out” to URLs for “roll your own”?

This is related to the subject of this thread. I am trying to figure out how the LXD API behaves when creating instances with targets.

Here is my JSON file (names edited), data.json:

{
 "name": "test-4",
 "source": {"type": "image",
          "alias": "my-template"}
}

curl command:
curl -XPOST -k --cert x.crt --key x.key "https://host-01:8443/1.0/instances?target=host-01" --data @data.json | jq
The instance is created on the first host; here is the JSON result:

{
  "type": "async",
  "status": "Operation created",
  "status_code": 100,
  "operation": "/1.0/operations/84b22a5d-1c6e-48be-af95-d8e70ac31d8a",
  "error_code": 0,
  "error": "",
  "metadata": {
    "id": "84b22a5d-1c6e-48be-af95-d8e70ac31d8a",
    "class": "task",
    "description": "Creating container", 
    "created_at": "2020-07-13T21:19:10.866729854+01:00",
    "updated_at": "2020-07-13T21:19:10.866729854+01:00", 
    "status": "Running",
    "status_code": 103,
    "resources": {
      "containers": [
        "/1.0/containers/test-4"
      ],
      "instances": [
        "/1.0/instances/test-4"
      ]
    },
    "metadata": null,
    "may_cancel": false,
    "err": "",
    "location": "host-01"
  }
}

lxc ls:

$ lxc ls
+-----------+---------+------------------------+------+-----------+-----------+----------+
|   NAME    |  STATE  |          IPV4          | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+-----------+---------+------------------------+------+-----------+-----------+----------+
| test-4    | STOPPED |                        |      | CONTAINER | 0         | host-01  |
+-----------+---------+------------------------+------+-----------+-----------+----------+

Now I will repeat the command and specify a different target. Normally, if I do this with lxc, it will not allow me, regardless of target. With curl it is different, and I do not understand what is happening. If the LXD cluster tried to move the container to another host, that did not happen. And if Ansible tries to do the same through the LXD API and achieves nothing, it will confuse the user. So, what am I doing wrong or misunderstanding?

First, repeating the same command:
curl -XPOST -k --cert x.crt --key x.key "https://host-01:8443/1.0/instances?target=host-01" --data @data.json | jq

{
  "type": "async", 
  "status": "Operation created",
  "status_code": 100,
  "operation": "/1.0/operations/76638686-bd4f-45b3-93ba-33e8c1737df5",
  "error_code": 0,
  "error": "",
  "metadata": {
    "id": "76638686-bd4f-45b3-93ba-33e8c1737df5",
    "class": "task",
    "description": "Creating container",
    "created_at": "2020-07-13T21:19:19.911966704+01:00",
    "updated_at": "2020-07-13T21:19:19.911966704+01:00",
    "status": "Running",
    "status_code": 103,
    "resources": {
      "containers": [
        "/1.0/containers/test-4"
      ],
      "instances": [
        "/1.0/instances/test-4"
      ]
    },
    "metadata": null,
    "may_cancel": false,
    "err": "",
    "location": "host-01"
  }
}

Repeating the same command, but with a different target:
curl -XPOST -k --cert x.crt --key x.key "https://host-01:8443/1.0/instances?target=host-02" --data @data.json | jq

{
  "type": "async",
  "status": "Operation created",
  "status_code": 100,
  "operation": "/1.0/operations/07b7b322-3b84-4086-85bd-1b823f1d9742?project=default",
  "error_code": 0,
  "error": "",
  "metadata": {
    "id": "07b7b322-3b84-4086-85bd-1b823f1d9742",
    "class": "task",
    "description": "Creating container",
    "created_at": "2020-07-13T21:19:31.264413533+01:00",
    "updated_at": "2020-07-13T21:19:31.264413533+01:00",
    "status": "Running",
    "status_code": 103,
    "resources": {
      "containers": [
        "/1.0/containers/test-4"
      ],
      "instances": [
        "/1.0/instances/test-4"
      ]
    },
    "metadata": null,
    "may_cancel": false,
    "err": "",
    "location": "host-02"
  }
}

However, lxc ls disagrees with this:
$ lxc ls

+-----------+---------+------------------------+------+-----------+-----------+----------+
|   NAME    |  STATE  |          IPV4          | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+-----------+---------+------------------------+------+-----------+-----------+----------+
| test-4    | STOPPED |                        |      | CONTAINER | 0         | host-01  |
+-----------+---------+------------------------+------+-----------+-----------+----------+

What I am trying to test is the case where an Ansible user runs the same script again, so that the same API call is executed. But the API does not behave the same as the lxc command would. Should the Ansible developer perform additional checks before calling the API, to see whether the instance already exists, and if it does, decide whether to do an update instead of a POST?

Let’s say there is an Ansible scenario described by an Ansible user:
Instance A is to be deployed on nodeA.
The Ansible developer would then decide on these rules:

  • check whether a container with this name already exists
  • if not, create the container: send the POST API call
  • if the container already exists on the requested node, do nothing
  • if the container already exists, but not on the requested node, move it to the requested node. Should the POST above have worked for this? If not, then what?
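
The rules above can be sketched as a small decision function. This is an illustration only, not the Ansible module's actual code; plan_action and its return values are made-up names, and it assumes the caller has already looked up where the instance currently lives (the LOCATION column of lxc ls).

```python
def plan_action(existing_location, requested_target):
    """Decide what to do for one instance (hypothetical helper).

    existing_location: cluster member currently holding the instance,
                       or None if no instance with that name exists.
    requested_target:  member requested by the user, or None for "anywhere".
    """
    if existing_location is None:
        # No instance with this name yet: send the POST.
        return "create"
    if requested_target is None or existing_location == requested_target:
        # Already present and placement is acceptable.
        return "noop"
    # Present, but on a different member than requested.
    return "move"


print(plan_action("host-01", "host-02"))
```

On the command line, relocating an existing instance is done with lxc move <name> --target <member>, so the "move" branch would need the API equivalent of that rather than a second create request.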

Just FYI, my storage type is directory.

Thanks

The reason is that POST /1.0/instances is asynchronous. You receive back the ID of an operation that tracks the progress of the container creation, and to know whether the creation succeeded or failed you need to wait for that operation to complete. You can do that with something like:

op=$(curl -s -XPOST -k --cert x.crt --key x.key "https://host-01:8443/1.0/instances?target=host-01" --data @data.json | jq -r .operation)
curl -s -k --cert x.crt --key x.key "https://host-01:8443${op}/wait"

And that should give you back something like this:

{
  "type": "sync",
  "status": "Success",
  "status_code": 200,
  "operation": "",
  "error_code": 0,
  "error": "",
  "metadata": {
    "id": "98a96cc2-d48f-4d17-bec9-7b637f721ff1",
    "class": "task",
    "description": "Creating container",
    "created_at": "2020-07-14T08:29:46.211091569Z",
    "updated_at": "2020-07-14T08:29:46.211091569Z",
    "status": "Failure",
    "status_code": 400,
    "resources": {
      "containers": [
        "/1.0/containers/test-4"
      ],
      "instances": [
        "/1.0/instances/test-4"
      ]
    },
    "metadata": null,
    "may_cancel": false,
    "err": "Create instance: Add instance info to the database: This instance already exists",
    "location": "host-01"
  }
}

Note the err field in the above payload. That’s basically what the lxc CLI does as well.
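
For a client written in Python (as an Ansible module would be), the same two steps could look roughly like the sketch below. The function names are hypothetical. It assumes the operation value may carry a query string such as ?project=default, as seen in one of the responses earlier in this thread, so /wait is appended to the path part only.

```python
from urllib.parse import urlsplit, urlunsplit


def wait_url(base_url, operation):
    """Return the URL that blocks until the given operation finishes.

    The operation path may carry a query string (for example
    "?project=default"), so "/wait" is appended to the path part only.
    """
    parts = urlsplit(base_url.rstrip("/") + operation)
    return urlunsplit(parts._replace(path=parts.path + "/wait"))


def operation_error(wait_response):
    """Return the error message of a finished operation, or None on success."""
    meta = wait_response.get("metadata") or {}
    if meta.get("status") == "Failure" or meta.get("err"):
        return meta.get("err") or "operation failed"
    return None
```

Applied to the payload above, operation_error would return the "This instance already exists" message, even though the outer status is "Success"; the outer status only reports that the wait call itself worked.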

Just an update: there is a pull request on GitHub which will introduce the --target feature to the Ansible lxd_cluster module. That is where I needed it.
The Ansible community will evaluate those code changes, but I am afraid that will not include anyone actually testing them on their own cluster.
I tested it myself and it works for my case; however, I did not perform any rigorous tests, so I guess we will find out in the future if anyone has problems.