Error 404 for "/1.0/operations/<uuid>/wait" REST call with Ansible

adosztal · July 27, 2018, 10:10am

I tried creating a container with Ansible’s lxd_container module from a remote host but the task failed. I saw in the logs that the Ansible module tried to call /1.0/operations/<uuid>/wait but received a 404:

{
    "request": {
        "json": null, 
        "method": "GET", 
        "timeout": null, 
        "url": "/1.0/operations/8346c91a-ba85-4ed2-a50a-38a9864f3296/wait"
    }, 
    "response": {
        "json": {
            "error": "not found", 
            "error_code": 404, 
            "type": "error"
        }
    }, 
    "type": "sent request"
}

The container was created but it didn’t start, even though I set its state to “started” in the playbook. Another strange thing: running the same playbook a second time returns the same error but still starts the container. If I wait until the container has booted and run the playbook again, the task runs without errors (and, of course, changes nothing).

Is this endpoint missing from the version I use (LXD 3.0.1 on Ubuntu Bionic), or is it a problem with the Ansible module?

This is the playbook I was running; it’s pretty simple:

---
- hosts: lxd1
  connection: ssh
  tasks:
  - name: Create new container
    lxd_container:
      name: test2
      state: started
      wait_for_ipv4_addresses: true
      source:
        type: image
        fingerprint: 38219778c2cf
      profiles: ["test"]
      devices:
        eth1:
          name: eth1
          nictype: macvlan
          parent: ens4
          type: nic
        eth2:
          name: eth2
          nictype: macvlan
          parent: ens5
          type: nic
          vlan: "2"

Note: I tried “wait_for_ipv4_addresses” with both true and false values.

adosztal · July 27, 2018, 12:29pm

This is a 3-node cluster plus a separate Linux host running Ansible. I did some further checks and noticed the following:

When I ran the playbook on the same node where the container was created, there were no issues: the container was created and started.
I had the same issue when I ran the playbook on any other cluster node.

Output of the “wait” request of the successful run:

    {
        "request": {
            "json": null, 
            "method": "GET", 
            "timeout": null, 
            "url": "/1.0/operations/8eff6bec-bbf4-42b5-8207-96828566ea36/wait"
        }, 
        "response": {
            "json": {
                "error": "", 
                "error_code": 0, 
                "metadata": {
                    "class": "task", 
                    "created_at": "2018-07-27T12:24:37.140389624Z", 
                    "description": "Starting container", 
                    "err": "", 
                    "id": "8eff6bec-bbf4-42b5-8207-96828566ea36", 
                    "may_cancel": false, 
                    "metadata": null, 
                    "resources": {
                        "containers": [
                            "/1.0/containers/vpn2"
                        ]
                    }, 
                    "status": "Success", 
                    "status_code": 200, 
                    "updated_at": "2018-07-27T12:24:37.140389624Z"
                }, 
                "operation": "", 
                "status": "Success", 
                "status_code": 200, 
                "type": "sync"
            }
        }, 
        "type": "sent request"
    }, 
[...]
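
For anyone comparing the two log entries: the failed one has "type": "error" with an "error_code", while the successful one has "type": "sync" with a "status_code". Here is a rough sketch for picking the failed requests out of the module's log, assuming the entries have already been loaded as Python dicts shaped like the ones quoted in this thread. The function name failed_requests is made up for illustration, and the successful entry is trimmed to the fields the check needs.

```python
def failed_requests(entries):
    """Return (url, error_code, error) for each logged entry whose response is an error."""
    failures = []
    for entry in entries:
        # "json" may be null in the log, so fall back to an empty dict.
        body = entry.get("response", {}).get("json") or {}
        if body.get("type") == "error":
            failures.append(
                (entry["request"]["url"], body.get("error_code"), body.get("error"))
            )
    return failures


# Two entries copied (and trimmed) from the logs quoted in this thread.
entries = [
    {
        "request": {"method": "GET", "url": "/1.0/operations/8346c91a-ba85-4ed2-a50a-38a9864f3296/wait"},
        "response": {"json": {"error": "not found", "error_code": 404, "type": "error"}},
        "type": "sent request",
    },
    {
        "request": {"method": "GET", "url": "/1.0/operations/8eff6bec-bbf4-42b5-8207-96828566ea36/wait"},
        "response": {"json": {"error": "", "error_code": 0, "status": "Success", "status_code": 200, "type": "sync"}},
        "type": "sent request",
    },
]

for url, code, message in failed_requests(entries):
    print(code, message, url)
```

Only the first entry is reported; the successful wait is skipped because its response type is "sync".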

stgraber · July 27, 2018, 2:01pm

There’s a bug in /1.0/operations/UUID/wait on clustered setups, which we fixed in LXD 3.3; LXD 3.0.2 will get the fix backported.

Basically the bug is that only operations local to the machine you sent the request to would be shown, instead of showing all operations across the entire cluster.
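
To make the symptom concrete, here is a toy model of the behaviour described above. It is only a sketch and not LXD's code: the Node class, its methods, and the member names lxd2 and lxd3 are hypothetical, and the loop over members is just one simple way to illustrate "all operations across the entire cluster", not a claim about how the fix is implemented.

```python
# Toy model only: Node and its methods are hypothetical, not LXD's code.
NOT_FOUND = {"type": "error", "error_code": 404, "error": "not found"}


class Node:
    def __init__(self, name):
        self.name = name
        self.operations = {}   # operation UUID -> status, for operations running on this node
        self.cluster = [self]  # all cluster members, including this node

    def wait_local_only(self, uuid):
        """Behaviour described for the bug: only this node's operations are visible."""
        if uuid in self.operations:
            return {"type": "sync", "status_code": 200, "status": self.operations[uuid]}
        return dict(NOT_FOUND)

    def wait_cluster_wide(self, uuid):
        """Behaviour described for the fix: operations on any member are visible."""
        for member in self.cluster:
            if uuid in member.operations:
                return {"type": "sync", "status_code": 200, "status": member.operations[uuid]}
        return dict(NOT_FOUND)


def make_cluster(names):
    """Create nodes that all share the same member list."""
    nodes = [Node(name) for name in names]
    for node in nodes:
        node.cluster = nodes
    return nodes


lxd1, lxd2, lxd3 = make_cluster(["lxd1", "lxd2", "lxd3"])

# Assume the container, and therefore its start operation, lives on lxd2.
uuid = "8346c91a-ba85-4ed2-a50a-38a9864f3296"
lxd2.operations[uuid] = "Success"

print(lxd1.wait_local_only(uuid))    # request sent to a different member
print(lxd2.wait_local_only(uuid))    # request sent to the member that owns the operation
print(lxd1.wait_cluster_wide(uuid))  # same request as the first one, cluster-wide lookup
```

In this model the first call returns the 404 "not found" error and the other two return status code 200, which matches what was reported: the wait call only succeeds when it is sent to the node where the container was created.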

adosztal · July 27, 2018, 5:45pm

Thank you, I’ll wait for the update then.

adosztal · November 6, 2018, 7:02pm

I can confirm this is fixed.