Lxc move error websocket: bad handshake

I have a 5-node cluster with ZFS storage. After updating to LXD 4.20, I can’t move any container from any node to one specific node; let’s call it “atl1”. Containers that already exist on this node, as well as newly created ones, can be moved out without any issues.

Container “test1” is stopped. When I issue “lxc move test1 --target atl1” I get this error:
“Error: Copy instance operation failed: Failed instance creation: Error transferring instance data: websocket: bad handshake”

It doesn’t matter whether I issue this from a remote or from a node that is participating in the transfer.

Can you move containers to other members of the cluster OK?

When I wrote my first post I was able to move containers between the other nodes, but now I have problems there too. Between some nodes I can move containers both ways, between others only one way, and I can’t spot any pattern. Additionally, “lxc cluster list” reports that all nodes are fully operational.

But I can still create new containers on all nodes without problems.

Can you show the output of lxc cluster ls?

+-----------+------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
|   NAME    |             URL              |      ROLES       | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+-----------+------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| atl1      | https://172.31.255.10:8443   | database         | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| kriszos11 | https://192.168.111.181:8443 | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| naslxd    | https://10.0.1.12:8443       | database         | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| nazwa1    | https://172.31.255.6:8443    | database         | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| pc        | https://10.0.1.104:8443      | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+

Right, so your cluster members are on different subnets, possibly with router(s) and firewall(s) between them?

Have you ensured that all members can communicate bi-directionally with each other (rather than just the leader sending heartbeats to each member)?

Also check that there are no firewalls that are potentially altering the TLS negotiation for connections too.

It was never a problem that my nodes were on different subnets. I checked, and I’m able to establish a TCP connection to port 8443 from every node to every other node. The firewalls between them are not sophisticated enough to do TLS inspection.
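
For anyone who wants to repeat that port check from each node, here is a minimal sketch. The addresses and port are the ones from the cluster list above; the function names check_port and report_members are made up for illustration. It only proves that a TCP connection opens, not that the TLS or websocket steps that follow will succeed.

```shell
# Cluster member addresses, taken from the `lxc cluster ls` output above.
MEMBERS="172.31.255.10 192.168.111.181 10.0.1.12 172.31.255.6 10.0.1.104"
PORT=8443

# check_port HOST PORT: succeeds if a TCP connection opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
check_port() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# report_members: print one "reachable" / "UNREACHABLE" line per member.
report_members() {
    for host in $MEMBERS; do
        if check_port "$host" "$PORT"; then
            echo "$host:$PORT reachable"
        else
            echo "$host:$PORT UNREACHABLE"
        fi
    done
}
```

Running report_members on every node in turn gives the full node-to-node matrix, which makes a one-way failure easier to spot.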

To check further, I added all nodes as remotes on every node, like this:

lxc remote add --accept-certificate atl1 172.31.255.10 --password <trust_password>
lxc remote add --accept-certificate nazwa1 172.31.255.6 --password <trust_password>
lxc remote add --accept-certificate naslxd 10.0.1.12 --password <trust_password>
lxc remote add --accept-certificate pc 10.0.1.104 --password <trust_password>
lxc remote add --accept-certificate kriszos11 192.168.111.181 --password <trust_password>

Interestingly, only when adding the “atl1” remote do I get this notification:

Client certificate now trusted by server: atl1

The “atl1” node is the only one that I joined to the cluster in August or September this year, using a token from the “lxc cluster add” command. I don’t know if that is relevant.
Regardless, I was able to successfully execute LXD commands on every node, like below:

lxc cluster list atl1:
lxc cluster list nazwa1:
lxc cluster list naslxd:
lxc cluster list pc:
lxc cluster list kriszos11:
lxc config show atl1:
lxc config show nazwa1:
lxc config show naslxd:
lxc config show pc:
lxc config show kriszos11:

Also interesting: I am now back in the same situation as in my first post. I can’t move any container from any node to the one specific node, “atl1”. Containers that already exist on this node, as well as newly created ones, can be moved out without any issues.

Maybe I should move all my containers off this node, remove it from the cluster, reinstall the lxd snap and rejoin the cluster?

Can you run lxc config trust ls?

Same issue here with lxc copy, with no changes on my side other than snap updates.
Removing and re-adding the remote host did not help. This is not a cluster: standalone nodes with remote trust.

The command:

lxc copy CT host-2:CT-COPY

Source host IP: 192.168.1.1
Destination host IP: 192.168.1.2

The error:

Error: Failed instance creation:

  • https://publicip:8443: Error transferring instance data: Unable to connect to: publicip:8443
  • https://[ipv6]:8443: Error transferring instance data: Unable to connect to: [ipv6]:8443
  • https://192.168.1.1:8443: Error transferring instance data: websocket: bad handshake

snap list

Name    Version   Rev    Tracking       Publisher   Notes
core18  20211028  2253   latest/stable  canonical✓  base
core20  20211115  1242   latest/stable  canonical✓  base
lxd     4.20      21902  latest/stable  canonical✓  -
snapd   2.53.2    14066  latest/stable  canonical✓  snapd

lxc remote list
+-----------------+------------------------------------------+---------------+-------------+--------+--------+--------+
|      NAME       |                   URL                    |   PROTOCOL    |  AUTH TYPE  | PUBLIC | STATIC | GLOBAL |
+-----------------+------------------------------------------+---------------+-------------+--------+--------+--------+
| host-02         | https://192.168.1.2:8443                 | lxd           | tls         | NO     | NO     | NO     |
+-----------------+------------------------------------------+---------------+-------------+--------+--------+--------+
| images          | https://images.linuxcontainers.org       | simplestreams | none        | YES    | NO     | NO     |
+-----------------+------------------------------------------+---------------+-------------+--------+--------+--------+
| local (current) | unix://                                  | lxd           | file access | NO     | YES    | NO     |
+-----------------+------------------------------------------+---------------+-------------+--------+--------+--------+
| ubuntu          | https://cloud-images.ubuntu.com/releases | simplestreams | none        | YES    | YES    | NO     |
+-----------------+------------------------------------------+---------------+-------------+--------+--------+--------+
| ubuntu-daily    | https://cloud-images.ubuntu.com/daily    | simplestreams | none        | YES    | YES    | NO     |
+-----------------+------------------------------------------+---------------+-------------+--------+--------+--------+

Nothing special in lxc config trust ls: it shows the old and the new key, with expiry dates in 2030 and 2031.

Please check the changelog for anything related to this issue. I can’t move or copy my containers now.

lxc remote remove host-2
lxc remote add host-2 192.168.1.2
Certificate fingerprint: fingerprint
ok (y/n/[fingerprint])? y

+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
|  TYPE  |                                     NAME                                     |               COMMON NAME               | FINGERPRINT  |          ISSUE DATE           |          EXPIRY DATE          |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | 10.0.1.12                                                                    | root@naslxd                             | ba590c32faf3 | Dec 2, 2021 at 4:58pm (UTC)   | Nov 30, 2031 at 4:58pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | 10.0.1.104                                                                   | kriszos@pc                              | 83a22c493584 | Nov 26, 2020 at 7:21pm (UTC)  | Nov 24, 2030 at 7:21pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | 172.20.20.34                                                                 | root@nazwa1                             | 4fed755b499b | Dec 2, 2021 at 4:53pm (UTC)   | Nov 30, 2031 at 4:53pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | 172.20.20.81                                                                 | root@nazwa1                             | 72c4e9d6b232 | Nov 12, 2020 at 1:35pm (UTC)  | Nov 10, 2030 at 1:35pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | 172.20.20.81                                                                 | root@nazwa1.kriszos.pl                  | bf514843bf20 | Aug 2, 2020 at 11:59pm (UTC)  | Jul 31, 2030 at 11:59pm (UTC) |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | 172.31.255.10                                                                | root@atl1                               | 3472858514b8 | Dec 2, 2021 at 4:57pm (UTC)   | Nov 30, 2031 at 4:57pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | 192.168.111.181                                                              | root@kriszos11                          | 060cd5fb9f5e | Dec 2, 2021 at 4:57pm (UTC)   | Nov 30, 2031 at 4:57pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.3d5a401ec032a47c5884478420ba20cbe877b63a8cf343f24c3f1d9592e081f8 | root@nazwa1                             | 3d5a401ec032 | Nov 8, 2020 at 3:56pm (UTC)   | Nov 6, 2030 at 3:56pm (UTC)   |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.5aa4cfb5baa02ea83ac61c171fab60aef588c4139884dfe3e150dd753f6f77ea | root@nazwa1                             | 5aa4cfb5baa0 | Nov 8, 2020 at 12:17am (UTC)  | Nov 6, 2030 at 12:17am (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.8cedb938ddd3799f70a318eae19dee9d6e2db5352d9434f0617b9aa56c4f25b9 | root@CT11                               | 8cedb938ddd3 | Nov 11, 2020 at 5:34pm (UTC)  | Nov 9, 2030 at 5:34pm (UTC)   |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.97fd1dff295cc9202c5d36339d49296f285107e429f897d8c1c9f682bd85a68b | root@KRISZOS11                          | 97fd1dff295c | Nov 11, 2020 at 10:36pm (UTC) | Nov 9, 2030 at 10:36pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.7193c3e42d3025e62291e3cb38d6ce8ab84654134afa1196365bbd63d3ae1bfc | root@ovh1                               | 7193c3e42d30 | Oct 16, 2020 at 11:33pm (UTC) | Oct 14, 2030 at 11:33pm (UTC) |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.32059ae454dcaddb850950869edc55a75247b97f182e7da24970a3cd489439c5 | root@ovh1                               | 32059ae454dc | Oct 18, 2020 at 3:19pm (UTC)  | Oct 16, 2030 at 3:19pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.a76005e6c86f14ab513792fb853cd495e7943352610db7f1a1cb6cafbdeb9485 | root@nazwa1                             | a76005e6c86f | Nov 8, 2020 at 4:38pm (UTC)   | Nov 6, 2030 at 4:38pm (UTC)   |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.b4bbcf8ada9bcbbf8faeed8f45b413da5568eae6c99fd8e88d5a20643ea71fe0 | root@nazwa1                             | b4bbcf8ada9b | Feb 13, 2021 at 8:55pm (UTC)  | Feb 11, 2031 at 8:55pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.cfc2eb3ee4b31d70588f46953f040bf64f1a32a25dc8a1e488773966f2f7faaa | root@nazwa1                             | cfc2eb3ee4b3 | Nov 8, 2020 at 12:43am (UTC)  | Nov 6, 2030 at 12:43am (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| client | lxd.cluster.fed6f814e01e3933202655a90c80f1d572c1612c6e9b7b173c955a22ae77f64e | root@nazwa1                             | fed6f814e01e | Nov 8, 2020 at 1:05am (UTC)   | Nov 6, 2030 at 1:05am (UTC)   |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| server | atl1                                                                         | root@atl1                               | 8ea259e54402 | Aug 17, 2021 at 10:21pm (UTC) | Aug 15, 2031 at 10:21pm (UTC) |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| server | kriszos11                                                                    | root@KRISZOS11                          | 51867ccd19ca | Oct 5, 2021 at 5:47pm (UTC)   | Oct 3, 2031 at 5:47pm (UTC)   |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| server | naslxd                                                                       | root@naslxd                             | 1edd24502197 | Aug 20, 2020 at 7:29pm (UTC)  | Aug 18, 2030 at 7:29pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| server | nazwa1                                                                       | root@nazwa1                             | 1ea5d1d59ecb | Jul 29, 2021 at 2:31pm (UTC)  | Jul 27, 2031 at 2:31pm (UTC)  |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+
| server | pc                                                                           | root@pc                                 | 70ce4eb54aef | Nov 11, 2020 at 6:51pm (UTC)  | Nov 9, 2030 at 6:51pm (UTC)   |
+--------+------------------------------------------------------------------------------+-----------------------------------------+--------------+-------------------------------+-------------------------------+

@TomvB @kriszos which version of LXD were you upgrading from and to when this started?

Can you also confirm that the system time is correct on all servers?

The last change to authenticate was:

But this was in LXD 4.19.

Any ideas @stgraber ?

Time is correct on all servers. The upgrade was done via snap, from 4.19.

OK, can you enable debug on the servers affected using:

sudo snap set lxd daemon.debug=true; sudo systemctl reload snap.lxd.daemon

And then capture the output when running the affected command (on both source and target servers) from: /var/snap/lxd/common/lxd/logs/lxd.log

As I have no idea what could be causing this.
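
To keep the capture small, one option is to print only the log lines written while the failing command runs. This is just a sketch: capture_new_lines is a hypothetical helper name, the log path is the one given above, and it has to be run separately on the source and on the target server since each has its own log.

```shell
# Path of the LXD daemon log inside the snap, as given above.
LXD_LOG="${LXD_LOG:-/var/snap/lxd/common/lxd/logs/lxd.log}"

# capture_new_lines FILE CMD...: run CMD, then print only the lines that
# were appended to FILE while it ran. CMD's own output goes to stderr so
# that stdout contains nothing but the new log lines.
capture_new_lines() {
    file=$1
    shift
    before=$(wc -l < "$file")          # line count before the command
    "$@" >&2 || true                   # the command is expected to fail
    tail -n "+$((before + 1))" "$file" # everything written since then
}
```

For example, on the source server: capture_new_lines "$LXD_LOG" lxc move test1 --target atl1 > move-debug.log. On the other server, which does not run the command itself, something like tail -f on the same log file while the move is attempted gives the matching half.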


Not sure, using auto updates. Time is correct on both servers.

@TomvB @kriszos can you advise whether any of the other --mode options for lxc copy work? It defaults to “pull”, so try “push” or “relay”.
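
A quick way to go through the modes is a small loop like the one below. It is only a sketch: try_copy_modes is a made-up helper name, and it assumes that a failed attempt does not leave a half-created instance behind on the target.

```shell
# try_copy_modes SRC DEST: attempt `lxc copy` with each transfer mode in
# turn and report which ones fail. Stops at the first mode that works.
try_copy_modes() {
    for mode in pull push relay; do
        if lxc copy "$1" "$2" --mode="$mode"; then
            echo "mode=$mode: OK"
            return 0
        fi
        echo "mode=$mode: failed"
    done
    return 1
}
```

With the names from the earlier post this would be called as try_copy_modes CT host-2:CT-COPY. Knowing which modes fail shows which side has trouble opening the connection.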

@stgraber and I were wondering if this might be a network issue (perhaps MTU) that is interfering with the websocket upgrade, as the errors suggest that TLS certificate negotiation has succeeded, but that it’s failing after that, during the websocket upgrade.
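
One rough way to test the MTU theory is to ping the other server with fragmentation forbidden and a payload sized to exactly fill the expected MTU. The sketch below assumes IPv4, the Linux iputils ping (for the -M do option) and an expected MTU of 1500; icmp_payload and probe_mtu are hypothetical helper names.

```shell
# icmp_payload MTU: largest ICMP echo payload that fits in a single IPv4
# packet of the given MTU (20 bytes IPv4 header + 8 bytes ICMP header).
icmp_payload() {
    echo $(( $1 - 28 ))
}

# probe_mtu HOST MTU: send 3 pings with the "don't fragment" bit set.
# Success means packets of MTU bytes make it to HOST unfragmented.
probe_mtu() {
    ping -c 3 -M do -s "$(icmp_payload "$2")" "$1"
}
```

For example, probe_mtu 192.168.1.2 1500 from the source host. If that fails while a smaller value such as 1400 works, something on the path has a lower MTU than the interfaces assume.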

Also, can you confirm whether you are using the same version of the client and server (i.e. 4.20)?
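
The versions can be compared with lxc version, which reports both sides. The sketch below assumes its output consists of a “Client version:” line and a “Server version:” line; versions_match is a made-up helper name.

```shell
# versions_match: read `lxc version` output on stdin and succeed only if
# a client version is present and equals the server version.
versions_match() {
    awk -F': ' '
        # Appending "" forces string comparison, so 4.2 and 4.20 differ.
        /^Client version/ { c = $2 "" }
        /^Server version/ { s = $2 "" }
        END { exit (c != "" && c == s) ? 0 : 1 }
    '
}
```

Usage would be along the lines of lxc version | versions_match && echo same for the local server, and lxc version host-2: | versions_match for a remote. In both cases the client version is that of the local lxc binary.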