Too many connections to LXD on port 8443 (LXD 3.0.1, cluster mode)


(Druggo Yang) #1

This is a four-node cluster running LXD 3.0.1 on Ubuntu 16.04, using a fan bridge.

As time goes by, all lxc commands seem to hang,
and I notice that every node has several hundred connections to each of the others:

root@10.2.1.24:/# ss -nt|grep 8443|awk '{print $5}'|sort |uniq -c|sort -n|tail
      1 10.2.1.83:51218
      1 10.2.1.83:54570
      1 10.2.1.83:54676
      1 10.2.1.83:58188
      1 10.2.1.83:58248
      1 10.2.1.83:59074
      1 10.2.1.83:60648
    265 10.2.1.202:8443
    488 10.2.1.235:8443
    506 10.2.1.83:8443
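
The output above counts each address:port pair separately, so inbound connections from ephemeral ports show up as many rows of 1. A minimal sketch that groups by peer IP instead, assuming the default `ss -nt` column layout (local address in column 4, peer address in column 5); `count_peers` is a hypothetical helper name, not part of LXD or iproute2:

```shell
# Count TCP connections that involve port 8443, grouped by peer IP.
# Reads `ss -nt` output on stdin (assumed layout: $4 = local, $5 = peer).
count_peers() {
    awk '$4 ~ /:8443$/ || $5 ~ /:8443$/ {
             peer = $5
             sub(/:[0-9]+$/, "", peer)   # drop the port, keep the IP
             count[peer]++
         }
         END { for (p in count) print count[p], p }' | sort -n
}

# Usage (on a cluster node):
#   ss -nt | count_peers
```

This counts connections in every state that `ss -nt` prints, so a pile-up of half-closed sockets would be included in the totals.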

Any idea what could cause this? Thanks.


(Stéphane Graber) #2

@freeekanayaka submitted a potential fix for this last week, which is being rolled out to LXD 3.6 snap users. If that works, we’ll put it in the queue for LXD 3.0.3.


(Druggo Yang) #3

Since I posted this, the lxc command has been hanging indefinitely and I can’t maintain the LXD cluster.

Today I needed to change something, so I restarted one LXD node. It failed with the log below:

11月 06 15:59:27 dxc4 lxd[1277]: lvl=warn msg="Failed to update heartbeat: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:27+0800
11月 06 15:59:27 dxc4 lxd[1277]: lvl=warn msg="Failed to get current cluster nodes: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:27+0800
11月 06 15:59:27 dxc4 lxd[1277]: err="failed to begin transaction: sql: database is closed" lvl=eror msg="Unable to fetch cluster configuration" t=2018-11-06T15:59:27+0800
11月 06 15:59:27 dxc4 lxd[1277]: err="failed to begin transaction: sql: database is closed" lvl=eror msg="Unable to fetch cluster configuration" t=2018-11-06T15:59:27+0800
11月 06 15:59:27 dxc4 lxd[1277]: 2018/11/06 15:59:27 http: multiple response.WriteHeader calls
11月 06 15:59:27 dxc4 lxd[1277]: 2018/11/06 15:59:27 http: multiple response.WriteHeader calls
11月 06 15:59:27 dxc4 lxd[1277]: 2018/11/06 15:59:27 http: multiple response.WriteHeader calls
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 3, using fallback address 10.2.1.202:8443: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 3, using fallback address 10.2.1.202:8443: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 1, using fallback address 0: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 2, using fallback address 10.2.1.235:8443: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 3, using fallback address 10.2.1.202:8443: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 3, using fallback address 10.2.1.202:8443: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 3, using fallback address 10.2.1.202:8443: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 3, using fallback address 10.2.1.202:8443: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800
11月 06 15:59:28 dxc4 lxd[1277]: lvl=warn msg="Raft: Unable to get address for server id 2, using fallback address 10.2.1.235:8443: failed to begin transaction: sql: database is closed" t=2018-11-06T15:59:28+0800

Logs from another node:

11月 06 16:18:25 dxc3 lxd[5244]: lvl=warn msg="Failed to get current cluster nodes: failed to begin transaction: cannot start a transaction within a transaction" t=2018-11-06T16:18:25+0800
11月 06 16:18:26 dxc3 lxd[5244]: lvl=warn msg="Failed to get current cluster nodes: failed to begin transaction: cannot start a transaction within a transaction" t=2018-11-06T16:18:26+0800
11月 06 16:18:27 dxc3 lxd[5244]: lvl=warn msg="Failed to get current cluster nodes: failed to begin transaction: cannot start a transaction within a transaction" t=2018-11-06T16:18:27+0800
11月 06 16:19:52 dxc3 lxd[5244]: lvl=warn msg="Failed to get current cluster nodes: failed to begin transaction: cannot start a transaction within a transaction" t=2018-11-06T16:19:52+0800
11月 06 16:29:52 dxc3 lxd[5244]: lvl=warn msg="Failed to get current cluster nodes: failed to begin transaction: cannot start a transaction within a transaction" t=2018-11-06T16:29:52+0800
11月 06 16:39:52 dxc3 lxd[5244]: lvl=warn msg="Failed to get current cluster nodes: failed to begin transaction: cannot start a transaction within a transaction" t=2018-11-06T16:39:52+0800
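
Because most of these lines repeat, tallying them by message can make the pattern easier to read. A minimal sketch, assuming the logs are read from the journal and that each line carries a `msg="..."` field as in the excerpts above; `tally_msgs` is a hypothetical helper name, and the unit name `lxd` in the usage line is an assumption that may differ on your install:

```shell
# Tally log lines by their msg="..." field, most frequent last.
# Reads log lines on stdin; lines without a msg field are ignored.
tally_msgs() {
    grep -o 'msg="[^"]*"' | sort | uniq -c | sort -n
}

# Usage (unit name is an assumption):
#   journalctl -u lxd --since today | tally_msgs
```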