Database error: "sql: transaction has already been committed or rolled back"

Yes, I’m well aware of that one; it happens every few months on various boxes. I will sing to the heavens and buy a round for the developers the day that one is figured out.

Any insight into the database errors?

Error: Failed to fetch from "config" table: sql: transaction has already been committed or rolled back

I also saw the same message with reference to the profiles and instances_profiles tables.

There is a 10s timeout on each transaction:

The error you are getting appears to mean that the transaction spent 10s waiting to start, which suggests contention on the database or I/O.

Focusing on the DB part of the thread and assuming we are getting the errors due to the same reason:

Since a clean 5.3 does not exhibit these issues, I would look into what is supposed to migrate or change when updating from 5.2.

Regarding I/O contention, I have confirmed the issue occurs on a system sitting at 99.9% idle, with no IOwait and about 300,000 IOPS at LXD’s disposal. I would aim for LXD to work smoothly even on a 100 IOPS HDD-backed host under load, though. But in this case, I think it is something else.

One of the errors I saw was “Rows are closed”. Could that indicate that the newly refactored code can cause a race condition where some transactions fail, potentially by being too fast?

Can you enable debug logging whilst on LXD 5.2, then initiate a refresh to LXD 5.3 and, once the errors start happening, provide the full log?

The “Rows are closed” error is likely caused by the timeout kicking in.

sudo snap set lxd daemon.debug=true; sudo systemctl reload snap.lxd.daemon

Then the contents of /var/snap/lxd/common/lxd/logs/lxd.log please
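
With debug logging on, the log grows quickly. A minimal sketch for pulling out just the relevant lines, assuming the log path given above; the function name `lxd_db_errors` is made up for illustration, and the patterns are simply the two error strings quoted in this thread:

```shell
#!/bin/sh
# Hypothetical helper: print log lines containing the two database errors
# mentioned in this thread. Defaults to the snap log path quoted above;
# pass a different file as the first argument if needed.
lxd_db_errors() {
    log="${1:-/var/snap/lxd/common/lxd/logs/lxd.log}"
    grep -E 'transaction has already been committed or rolled back|Rows are closed' "$log"
}

# Example use (run as a user that can read the log):
#   lxd_db_errors | tail -n 50
```

The full log is still the most useful thing to share, since the lines leading up to the first error may matter more than the error itself.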

Also, please can you confirm the host OS and kernel version?

Are you still seeing the first container start and then others not on a fresh install? And are you still seeing a correlation between kernel versions and the issue?

Are you able to run a test without LUKS in the mix to rule that out as a contributor?

On an LXD install that is quite different from our main setup, I am not seeing this issue. It is used for CI and has the following key differences:

  • /var/snap/lxd on tmpfs (i.e. “ramdisk”)
  • LXD re-inits every boot (but uptimes can be quite long still)
  • No LUKS
  • There were most likely no containers when LXD switched from 5.2 to 5.3
  • At most there is normally only one container
  • Nested, i.e. this LXD runs in an LXD container

I will have to set up separate test servers for these procedures, but can’t do that today. I will track this thread to see if it is still needed when I get the time. Then I can also test with/without LUKS and other aspects to bisect.

In the meantime, I pinned our hosts to 5.2 so the matter is not pressing on our part.

@tomp Carrying on conversation from other thread:

Host OS version is all 20.04, kernel versions:

> for server in $servers; do echo $server; ssh $server uname -a;  echo; done
es-hel-phys-2
Linux es-hel-phys-2 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

app-hel-phys-3
Linux app-hel-phys-3 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

app-hel-phys-8
Linux app-hel-phys-8 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

es-hel-phys-3
Linux es-hel-phys-3 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

app-hel-phys-2
Linux app-hel-phys-2 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

app-hel-phys-5
Linux app-hel-phys-5 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

es-hel-phys-1
Linux es-hel-phys-1 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

app-hel-phys-4
Linux app-hel-phys-4 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

app-hel-phys-7
Linux app-hel-phys-7 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-5-phys
Linux hetzner-inference-5-phys 5.11.0-40-generic #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-8-phys
Linux hetzner-inference-8-phys 5.11.0-40-generic #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-9-phys
Linux hetzner-inference-9-phys 5.11.0-40-generic #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-staging-phys
Linux hetzner-inference-staging-phys 5.11.0-40-generic #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-2-phys
Linux hetzner-inference-2-phys 5.4.0-96-generic #109-Ubuntu SMP Wed Jan 12 16:49:16 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-3-phys
Linux hetzner-inference-3-phys 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-4-phys
Linux hetzner-inference-4-phys 5.4.0-96-generic #109-Ubuntu SMP Wed Jan 12 16:49:16 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-7-phys
Linux hetzner-inference-7-phys 5.11.0-40-generic #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-6-phys
Linux hetzner-inference-6-phys 5.11.0-40-generic #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

monitoring
Linux monitoring 5.15.0-27-generic #28-Ubuntu SMP Thu Apr 14 04:55:28 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-hel-phys-6
Linux app-hel-phys-6 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

hetzner-inference-1-phys
Linux hetzner-inference-1-phys 5.11.0-40-generic #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

app-ovh-phys-1
Linux app-ovh-phys-1 5.4.0-117-generic #132-Ubuntu SMP Thu Jun 2 00:39:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-ovh-phys-2
Linux app-ovh-phys-2 5.4.0-117-generic #132-Ubuntu SMP Thu Jun 2 00:39:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-ovh-phys-3
Linux app-ovh-phys-3 5.4.0-117-generic #132-Ubuntu SMP Thu Jun 2 00:39:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-ovh-phys-4
Linux app-ovh-phys-4 5.4.0-117-generic #132-Ubuntu SMP Thu Jun 2 00:39:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-ovh-phys-5
Linux app-ovh-phys-5 5.4.0-117-generic #132-Ubuntu SMP Thu Jun 2 00:39:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-ovh-phys-6
Linux app-ovh-phys-6 5.4.0-117-generic #132-Ubuntu SMP Thu Jun 2 00:39:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-staging-phys-1
Linux app-staging-phys-1 5.13.0-41-generic #46~20.04.1-Ubuntu SMP Wed Apr 20 13:16:21 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-ovh-phys-7
Linux app-ovh-phys-7 5.4.0-121-generic #137-Ubuntu SMP Wed Jun 15 13:33:07 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-ovh-phys-8
Linux app-ovh-phys-8 5.4.0-121-generic #137-Ubuntu SMP Wed Jun 15 13:33:07 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

es-ovh-phys-1
Linux es-ovh-phys-1 5.4.0-121-generic #137-Ubuntu SMP Wed Jun 15 13:33:07 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

es-ovh-phys-2
Linux es-ovh-phys-2 5.4.0-121-generic #137-Ubuntu SMP Wed Jun 15 13:33:07 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

es-ovh-phys-3
Linux es-ovh-phys-3 5.4.0-121-generic #137-Ubuntu SMP Wed Jun 15 13:33:07 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

app-hel-phys-1
Linux app-hel-phys-1 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
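
To make a kernel correlation easier to spot, the listing above can be condensed into a count per kernel release. This is only a sketch: `kernel_counts` is a hypothetical helper name, and it assumes the release is the third field of each `uname -a` line, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `uname -a` lines on stdin and print how many
# hosts run each kernel release. Lines that do not start with "Linux"
# (such as the bare hostnames echoed by the loop) are ignored.
kernel_counts() {
    awk '$1 == "Linux" { n[$3]++ } END { for (k in n) print n[k], k }' | sort -k2
}

# Example use, with $servers as in the loop above:
#   for server in $servers; do ssh "$server" uname -a; done | kernel_counts
```

Comparing these counts against the list of hosts that actually show the error would indicate whether a particular kernel series stands out.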

Do you want log files from all the hosts, or just the database hosts?

Thanks, we could really do with the debug logs from the affected systems.

@masnax and I have tried reproducing this error, to no avail.

I just tried a freshly installed Ubuntu 20.04 system (kernel 5.4.0-121-generic) with a ZFS pool on top of an LVM logical volume (for added layering). I installed the 5.2/stable LXD channel, configured LXD to use the logical volume as the block device of the ZFS pool, and then launched 5 ubuntu/focal containers.

I tried refreshing several times between LXD 5.2 and LXD 5.3 to try and reproduce it, but no luck.

With them still running, I then refreshed to latest/stable, which installed 5.3-924be6a, and ran lxc ls; the container list with IPs and status returned quickly as normal.

Can I also ask whether anyone here who is affected by this is using disk devices on their containers to pass paths from the host into the container?

In our case the errors were observed on hosts with containers using disk devices and on hosts without any such container.

It’s a strange one. There’s clearly something you all have in common, but I can’t figure out what it is yet.
I’ve just done an LXD 5.2 to LXD 5.3 cluster upgrade too and that went fine.

I am, yes

Thanks for this.

It looks like app-hel-phys-4 is the leader; can you get the output of lxc cluster ls from that member to confirm?

Also, can you double-check that all members are online and running the same LXD version using snap info lxd? I saw an instance of this:

time="2022-06-30T14:01:37Z" level=warning msg="Could not notify all nodes of database upgrade" err="failed to notify peer app-hel-phys-2:8443: failed to notify node about completed upgrade: Patch \"https://app-hel-phys-2:8443/internal/database\": Unable to connect to: app-hel-phys-2:8443 ([dial tcp 10.145.8.216:8443: connect: connection refused])"
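
For a cluster of this size, comparing versions by hand is tedious. A rough sketch of one way to do it in a single pass: `versions_match` is a hypothetical helper name, and the example use assumes a `$servers` list of member hostnames and that `snap list lxd` prints a header row followed by one row for lxd with the version in the second column.

```shell
#!/bin/sh
# Hypothetical helper: read "host version" pairs on stdin, print the distinct
# versions seen, and return non-zero if the members disagree (or if no
# versions were read at all).
versions_match() {
    versions=$(awk 'NF >= 2 { print $2 }' | sort -u)
    printf '%s\n' "$versions"
    [ -n "$versions" ] && [ "$(printf '%s\n' "$versions" | wc -l)" -eq 1 ]
}

# Example use:
#   for server in $servers; do
#       printf '%s %s\n' "$server" \
#           "$(ssh "$server" snap list lxd | awk 'NR == 2 { print $2 }')"
#   done | versions_match || echo "members are on different LXD versions"
```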

Looks like there could be some DNS issues too:

time="2022-07-01T13:30:51Z" level=warning msg="Failed adding member event listener client" err="lookup es-ovh-phys-3 on 213.186.33.99:53: no such host" local="10.145.96.163:8443" remote="es-ovh-phys-3:8443"
Output
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
|              NAME              |                     URL                     |      ROLES       | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-hel-phys-1                 | https://app-hel-phys-1:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-hel-phys-2                 | https://app-hel-phys-2:8443                 | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-hel-phys-3                 | https://app-hel-phys-3:8443                 | database-leader  | x86_64       | default        |             | ONLINE | Fully operational |
|                                |                                             | database         |              |                |             |        |                   |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-hel-phys-4                 | https://app-hel-phys-4:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-hel-phys-5                 | https://app-hel-phys-5:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-hel-phys-6                 | https://app-hel-phys-6:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-hel-phys-7                 | https://app-hel-phys-7:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-hel-phys-8                 | https://app-hel-phys-8:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-ovh-phys-1                 | https://app-ovh-phys-1:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-ovh-phys-2                 | https://app-ovh-phys-2:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-ovh-phys-3                 | https://app-ovh-phys-3:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-ovh-phys-4                 | https://app-ovh-phys-4:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-ovh-phys-5                 | https://app-ovh-phys-5:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-ovh-phys-6                 | https://app-ovh-phys-6:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-ovh-phys-7                 | https://app-ovh-phys-7:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| app-ovh-phys-8                 | https://app-ovh-phys-8:8443                 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| es-hel-phys-1                  | https://es-hel-phys-1:8443                  |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| es-hel-phys-2                  | https://es-hel-phys-2:8443                  | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| es-hel-phys-3                  | https://es-hel-phys-3:8443                  | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| es-ovh-phys-2                  | https://es-ovh-phys-2:8443                  |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| es-ovh-phys-3                  | https://es-ovh-phys-3:8443                  |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-1-phys       | https://hetzner-inference-1-phys:8443       |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-2-phys       | https://hetzner-inference-2-phys:8443       |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-3-phys       | https://hetzner-inference-3-phys:8443       |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-4-phys       | https://hetzner-inference-4-phys:8443       | database         | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-5-phys       | https://hetzner-inference-5-phys:8443       |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-6-phys       | https://hetzner-inference-6-phys:8443       | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-7-phys       | https://hetzner-inference-7-phys:8443       |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-8-phys       | https://hetzner-inference-8-phys:8443       |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-9-phys       | https://hetzner-inference-9-phys:8443       | database         | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| hetzner-inference-staging-phys | https://hetzner-inference-staging-phys:8443 | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| monitoring                     | https://monitoring:8443                     |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------------------------------+---------------------------------------------+------------------+--------------+----------------+-------------+--------+-------------------+

Yep they are

That’s odd. Just tested from another host, and it works fine

> nc -v app-hel-phys-2 8443
Connection to app-hel-phys-2 8443 port [tcp/*] succeeded!