Error: sql: Scan error on column index 0, name "node_id": converting NULL to int is unsupported

Hi,
I powered on my host machine today and all my containers are now in the STOPPED state. While digging into the problem, I got the following error when executing the `lxc storage volume list lxd-pool` command. Can someone assist me with this?

Error: sql: Scan error on column index 0, name "node_id": converting NULL to int is unsupported

Here are some details about lxc/lxd configuration:

lxc --version
4.5
indiana@mars:~$ lxd sql global "SELECT * FROM storage_pools;"
+----+----------+--------+--------------+-------+
| id |   name   | driver | description  | state |
+----+----------+--------+--------------+-------+
| 5  | default  | ceph   | Ceph Storage | 1     |
| 11 | lxd-pool | zfs    | ZFS Storage  | 1     |
+----+----------+--------+--------------+-------+
indiana@mars:~$ lxc list
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
|     NAME      |  STATE  |        IPV4         |                     IPV6                      |   TYPE    | SNAPSHOTS |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| alpine312-zfs | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| c1-7          | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| centos7-zfs   | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| fedora32-zfs  | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| test          | RUNNING | 10.35.95.133 (eth0) | fd42:c787:d33a:ae82:216:3eff:fefe:c0a5 (eth0) | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| u1            | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
indiana@mars:~$ lxc start c1-7
Error: Common start logic: No such object
Try `lxc info --show-log c1-7` for more info

Ok, so to confirm, this is a cluster setup with ceph remote storage and you’re running 4.5?

Can you show:

  • lxd sql global "SELECT * FROM storage_volumes;"
  • lxd sql global "SELECT * FROM nodes;"

Hi, @stgraber.
No, it is not a cluster setup; there are just two storage pools defined, zfs and ceph, though the ceph storage is not actually running at the moment. In short, two days ago this system was up and running, but now it is not responding.
Here are the outputs:

indiana@mars:~$ lxd sql global "SELECT * FROM storage_volumes;"
+-----+------------------------------------------------------------------+-----------------+---------+------+-------------+------------+--------------+
| id  |                               name                               | storage_pool_id | node_id | type | description | project_id | content_type |
+-----+------------------------------------------------------------------+-----------------+---------+------+-------------+------------+--------------+
| 25  | centos8-zfs                                                      | 11              | <nil>   | 0    |             | 1          | 0            |
| 28  | centos7-zfs                                                      | 11              | <nil>   | 0    |             | 1          | 0            |
| 38  | alpine312-zfs                                                    | 11              | <nil>   | 0    |             | 1          | 0            |
| 91  | fedora32-zfs                                                     | 11              | <nil>   | 0    |             | 1          | 0            |
| 95  | c09302b8149908cc7a644ebae91032d72f542a1554acebc9b1ae409aa23a5017 | 11              | <nil>   | 1    |             | 1          | 0            |
| 106 | u1                                                               | 11              | <nil>   | 0    |             | 1          | 0            |
| 107 | a92eaa65a5c5e53c6bf788b4443f4e5d2afac1665486247c336aa90959522bb6 | 11              | <nil>   | 1    |             | 1          | 0            |
| 116 | a2fe9ac5c8af831dda579a6dc9a83acc1a4193e46ba1e2407fc99f5fd6db8bd0 | 5               | <nil>   | 1    |             | 1          | 0            |
| 117 | c1-7                                                             | 11              | <nil>   | 0    |             | 1          | 0            |
| 121 | f603184f60a0f9cfe6641b33596edcb27e7852e6795cbd3cc06cfc3fdd647512 | 11              | <nil>   | 1    |             | 1          | 0            |
| 122 | 66567b32341b2b382399af0531bdeb2d537b305cb0725ede2ab8291f3830105f | 11              | <nil>   | 1    |             | 1          | 0            |
| 123 | f6d9407e26ce18aaec845d120a5decc7c86113798854f0358774c4e22868902f | 11              | <nil>   | 1    |             | 1          | 0            |
| 124 | cc2693832478ccd5071d18ca1b3a6fe5cc1837c2d7d2e90d479f354b95fcf367 | 11              | <nil>   | 1    |             | 1          | 0            |
| 125 | d5f778088f05ea04e109fa0c232a1c318746b630623e2d5ff8d8f9c1482007a6 | 11              | <nil>   | 1    |             | 1          | 0            |
| 126 | test                                                             | 11              | 1       | 0    |             | 1          | 0            |
| 127 | f603184f60a0f9cfe6641b33596edcb27e7852e6795cbd3cc06cfc3fdd647512 | 11              | 1       | 1    |             | 1          | 0            |
+-----+------------------------------------------------------------------+-----------------+---------+------+-------------+------------+--------------+
indiana@mars:~$ lxd sql global "SELECT * FROM nodes;"
+----+------+-------------+---------+--------+----------------+---------------------------+---------+------+-------------------+
| id | name | description | address | schema | api_extensions |         heartbeat         | pending | arch | failure_domain_id |
+----+------+-------------+---------+--------+----------------+---------------------------+---------+------+-------------------+
| 1  | none |             | 0.0.0.0 | 35     | 203            | 2020-07-05T11:58:59+03:00 | 0       | 2    | <nil>             |
+----+------+-------------+---------+--------+----------------+---------------------------+---------+------+-------------------+

I have just created a test container named test to check the health of the LXD system, and it works. But the existing ones, for example c1-7, will not start.

indiana@mars:~$ lxc list
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
|     NAME      |  STATE  |        IPV4         |                     IPV6                      |   TYPE    | SNAPSHOTS |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| alpine312-zfs | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| c1-7          | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| centos7-zfs   | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| fedora32-zfs  | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| test          | RUNNING | 10.35.95.133 (eth0) | fd42:c787:d33a:ae82:216:3eff:fefe:c0a5 (eth0) | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| u1            | STOPPED |                     |                                               | CONTAINER | 0         |
+---------------+---------+---------------------+-----------------------------------------------+-----------+-----------+

@monstermunchkin don’t suppose you’re around and have an idea of what’s going on here?

@cemzafer could you make a tarball of /var/snap/lxd/common/lxd/database and e-mail that to me at stgraber_at_ubuntu_dot_com?

Once you’ve done that, I think you can fix your setup with:

  • lxd sql global "UPDATE storage_volumes SET node_id=1 WHERE storage_pool_id=11;"

The tarball should help me have an easy way to reproduce the upgrade and see what happened to cause this issue.
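
If you want to sanity-check the table before and after running that update, a query along these lines should show which volumes still have a missing node_id:

  • lxd sql global "SELECT id, name, node_id FROM storage_volumes WHERE node_id IS NULL;"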

@stgraber, I cannot send the database via Gmail because of security restrictions. Is there another method to post that database?
Thanks.

Never mind, I posted the file, but I had to change the suffix. For your information, it is a gzipped tar file.
Thanks.

Got it, thanks.

Did you try the fix? Does that get you back to working order?

I get the following error.

indiana@mars:~$ lxd sql global "UPDATE storage_volumes SET node_id=1 WHERE storage_pool_id=11;"
Error: Failed to exec query: UNIQUE constraint failed: storage_volumes.storage_pool_id, storage_volumes.node_id, storage_volumes.project_id, storage_volumes.name, storage_volumes.type

Yeah, this is definitely a bug. I was able to reproduce this by creating a zfs pool and a ceph pool. When the database is updated, it will incorrectly set the node ID of any existing storage volumes to nil. I’m working on a fix.
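
That also explains the UNIQUE constraint error above: the f603… volume now exists twice in pool 11 (id 121 with a NULL node_id and id 127 with node_id 1), so a blanket UPDATE would try to create two identical rows. If you want to see any such collisions yourself, a query along these lines should list them:

  • lxd sql global "SELECT name, COUNT(*) FROM storage_volumes WHERE storage_pool_id=11 GROUP BY project_id, name, type HAVING COUNT(*) > 1;"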

Thanks @monstermunchkin, good to know that it is a bug.
Regards.

So, here is the fix: https://github.com/lxc/lxd/pull/7829

You should be able to get rid of the error by running:

  • lxd sql global "DELETE FROM storage_volumes WHERE id=127;" (deletes the latest db entry)
  • lxd sql global "UPDATE storage_volumes SET node_id=1 WHERE storage_pool_id=11;" (fixes the incorrect entries)
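
After running those two commands, every volume in pool 11 should have node_id set to 1 again; a quick way to confirm, and to check that the original listing works, would be:

  • lxd sql global "SELECT id, name, node_id FROM storage_volumes WHERE storage_pool_id=11;"
  • lxc storage volume list lxd-pool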

Thanks, I applied the posted commands and everything runs smoothly.
Regards.