lxc init failed - Error adding configuration item volatile.eth0.hwaddr


I have a previously working system that can no longer create new containers.

I get:

lxc init sdk-base
Creating the container
Error: open /tmp/lxd_config_408598002: no such file or directory

The LXD daemon log shows:

    ephemeral: "false"
    name: knowing-octopus
  level: info
  message: Creating container
timestamp: "2018-11-08T13:01:45.561414035+01:00"
type: logging

  context: {}
  level: dbug
  message: Error adding configuration item volatile.eth0.hwaddr = 00:16:3e:72:c2:c2
    to container 25
timestamp: "2018-11-08T13:01:45.586364211+01:00"
type: logging

  context: {}
  level: dbug
  message: 'Database error: bindings.Error{Code:2067, Message:"UNIQUE constraint failed:
    containers_config.container_id, containers_config.key"}'
timestamp: "2018-11-08T13:01:45.586506707+01:00"
type: logging

  context: {}
  level: dbug
  message: Error adding configuration item volatile.eth0.name = eth0 to container
timestamp: "2018-11-08T13:01:45.586931371+01:00"
type: logging


  class: task
  created_at: "2018-11-08T13:01:45.547178008+01:00"
  description: Creating container
  err: 'open /tmp/lxd_config_927979998: no such file or directory'
  id: 8cdaf52e-3786-446f-81b6-8894fd38866b
  may_cancel: false
  metadata: null
    - /1.0/containers/knowing-octopus
  status: Failure
  status_code: 400
  updated_at: "2018-11-08T13:01:45.547178008+01:00"
timestamp: "2018-11-08T13:01:46.05522656+01:00"
type: operation

So I checked the database, but there are no entries within containers_config causing the UNIQUE constraint to fail.

Any idea how to proceed further?



Update: The statement above is wrong. I initially checked the database via lxd sql global .dump, but that command does not show the containers_config entries. If I open a copy of the database with sqlitebrowser, I see entries in containers_config for containers that no longer exist.

It seems these entries are not removed when a container is deleted.

If I restart the system, these entries are gone.

How can this happen?





What LXD version are you using? There has been quite a lot of recent work on DB handling in LXD (https://github.com/lxc/lxd/tree/master/lxd/db), so the answer might depend on the version.

Also, can you show a full example that demonstrates the issue with the database? You mention lxd sql global .dump; is that a command?

The initial error about /tmp/... is usually explained by your /tmp on the host having been partly wiped, specifically the directory that’s exposed to LXD as its /tmp in the snap.

There isn’t really any good way to repair that kind of breakage on a running system; your best bet is to reboot the system.

Specifically what was deleted from the host is a directory called /tmp/snap.0_lxd_XXXXXXX which is mapped to /tmp in the snap’s mount namespace. The no such file or directory error comes as a result of /tmp itself being invalid.

For the database error, could you run lxd sql global "SELECT * FROM containers_config WHERE container_id NOT IN (SELECT id FROM containers);"

That would let us check whether there’s indeed something wrong with foreign keys in your DB or if it’s something else.
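To make clearer what that query looks for, here is a minimal sketch in Python against an in-memory SQLite database. The two tables are a hypothetical, heavily simplified stand-in for the real LXD schema (only the column names mentioned in this thread are used), and the rows are made up for illustration. It shows a leftover containers_config row being picked up by the query, and the same UNIQUE constraint error appearing when that (container_id, key) pair is inserted again.

```python
import sqlite3

# Hypothetical, simplified stand-in for the two tables named in this thread.
# The real LXD schema has more columns; this is only an illustration.
SCHEMA = """
CREATE TABLE containers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE containers_config (
    container_id INTEGER NOT NULL,
    key TEXT NOT NULL,
    value TEXT,
    UNIQUE (container_id, key)
);
"""

# Same condition as the query suggested above: config rows whose
# container_id has no matching row in containers.
ORPHAN_QUERY = (
    "SELECT container_id, key, value FROM containers_config "
    "WHERE container_id NOT IN (SELECT id FROM containers)"
)


def find_orphans(conn):
    """Return config rows that point at a container that does not exist."""
    return conn.execute(ORPHAN_QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# One healthy container with a config row (made-up data).
conn.execute("INSERT INTO containers (id, name) VALUES (24, 'still-here')")
conn.execute(
    "INSERT INTO containers_config VALUES (24, 'volatile.eth0.name', 'eth0')"
)
# A leftover config row for container 25, which has no containers entry.
conn.execute(
    "INSERT INTO containers_config "
    "VALUES (25, 'volatile.eth0.hwaddr', '00:16:3e:72:c2:c2')"
)

orphans = find_orphans(conn)
print(orphans)

# Inserting the same (container_id, key) pair again violates the UNIQUE
# constraint, which is the kind of error seen in the daemon log.
try:
    conn.execute(
        "INSERT INTO containers_config "
        "VALUES (25, 'volatile.eth0.hwaddr', '00:16:3e:72:c2:c2')"
    )
    duplicate_error = None
except sqlite3.IntegrityError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```

If the real query returns rows, those are config entries left behind for containers that no longer exist. The sketch does not explain why they were left behind.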

Thanks for the hint about the temp file. tmpreaper was installed on the build nodes.
After adding /tmp/snap.* to TMPREAPER_PROTECT_EXTRA, the error was gone.

Regarding the database: I had 3.0.x installed on the build nodes and upgraded to 3.7. After the upgrade, the occasional database issues seem to be gone.

Thanks a lot for the help.