I have a problem where lxc exec gives the error:
Failed to retrieve PID of executing child process
After enabling debug and verbose logging:
snap set lxd daemon.debug=true
snap set lxd daemon.verbose=true
systemctl reload snap.lxd.daemon
And trying again, I see this in /var/snap/lxd/common/lxd/logs/forkexec.log:
Failed to load config file /var/snap/lxd/common/lxd/logs/ams-ceqsttroh003taa8cu10/lxc.conf for /var/snap/lxd/common/lxd/containers/ams-ceqsttroh003taa8cu10
Indeed, it appears that lxc.conf is missing at this location for all the containers on this node, and the only file present for the other containers is lxc.log.
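To see quickly which containers are affected, a check along these lines might help. It is only a rough sketch: the function name missing_lxc_conf is made up for illustration, and it assumes the snap layout quoted above, where each entry under the containers directory has a directory of the same name under logs.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXD): print the name of every instance
# that has no lxc.conf in its logs directory.
# Defaults are the snap paths quoted in this thread; pass other paths as
# arguments for a non-snap install.
missing_lxc_conf() {
    logs_dir="${1:-/var/snap/lxd/common/lxd/logs}"
    containers_dir="${2:-/var/snap/lxd/common/lxd/containers}"
    for entry in "$containers_dir"/*; do
        # Skip the literal glob pattern when the directory is empty or absent.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        name=$(basename "$entry")
        # Report the instance if its lxc.conf is not a regular file.
        if [ ! -f "$logs_dir/$name/lxc.conf" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

Calling missing_lxc_conf with no arguments checks the snap paths; reading those directories will most likely need root. It only reads the filesystem and does not touch LXD or the containers.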
From snap list:
lxd 4.0.9-a29c6f1 24065 5.0/stable canonical** disabled,in-cohort
lxd 5.0.1-9dcf35b 23545 5.0/stable canonical** in-cohort
Any thoughts on how this came to be?
Sort of. We aren’t in
Does rebooting the host fix it for some time and then it happens again?
Restarting snap.lxd.daemon appears to restore exec (at least the one container we exec’d into has its files back), but this also restarts the containers, which isn’t an option for the user.
Is it possible to prompt LXD to recreate those files? It is at least possible with a service and container restart.
Hey @tomp thanks for the assist so far.
So it looks like that link indicates a full restart of LXD and the containers is required.
Trying to avoid this, I tried the following, based on info documented at 
Send SIGQUIT to the lxd daemon
sudo kill -QUIT $(pidof -s lxd)
Start the lxd daemon again.
sudo systemctl start snap.lxd.daemon
According to  this should run the startup sequence again where directory structures are checked.
Having tried this, it seems to have a similar effect, except that the error message on exec is now:
“Error: Instance not found”
Even though the container is listed on the host as running.
Thoughts on this?
You can cleanly reload LXD using:
sudo systemctl reload snap.lxd.daemon
That is preferable to uncleanly killing it, which may cause inconsistency or DB data loss.
Please also share the lxc list and lxc project list output.