I have a Debian 10 container with a routed network interface. Today I restarted the container and it never came back up. When I deleted the container and recreated it, I got the same error until I removed the IP from the routes:
route del -net 138.*.16.151 netmask 255.255.255.255
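Before deleting a route like this, it can be worth recording which interface it points at, since that is the name of the leftover host-side interface. The following is only a sketch, assuming iproute2 is installed on the host; `route_dev` is a hypothetical helper name and `203.0.113.10` is a documentation placeholder, not the masked address above.

```shell
# Hypothetical helper: reads `ip -4 route show` output on stdin and prints the
# device of the host route for the given address (prints nothing if no match).
route_dev() {
    awk -v ip="$1" '
        $1 == ip {
            for (i = 2; i < NF; i++) {
                if ($i == "dev") { print $(i + 1); exit }
            }
        }'
}

# Usage on the LXD host (placeholder address):
#   ip -4 route show | route_dev 203.0.113.10
```

If this prints an interface name while the container is stopped, the route is a leftover from the previous run.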
I am currently testing LXD for production, but how is it possible that this could happen? Unfortunately, I no longer have the original container that broke after the restart. I would really like to know how to prevent this from happening in the future, because it is preventing me from using LXD in production.
This is how I create the container:
lxc init images:debian/buster c20
lxc config set c20 limits.cpu 1
lxc config set c20 limits.cpu.allowance 12%
lxc config set c20 limits.memory 1024MB
lxc config set c20 limits.memory.swap false
lxc config device add c20 root disk pool=default path=/
lxc config device set c20 root size 20GB
lxc config device set c20 root limits.read 10000MB
lxc config device set c20 root limits.write 10000MB
lxc start c20
After that I set up the /etc/network/interfaces file inside the container:
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 138.*.16.151/32
I will try to, but it happened to a container that had been running for about 20 days without any interaction. Today I tried to restart it and it could not start because of this. I create and delete all my containers via scripts, so I always do it the same way.
Do you have the error logs from when the container failed to clean up the routes, rather than the errors caused when trying to start it again? That would hopefully help shed some light on the issue.
I have just this, but I guess it is useless. It is from when I tried to restart the broken container, which had had the IP assigned for about 20 days and failed to start today:
Thanks. It occurred to me that if the static route still exists, then the host side of the veth pair must also still exist (because if it didn't, the static route would have been removed as well).
While liblxc should remove the container side veth interface on shutdown (which in turn should remove the host side veth interface), and indeed in my tests this is what happens, I think we could do a more thorough job of detecting if the host-side interface still exists in LXD when the NIC device is stopped and try to remove it.
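Until something like that exists in LXD, a manual check on the host could look roughly like the sketch below. This is not LXD's implementation, just an illustration using standard iproute2 commands; `remove_stale_veth` is a hypothetical helper name, and the interface name passed to it is assumed to be the device shown in the leftover route.

```shell
# Hypothetical helper, not part of LXD: delete a leftover host-side veth
# interface if it still exists. Removing the interface also removes any
# routes that point at it.
remove_stale_veth() {
    dev="$1"
    if ip link show dev "$dev" >/dev/null 2>&1; then
        echo "removing stale interface $dev"
        ip link delete dev "$dev"
    else
        echo "interface $dev not present"
    fi
}

# Usage on the LXD host, as root, with the container stopped
# (the interface name here is a placeholder):
#   remove_stale_veth vethabc123
```

Run it only while the container is stopped, because the same command would cut the network of a running container.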