I’ve had this issue across several containers, with different loads, across different hosts. They are all LXD 5.13, mostly on Ubuntu 22.04 and one on Debian 11. They all use the DIR driver, and the host is on LVM.
The container loses network connectivity with the outside, and after a while it loses its IP (presumably when the DHCP renewal comes along and fails). Operations from the outside (entering the CT or stopping it) also stop working. As soon as I force-stop the CT and restart it, everything is fine.
I will keep adding details to this post, but here are a couple of instances where the container became unreachable.
To protect sensitive information, I’m sending you the logs via DM.
The CT which hung (ct104-mon) does NOT have a heavy workload. It’s steady and minor, but it is active. The CT with the high workload is ct103. The pbs CT is idle, and the other two have medium workloads.
Sent you a new set of log files today. The difference from the previous ones is that this time I caught it soon after the CT hung and before it lost its IP. There might be something helpful there.
Please let me know if there are other logs I should collect or commands I should run.