I have a 3-node cluster with the following parameter set:
cluster.healing_threshold=10
When I reboot the node where a VM is running, the node is evacuated, but the VM instance is stopped and does not start on the remaining cluster nodes.
Am I missing something?
Can you show the output of incus config show --expanded NAME for the instance?
Automatic recovery requires that the instance disk, network and all attached devices be available across the entire cluster. So if it’s using any local resource, it won’t come back up elsewhere.
I suspect the nictype=bridged NIC is the issue. It isn't attached to an Incus-managed network, so Incus doesn't know whether it will be available on all servers. That makes the instance unsuitable for automatic relocation, both during evacuation and after a failure.
You could try setting cluster.evacuate to live-migrate on the instance. That would override this behavior at least during normal incus cluster evacuate runs, and it may also apply to the automatic recovery code path (which would then just turn into a regular migration).
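As a rough sketch, assuming the instance is called NAME (the same placeholder as above, so substitute your own instance name), setting and checking the option would look something like this. Whether it also changes what happens during automatic healing is untested here.

```shell
# NAME is a placeholder for the instance name.
# Ask Incus to live-migrate this instance when its cluster member is evacuated.
incus config set NAME cluster.evacuate=live-migrate

# Read the value back to confirm it was applied.
incus config get NAME cluster.evacuate
```

After that, a manual incus cluster evacuate of the member hosting the VM is a quick way to see whether the instance now moves instead of stopping.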