Project | LXD |
Status | Implemented |
Author(s) | @monstermunchkin |
Approver(s) | @stgraber @tomp |
Release | 5.14 |
Internal ID | LX040 |
Abstract
This adds automated cluster member evacuation which migrates remote-backed instances if a cluster member is offline for a certain amount of time.
Rationale
Currently, offline cluster members are not evacuated automatically. Automatic evacuation would be beneficial when the offline member hosts remote-backed instances, because these can be migrated even while the member is offline.
Specification
Design
The following new configuration will be added:
cluster.healing_threshold
Automated cluster member evacuation is enabled by setting this configuration key to a positive integer. The value is the time in seconds after which an offline cluster member may be evacuated automatically. If it is lower than cluster.offline_threshold, the value of cluster.offline_threshold is used internally instead.
If enabled, the cluster leader checks for offline members every minute and evacuates those that have been offline for longer than the threshold. Remote-backed instances are then migrated to other members. Local instances are ignored, because they cannot be migrated while the member is offline.
Once the offline member comes back online, it won’t be restored automatically. This needs to be done manually.
API changes
No API changes.
CLI changes
No CLI changes.
Database changes
No database changes.
Upgrade handling
No upgrade handling.