Hi,
I've recently been setting up IncusOS as the base layer/hypervisor for my homelab, and I'm wondering about the best approach for deploying Operations Center.
I have 9 nodes networked together over a 40 Gbps switch for storage traffic. I watched the videos from the channel on setting up Ceph and will be doing something similar. I will also be adding them all to a single cluster.
However, I would like Operations Center to manage the nodes externally, and I'm not sure whether I should set it up on a 10th node outside the cluster, or what the best way to include it would be.
Any suggestions as I transition from being an Incus noob to an Incus novice?
I am in a similar situation. My current plan is to set up Operations Center in a VM somewhere else (e.g., on a laptop), and once the cluster is up and running, move it to the cluster and take regular backups to some external location. Don't forget to save the encryption recovery key of the OC.
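For the external backups, a minimal sketch using the standard Incus CLI might look like this; the instance name `operations-center` and the `/backups` path are assumptions, not anything from the thread:

```shell
# Export the Operations Center VM (instance config plus its disk) to a
# tarball on external storage; "operations-center" is a placeholder name.
incus export operations-center /backups/operations-center-$(date +%F).tar.gz

# Restore it later on any Incus server with, e.g.:
#   incus import /backups/operations-center-2025-01-01.tar.gz
```

Note that the exported tarball does not replace the encryption recovery key mentioned above; keep that stored separately as well.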
I hope that at some point it will be possible to register an existing cluster with an Operations Center, sparing one the need to copy the VM (with the 50 GB partition required by IncusOS). For individual (non-clustered) servers one can already do that by changing the provider configuration.
I didn't think it would be smart to run Operations Center as a VM inside the cluster it is managing. I like the idea of running it locally on my laptop for the time being until I find it a proper home. I think I will resort to using a 10th node as an "observability/management" stack and install it there.
We pretty commonly run Operations Center on top of the cluster it manages. We’ve made sure to avoid any hard dependencies at boot time between IncusOS and Operations Center for that reason.
Upcoming Operations Center features like rolling cluster reboots also account for the fact that Operations Center itself may go offline during a migration that it triggered, effectively recording its state in its database so that if it gets rebooted, it can pick things back up once it’s back online.
I just did a cluster update in Operations Center 0.5.0 with the rolling cluster reboot! What an amazing feature!
I wonder if it is possible to perform a rolling reboot without necessarily applying updates? I couldn't find a way to trigger it in the UI or CLI. We need to reboot our servers periodically to apply firmware updates, and rolling reboots could be very handy for keeping services running. Being able to schedule the rolling reboot for a maintenance window would also be nice.
I was planning to put the Operations Center VM on remote storage and an OVN network so that it can live-migrate during reboots. But it is nice to hear that it is designed to be resilient to reboots and network interruptions!
It's not currently possible. If a server is in the "needs reboot" state, the rolling update option will appear and lets you restart the servers in that state even if no update needs to be applied.
But in a situation where your servers are up to date and no reboot is pending, there’s currently no way to do this. You’d need to do it manually from the server page by evacuating the server and then rebooting it from there.
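The manual evacuate-then-reboot procedure can also be sketched with the standard Incus CLI; `node01` is a placeholder member name, and the reboot step itself happens out of band (Operations Center server page, IPMI, console):

```shell
# Move (live-migrate or stop) all instances off the cluster member
# before rebooting it; "node01" is a placeholder member name.
incus cluster evacuate node01

# ... reboot node01 out of band, then wait for it to rejoin ...

# Bring the evacuated instances back onto the member.
incus cluster restore node01
```

Repeating this member by member is effectively a hand-rolled rolling reboot until an on-demand endpoint exists.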
It should be pretty easy to add an endpoint that only does a rolling restart though, so feel free to file a feature request on GitHub.
Oh, I hadn't thought about that! I wonder if the TPM can be disabled for the Operations Center VM, and whether that would have significant security implications for the cluster.
I have now filed a feature request for the on-demand rolling reboot feature:
Given that Operations Center holds a TLS client certificate with admin access to all the servers it manages, AND also stores a copy of the recovery encryption key for all of those systems, I wouldn't really recommend downgrading the security of Operations Center.