incusOS Backup: incusAutobackup

Hey there,

A few months ago I found incus and incusOS and moved nearly all parts of my homelab to incusOS. So far I am really happy with this step, because incus is in some ways a lot easier than my previous Proxmox setup. But I am missing the Proxmox Backup Server and the ability to simply copy my ZFS data between different hosts using zfs send and receive.

So I started building my own little utility to tackle that problem. One thing I would like to make clear right away: this tool is in no way comparable to Proxmox Backup Server or similar solutions.

Last year I spent more time learning Go, and I thought it would be nice to try to write this tool in Go. Additionally, parts of incusOS and the incus API are written in Go as well, which makes it very easy to integrate with.

I am posting this here for two reasons. First, I hope/think that there are other homelabbers who are looking for a backup solution for incus(OS) as well. And second, I hope to get feedback and/or suggestions for improvement.

Now to incusAutobackup (IAB), which is inspired by zfs_autobackup, a tool I used before. It uses a two-server setup: a source and a target server. Under the hood it simply uses the incus API to perform all actions on the hosts.

The backup loop is as follows (a rough sketch follows the list):

  1. Create a snapshot of a volume/instance
  2. Copy the snapshot to a separate incus(OS) server
  3. Prune old snapshots based on provided retention policies
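
To give an idea of what this looks like against the incus API, here is a heavily simplified sketch of steps 1 and 2. It is not the actual IAB code: it assumes the incus Go client keeps its usual instance snapshot and copy calls, and the “iab-” snapshot prefix is just illustrative.

```go
package main

import (
	"fmt"
	"time"

	incus "github.com/lxc/incus/v6/client"
	"github.com/lxc/incus/v6/shared/api"
)

// backupInstance creates an IAB-prefixed snapshot on the source, then
// refresh-copies the instance (including its snapshots) to the target server.
func backupInstance(source incus.InstanceServer, target incus.InstanceServer, name string) error {
	// Step 1: snapshot on the source, named so it can be recognised later.
	snapName := fmt.Sprintf("iab-%s", time.Now().UTC().Format("20060102-150405"))
	op, err := source.CreateInstanceSnapshot(name, api.InstanceSnapshotsPost{Name: snapName})
	if err != nil {
		return err
	}
	if err := op.Wait(); err != nil {
		return err
	}

	// Step 2: copy to the target; Refresh only transfers what is missing there.
	inst, _, err := source.GetInstance(name)
	if err != nil {
		return err
	}
	remoteOp, err := target.CopyInstance(source, *inst, &incus.InstanceCopyArgs{Refresh: true})
	if err != nil {
		return err
	}
	return remoteOp.Wait()
}
```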

IAB is stateless, which means that things only change while IAB is actually running. It only touches snapshots that were created by IAB itself. It is controlled by a single config.json file, and I run it in an incus container on my backup host via a systemd timer.
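
Step 3 then only ever considers snapshots that carry the IAB prefix, so manually created snapshots are never touched. Again a simplified sketch rather than the actual code; the prefix and the retention count are illustrative:

```go
package main

import (
	"sort"
	"strings"

	incus "github.com/lxc/incus/v6/client"
)

// pruneSnapshots deletes IAB-owned snapshots beyond the newest `keep`,
// leaving any manually created snapshots alone.
func pruneSnapshots(c incus.InstanceServer, instance string, keep int) error {
	names, err := c.GetInstanceSnapshotNames(instance)
	if err != nil {
		return err
	}

	// Only consider snapshots created by IAB itself.
	var owned []string
	for _, n := range names {
		if strings.HasPrefix(n, "iab-") {
			owned = append(owned, n)
		}
	}
	sort.Strings(owned) // timestamp-based names sort oldest first

	for len(owned) > keep {
		op, err := c.DeleteInstanceSnapshot(instance, owned[0])
		if err != nil {
			return err
		}
		if err := op.Wait(); err != nil {
			return err
		}
		owned = owned[1:]
	}
	return nil
}
```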

A hopefully comprehensive description of IAB can be found in the README.md on GitHub.
There are still a few problems to solve, for example that the source and the target server currently have to have the same snapshot history (post on discuss).

I’m curious to see if there are others who could use something like IAB.

That seems to line up with an idea I’ve put to a few folks over the past few months.

Basically building a backup orchestrator which can plug into a number of source Incus servers or clusters, detect what instances need to be backed up and then push them to one or more servers or clusters on the backup side, then handle snapshot pruning and the like on that side.

So basically anyone with access to multiple Incus environments could then use that to get easy replication of instances and snapshots as backups.

Great! I would support that and would help in creating/testing a backup orchestration server (or service). If that were part of the core incus project, it would keep many of us from reinventing the wheel with numerous third-party tools.

I have a question along these lines:

I think a complete backup solution should include the ability to move backups to “airgapped/offline” components. There are too many sad stories of organizations setting up a great DR/backup system, only for the primary to be compromised by hackers and, because it wasn’t detected in time, all the backups were compromised too. There are also cases where the data set is so large that moving it on physical media is the better option, and cases where you want many, many extended full backups, where storing them on archive media or a dedicated large file server is more cost-effective.

Would you think it reasonable for the backup orchestrator to have a built-in “mount/unmount” feature for offline, non-incus storage/servers? And if not, could we build in a “hook” feature, similar to what letsencrypt offers, for running scripts after a successful execution?
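
Just to make the hook idea concrete: even something as small as the sketch below, run after each successful backup, would cover a lot of “sync to removable media, then unmount” workflows. The hook path and environment variable names here are made up for illustration, not anything IAB provides today.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runPostBackupHook invokes a user-supplied script after a successful backup,
// passing context via environment variables so the script can decide what to
// move to offline or airgapped storage.
func runPostBackupHook(hookPath, instance, snapshot string) error {
	cmd := exec.Command(hookPath)
	cmd.Env = append(os.Environ(),
		fmt.Sprintf("IAB_INSTANCE=%s", instance),
		fmt.Sprintf("IAB_SNAPSHOT=%s", snapshot),
	)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}
```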

It’s good to hear that we have roughly the same idea of a backup solution :slight_smile:
If you want, I can also join the people you mentioned. I don’t need to build a third-party solution; I’m happy to participate in the development. IAB was created out of necessity, so that I could make backups more easily.

Possibly. Incus doesn’t really like its storage pools disappearing, but the backup orchestrator could also make use of the actual Incus backup feature for that kind of use case.

So you could have basically a data replication+retention policy that would look for instances on the monitored sources, determine a suitable destination and then handle snapshot+copy and the snapshot retention policies to keep the right number of copies.

Then separately, you can also orchestrate actual Incus backup files, so creating a new backup and sending it wherever you want: you can have the orchestrator tell Incus to push it to an S3 bucket somewhere, or have it streamed to the backup orchestrator instead, which can then dump it wherever you want, be it a local drive, FTP server, …
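
Roughly speaking, that second path maps onto the existing backup endpoints. A minimal sketch, assuming the Go client keeps its usual CreateInstanceBackup/GetInstanceBackupFile calls; the backup name and expiry here are placeholders:

```go
package main

import (
	"os"
	"time"

	incus "github.com/lxc/incus/v6/client"
	"github.com/lxc/incus/v6/shared/api"
)

// exportBackup asks Incus to create a backup tarball of an instance and then
// streams it to a local file; from there it can be shipped to S3, tape, an
// FTP server, or any other non-Incus destination.
func exportBackup(c incus.InstanceServer, instance, path string) error {
	op, err := c.CreateInstanceBackup(instance, api.InstanceBackupsPost{
		Name:      "iab-export",                   // placeholder backup name
		ExpiresAt: time.Now().Add(24 * time.Hour), // let Incus clean it up afterwards
	})
	if err != nil {
		return err
	}
	if err := op.Wait(); err != nil {
		return err
	}

	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	_, err = c.GetInstanceBackupFile(instance, "iab-export", &incus.BackupFileRequest{BackupFile: f})
	return err
}
```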

I’ve definitely talked to interested folks over the past few months, but nobody who had any time (myself included) to actually get anything implemented, so it’s very exciting to see that you went ahead and started something!

It’s also great to see that it’s Open Source and written in Go, so fits rather well into our ecosystem.

Depending on your interests, we’d be happy to turn that into an “official” project, now or down the line, effectively hosting it at https://github.com/lxc/incus-backup-manager or something along those lines, making it easier for folks to find and fitting in with other Incus-related projects we have (Terraform provider, Kubernetes Cluster API plugin, …).

Happy to read someone started this.

I’ve been using ZnapZend for all my ZFS backups and I really like how they implemented retention, connections, etc. They store all the details in ZFS metadata, which is pretty cool: no config files to mess around with. It works pretty well but doesn’t really fit into the Incus ecosystem. However, it might give you some ideas about what features could be added to your solution.

In my Incus environment I make heavy use of storage volumes which contain the persistent data like config, certs, databases, etc. Performing a consistent backup would require snapshotting both the instance and its persistent volumes (or the other way around). Kind of a feature request, I guess :wink:

Keep us posted on your progress, will definitely try it out soon.

I’m glad to hear that. As far as I’m concerned, we can turn this into an “official” incus project. Greater reach leads to more input, which I think is great. I already have a few ideas for expanding this further. My question would be, should the basic structure be changed to fit better into the Incus ecosystem, or do you have specific guidelines in this regard?

I will definitely check out ZnapZend :+1:. If I understand you correctly, this should already be possible with IAB. You can select “custom” volumes (which, in my understanding, are any volumes not directly attached as the root volume of a container or VM), as well as any kind of instance (container/VM), to be backed up.
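
For custom volumes the snapshot goes through the storage API rather than the instance API. Roughly like this (a simplified sketch rather than the exact IAB code, assuming the usual storage volume snapshot call in the Go client):

```go
package main

import (
	incus "github.com/lxc/incus/v6/client"
	"github.com/lxc/incus/v6/shared/api"
)

// snapshotCustomVolume snapshots a custom storage volume, i.e. one that is not
// the root volume of a container or VM.
func snapshotCustomVolume(c incus.InstanceServer, pool, volume, snapName string) error {
	op, err := c.CreateStoragePoolVolumeSnapshot(pool, "custom", volume,
		api.StorageVolumeSnapshotsPost{Name: snapName})
	if err != nil {
		return err
	}
	return op.Wait()
}
```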

After posting I spent some time reading the code, and it sounds like it should do the trick, but I’m not 100% certain.

For example, if you have multiple instances depending on the same storage volume, it is probably best to snapshot them all before starting any copy, just to stay consistent.

Depending on the size of your incus environment, performing backups will take time; at some stage, running multiple workers (threads) would allow parallel processing.

Managing a config file is OK for now, but it would be better to store the details in Incus by using user variables or similar. As IAB has access to Incus, it can query them, and that offers more flexibility, for example allowing two targets per instance like they do in ZnapZend, etc.
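
Something along these lines, purely as a sketch: instances could opt in and carry their settings via user.* config keys, which Incus already allows. The user.iab.* key names here are invented for illustration:

```go
package main

import (
	incus "github.com/lxc/incus/v6/client"
	"github.com/lxc/incus/v6/shared/api"
)

// instancesToBackup returns the instances that opted in via a user.* config
// key, together with their per-instance retention setting.
func instancesToBackup(c incus.InstanceServer) (map[string]string, error) {
	instances, err := c.GetInstances(api.InstanceTypeAny)
	if err != nil {
		return nil, err
	}

	selected := map[string]string{}
	for _, inst := range instances {
		if inst.Config["user.iab.backup"] != "true" {
			continue
		}
		// Retention (and potentially target names) can live right next to it.
		selected[inst.Name] = inst.Config["user.iab.retention"]
	}
	return selected, nil
}
```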

Maybe building a full daemon like Incus and running it permanently would allow querying more frequently and keeping track of backups in progress, compared to cron?

I think it has a lot of potential and could be extended to support other storage devices / APIs like S3, or real backup clients…

As with all things, start small and grow over time. I’m sure @stgraber will give useful input regarding design, API usage or best practices. I like what you have started and am looking forward to using and extending it.

Yes, there is definitely a timing problem. Depending on the size of the backup plan and the size of the changes, the attached-volume and instance snapshots could drift apart.
I think it is a good idea to implement an option that first takes all snapshots (maybe while stopping the container+volume pair) and then does the “slow” copy part.
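
A rough sketch of how the loop could be restructured for that, snapshotting every member of a plan first and only then starting the slow copies. This is simplified (attached custom volumes would get their snapshot in the same first phase via the storage API), and it assumes the same Go client calls as the earlier sketches:

```go
package main

import (
	"fmt"
	"time"

	incus "github.com/lxc/incus/v6/client"
	"github.com/lxc/incus/v6/shared/api"
)

// runPlan snapshots every member of a backup plan first, so they all share
// roughly the same point in time, and only then starts the slower copy phase.
func runPlan(source incus.InstanceServer, target incus.InstanceServer, members []string) error {
	snapName := fmt.Sprintf("iab-%s", time.Now().UTC().Format("20060102-150405"))

	// Phase 1: fast, near-simultaneous snapshots of all plan members.
	for _, name := range members {
		op, err := source.CreateInstanceSnapshot(name, api.InstanceSnapshotsPost{Name: snapName})
		if err != nil {
			return err
		}
		if err := op.Wait(); err != nil {
			return err
		}
	}

	// Phase 2: the slow part, transferring the data to the target.
	for _, name := range members {
		inst, _, err := source.GetInstance(name)
		if err != nil {
			return err
		}
		op, err := target.CopyInstance(source, *inst, &incus.InstanceCopyArgs{Refresh: true})
		if err != nil {
			return err
		}
		if err := op.Wait(); err != nil {
			return err
		}
	}
	return nil
}
```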

Multiple connections are something else to take into account, but in my case backups were limited by network speed, so this was not one of my priorities.

I also fully agree about getting rid of a config file that needs to be maintained, but for now it is the simpler option. The same goes for the multi-target option, different storage devices, …

Thanks for your input and kind words :slight_smile:

I think it’s a good idea to just kick the tires with your current approach, get something that technically works for you, then we can go for a more complex v2 that tries to more generically support a bunch of other folks’ needs.

In my mind, the way I’d build something like this ultimately would be (a rough sketch of the main objects follows the list):

  • Run it as a daemon with support for both a local (unix socket) and remote (HTTPS) REST API
  • Have a few main concepts
    • Sources (individual Incus servers or clusters)
    • Targets (individual Incus servers or clusters, could be some of the same as in sources if exchanging backups between environments)
    • Rules (basically a list of backup rules, providing filters on what should be backed up, from where and how often. That’s where the bulk of the complexity will be as it may need to handle a bunch of different sources, targets, projects, filter based on instance properties, scheduling, data retention, …)
  • I’d probably just do a basic sqlite database for the state and keep things pretty well aligned with the API objects
  • On top of scheduling, we’d also want a bunch of actions to manually trigger backups, manually expire/delete them and of course allow for restoration as well
  • Put quite a bit of focus on the user experience, especially on reporting clear and easy status on everything that’s been or is being backed up so folks can easily glance at it and be confident that backups are happening correctly
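
Purely as an illustration of those objects (not a committed API), the state the daemon tracks could look something like this, kept close to what the REST API and the sqlite database would store:

```go
package main

import "time"

// Source is an Incus server or cluster that instances are backed up from.
type Source struct {
	Name        string `json:"name"`
	URL         string `json:"url"`         // HTTPS API endpoint of the server or cluster
	Certificate string `json:"certificate"` // client certificate used to authenticate
}

// Target is an Incus server or cluster that receives the copies; it may point
// at the same environment as a Source when environments exchange backups.
type Target struct {
	Name        string `json:"name"`
	URL         string `json:"url"`
	Certificate string `json:"certificate"`
	Pool        string `json:"pool"` // storage pool to place the copies in
}

// Rule ties sources and targets together: which instances to back up,
// how often, and how long to keep the copies.
type Rule struct {
	Name      string        `json:"name"`
	Sources   []string      `json:"sources"`   // names of Source entries
	Targets   []string      `json:"targets"`   // names of Target entries
	Projects  []string      `json:"projects"`  // restrict to these Incus projects
	Filter    string        `json:"filter"`    // e.g. match on instance name or properties
	Schedule  string        `json:"schedule"`  // cron-style expression
	Retention int           `json:"retention"` // number of copies to keep
	MaxAge    time.Duration `json:"max_age"`   // optional age-based expiry
}
```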

A bunch of the above is how we’ve set up Migration Manager, which has a fair bit in common conceptually, even if it’s focused on replicating VMware instances into Incus rather than replicating Incus to Incus :slight_smile:

But again, that’s how my brain works and how I’d set up a project like that; in this case, you’re the one who’s in charge, so do what makes sense for you!

(@masnax you may find this thread interesting ^)

Thank you for your input! I will continue the work on IAB and we will see how far we can get. I will take a closer look at Migration Manager to get some ideas.

Do you have a tip on where to look in incusOS/incus to solve this issue? Or can anyone else maybe confirm this? I want to be sure that this is not just a strange problem on my end.