lxc copy --refresh

Quick question @stgraber: you mention here that

“…subsequent refreshes will then compare the list of snapshots, delete any snapshot which was removed from the source or which appears to have been changed and then sync the missing snapshots and container state using rsync.”

Does this mean that there is no incremental aspect? Are only the snapshots checked, while the container itself is not compared?

I use the dir backend, and it would of course be great to transfer only the changes, especially with that backend and for backup purposes. It would also save network bandwidth when copying to a remote.


rsync is usually clever enough not to sync things it already has.

So it really needs that long just to compare the changes? When I run the command
lxc copy container backup --refresh
on a container that has not changed at all, it takes a minute.

That’s possible.

It may not need to write anything but it will still need to look at every single file in the source and target, look at their last modification date and file size to try to figure out if they’ve changed or not.
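
To make the cost concrete, here is a minimal Python sketch of a size-and-mtime "quick check" of the kind described above. It is only an illustration, not rsync's or LXD's actual code; the function name `quick_check_changed` is made up for this example. Note that even when nothing has changed, every file on both sides still gets a stat call.

```python
import os


def quick_check_changed(src_root, dst_root):
    """Return relative paths that look changed based on size and mtime only.

    Hypothetical helper for illustration; no file contents are read.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            src_stat = os.lstat(src)
            try:
                dst_stat = os.lstat(dst)
            except FileNotFoundError:
                # Missing on the target: it has to be transferred.
                changed.append(rel)
                continue
            # Compare size and whole-second modification time.
            if (src_stat.st_size != dst_stat.st_size
                    or int(src_stat.st_mtime) != int(dst_stat.st_mtime)):
                changed.append(rel)
    return sorted(changed)
```

On a container with a few hundred thousand files, the walk and the stat calls alone can take noticeable time, even if the returned list is empty.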

Another reason for this slowdown is that our rsync.LocalCopy() function uses the --checksum argument.

From man rsync:

This changes the way rsync checks if the files have been changed and are in need of a transfer. Without this option, rsync uses a “quick check” that (by default) checks if each file’s size and time of last modification match between the sender and receiver. This option changes this to compare a 128-bit checksum for each file that has a matching size. Generating the checksums means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is prior to any reading that will be done to transfer changed files), so this can slow things down significantly.
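
As a rough illustration of why this mode is expensive, the following Python sketch compares two files the way the quoted text describes: sizes first, then a 128-bit checksum over the full contents. It is not rsync's implementation. The helper names are hypothetical, and BLAKE2b truncated to 128 bits is used purely as a stand-in, since the checksum algorithm rsync actually uses depends on its version.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Read the whole file and return a 128-bit digest (stand-in algorithm)."""
    h = hashlib.blake2b(digest_size=16)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def checksum_changed(src, dst):
    """Return True if dst is missing or differs from src in size or content."""
    if not os.path.exists(dst):
        return True
    if os.path.getsize(src) != os.path.getsize(dst):
        # Different sizes: no need to read any data.
        return True
    # Same size: both files must be read in full to compare digests.
    return file_digest(src) != file_digest(dst)
```

For an unchanged container almost every file has a matching size, so nearly all data on both sides gets read just to conclude that nothing needs to be copied.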

Our remote copy function, on the other hand, does not use this option from what I can tell.

@stgraber, should the remote side use checksum mode too? If not, I suppose we should add a comment there explaining the difference.

I think our remote copy logic predates the ability to refresh existing data, so it never really needed it. Adding it would make sense, but since I suspect we need to pass it on both sides, it will have to be a negotiated option similar to compression, delete, … (still very easy to add).
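
The general idea of a negotiated option can be sketched in Python as follows. This is an assumption-laden illustration and not LXD's migration code: the feature names and both functions are hypothetical, and only the rsync flags themselves (`--compress`, `--delete`, `--checksum`) are real. Each side advertises what it supports, and an option is only enabled when both sides list it.

```python
# Hypothetical mapping from negotiated feature names to rsync flags.
FLAG_FOR_FEATURE = {
    "compress": "--compress",
    "delete": "--delete",
    "checksum": "--checksum",
}


def negotiate(local_features, remote_features):
    """Return the features both sides advertise, in a stable order."""
    return sorted(set(local_features) & set(remote_features))


def rsync_args(agreed_features):
    """Translate the agreed features into rsync command-line flags."""
    return [FLAG_FOR_FEATURE[f] for f in agreed_features if f in FLAG_FOR_FEATURE]


# An older peer that does not know about "checksum" simply never gets the flag.
agreed = negotiate(["compress", "delete", "checksum"], ["compress", "delete"])
print(rsync_args(agreed))
```

With this kind of scheme, a newer sender talking to an older receiver falls back to the common subset instead of passing a flag the other side does not expect.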

So now I will have to use ZFS after all to make backups possible without downtime. Bummer…