I’m trying to export and import VMs and containers on a few different hosts (not clustered) and I’m seeing some behaviors that I’m wondering about:
On one system, an “export” command for a non-running container sometimes hangs for several hours; in one case an export process was still active a week later.
On another system, importing a VM exported from a different host quite quickly reaches “Importing instance: 100%”, but then takes hours to actually throw an error (a missing profile, for example).
Lastly, and somewhat tangentially, is there any planned work to make missing profiles or networks more user friendly? For example, if the missing profile only contains “atomic” config settings like the number of CPU cores, it would be nice to either include the profile in the export or “flatten” the config (similar to the -e/expanded flag) so that the VM does not refer to the profile when imported.
Any ideas on how I can troubleshoot these hangs? Importing and exporting are quite important, not only for backups but also for migrations, so the lack of feedback from the command when nothing seems to be happening during an export/import is a bit jarring.
In a separate terminal window you can run the following to get real-time feedback on what is happening during the Incus actions. The --pretty flag will show one record per line.
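incus monitor --pretty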
The only issue is that it’s extremely slow compared to all the other operations. This is a zstd-3 compressed zfs dataset on an nvme drive, and from watch -n1 zfs list it looks like the dataset is expanding by only about 0.02G per 30 seconds. I really can’t understand why.
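For what it’s worth, this is roughly how I’m watching it, showing logical vs physical growth side by side (the dataset name is just a placeholder):

watch -n1 'zfs list -r -o name,used,logicalused,compressratio tank/incus'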
Also, I’m not sure this is correct, but I get the feeling that exporting with --compression zstd is significantly slower than exporting without compression and then running zstd -T0 -3 vm.tar, though I haven’t timed it yet.
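When I get around to timing it, the comparison I have in mind is roughly this (instance and file names are placeholders):

# export with incus-side zstd compression
time incus export myvm vm-zstd.tar.zst --compression zstd
# export uncompressed, then compress separately using all cores
time incus export myvm vm-plain.tar --compression none
time zstd -T0 -3 vm-plain.tar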
Could it be that the dataset is heavily compressible, e.g. lots of unused blocks which read as all zeros? If your zfs filesystem has compression enabled, which you say it does, that would mean a large amount of data written results in much smaller growth in space usage.
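You can check how much zfs is actually compressing with something like this (the dataset name is just an example):

zfs get used,logicalused,compressratio tank/incus/virtual-machines/myvm.block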
A better way to monitor the throughput is to find one of the processes which is handling the compression/decompression (use ps to find a likely process such as gzip), then:
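# replace <pid> with the actual process ID
watch -n1 cat /proc/<pid>/io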
I’m sure that a lot of it is “empty” data, but given that the export file is itself zstd compressed and is being unpacked into a zfs dataset with zstd compression, that should make writing the “empty” parts very fast. Instead, I’m seeing something like (rough estimates here) 60 minutes to import a 30GB disk image containing less than 10GB of “real” data, which works out to roughly an 8 MB/s write speed on average.
When running incus export --compression zstd I’m seeing write speeds between 40 and 100 MB/s, which of course is lower than “raw” nvme write speed due to the compression, and highly variable depending on the incoming data, but the import (i.e. the write speed) being so much slower is what really makes me scratch my head.
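To get a better picture of the actual disk throughput during the import, I could also watch the pool directly while it runs, something like (pool name is a placeholder):

zpool iostat -v tank 1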
I just tried exporting a VM with zstd compression, then deleted the VM with incus rm and re-imported it. The export took about five minutes. The import read speed (“Importing instance”) matched that of the export, which makes sense. During the next step, where incus gives no output (when it’s unpacking the virtual machine block volume), I checked ps aux, which showed three processes:
After a while only “zstd -d” remains (with a new pid). Checking cat /proc/130515/io shows both read_bytes and write_bytes at zero. I watched this for a few minutes, but only rchar, wchar, syscr and syscw were incrementing. wchar showed something like 2GB written after two minutes, which roughly matches the export speed, yet in that period the zfs dataset had only grown by about 300MB.
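For a less hand-wavy number, something like this (using the pid from above) would give the average wchar rate over ten seconds:

# sample wchar twice, 10 seconds apart, and print the average MB/s
pid=130515
a=$(awk '/^wchar/ {print $2}' /proc/$pid/io)
sleep 10
b=$(awk '/^wchar/ {print $2}' /proc/$pid/io)
echo "$(( (b - a) / 10 / 1024 / 1024 )) MB/s"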
Maybe I’m missing something obvious, but my assumption is that an export of a VM (zfs+zstd → tar+zstd) should take about the same time as an import (tar+zstd → zfs+zstd) when done on the exact same system, but instead the import is roughly an order of magnitude slower.
If I have some time to spare, I’ll try to get some actual statistics on what the different combinations give in terms of wall clock results.
That implies a ~7:1 compression ratio, which is plausible. 2GB after 2 minutes is about 17MB/s, which seems pretty slow though. What does “top” show? Is it CPU-bound?
The system load doesn’t really tell you anything. If it’s running slowly, it’s starved of some resource; the process might be waiting on CPU or blocked on I/O. If the process is single threaded, having extra cores won’t help.
Looking at the individual process’s CPU utilization may show you whether it’s using a whole core. The run states are useful too; the most common ones are “D”, meaning blocked on disk I/O, and “R”, meaning running or runnable (hence CPU-bound).
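For example, something like this should show both the state and per-process CPU usage, assuming the decompressor is zstd (pidstat needs the sysstat package):

ps -C zstd -o pid,stat,pcpu,wchan:24,comm
pidstat -p $(pgrep -d, zstd) 1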
I checked CPU utilization with btop while it was running and it wasn’t stressed. I guess it could be using just one core, but I thought that’s what the “l” in the ps STAT column means, i.e. that the process is multi-threaded (I didn’t include this previously, sorry):
I guess it’s quite possible that zstd runs single-core when invoked by incus, just like xz does by default, but that still leaves me confused as to why the export is so much faster: since --compression is passed to incus, I would assume the export should be “affected” by single-core compression as well.
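One way to check whether the zstd that incus spawns is actually multi-threaded would be something like this (the pid is whatever it happens to be at the time):

# Threads: 1 would mean it is effectively running on a single core
grep Threads /proc/<pid>/status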
So in short: zstd is actually faster than none, xz is extremely slow for some reason, and the import speed is more or less constant, except that uncompressed imports are slightly faster (less decompression overhead, I guess).
But if you don’t specify --compression and instead go for --optimized-storage, the export time is 0m40 (similar to zstd/none), while the import time is only 0m08, way faster than any of the above.
So from this I’ve learned that --optimized-storage should always be used, unless you for some reason need to ensure backups can be imported on a completely different type of storage.
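For reference, the two variants I’m comparing look roughly like this (instance and file names are placeholders):

# portable export: a tarball of the volume, importable on any storage driver
incus export myvm vm.tar --compression none
incus import vm.tar

# optimized export: uses the storage driver's native format (zfs send/receive here)
incus export myvm vm-opt.tar --optimized-storage
incus import vm-opt.tar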
Why it takes about 4x as long to import as it does to export (as seen with “none”) is still beyond me, though. Nevermind, obviously it’s the compression.
So I’d say this at least solves the slow import issue.
If I can figure out anything regarding the failed exports issue, I’ll post it as a separate thread.