Error: Error transferring instance data: Failed getting instance storage pool name: Instance storage pool not found

No error when I stop the source container :slight_smile:

Out of interest, have you run snap set lxd criu.enable=true? (It defaults to off.)

It is enabled on source and target.

Do you see the lxd pid on the target change after the migration fails (indicating LXD crashed and was restarted)?
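
One way to check for a restart is to record the daemon PID on the target before and after the failed copy and compare the two values. This is only a minimal sketch: pid_changed is a hypothetical helper, and the commented usage assumes the daemon process is named exactly "lxd".

```shell
# Hypothetical helper: report whether the daemon PID differs between two samples.
pid_changed() {
    # $1 = PID recorded before the migration, $2 = PID recorded after it
    if [ "$1" != "$2" ]; then
        echo "changed"
    else
        echo "same"
    fi
}

# Example use on the target host (commented out; assumes a process named "lxd"):
#   before=$(pgrep -o -x lxd)
#   ... run the lxc copy from the source ...
#   after=$(pgrep -o -x lxd)
#   pid_changed "$before" "$after"
```

A result of "changed" would suggest the daemon was restarted during the migration attempt.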

Interesting.

Can you disable it and try with it running again?

Ah yes, that’s the issue:

sudo snap set lxd criu.enable=true; sudo systemctl reload snap.lxd.daemon
lxc copy --mode push --refresh --stateless  --config boot.autostart=false c1 v2:c2
Error: Failed instance migration: Failed reading migration header: websocket: close 1006 (abnormal closure): unexpected EOF
snap set lxd criu.enable=false;
error: cannot perform the following tasks:
- Run configure hook of "lxd" snap (snap "lxd" option "criu" is not a map)

I can execute snap set lxd criu=false, but it has no effect on the websocket: close 1006 error.

On both sides

I did it on both sides with a reload, but the command you posted does not work (see my last answer).

Oh, it seems to stay broken once CRIU has been enabled, even after it is disabled again.

I’m not able to disable it; the command throws an error. After searching for this error, I found my own report from last year: Run configure hook of "lxd" snap (snap "lxd" option "criu" is not a map)

OK so first thing I need to figure out is whether

sudo snap set lxd criu.enable=false; sudo systemctl reload snap.lxd.daemon

or

sudo snap unset lxd criu.enable; sudo systemctl reload snap.lxd.daemon

is actually working, and if not, why not.
If it is, then I need to figure out what is not working with CRIU enabled.
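
A quick way to see whether the option change was recorded is to read the value back with snap get after the reload. The sketch below is an assumption-laden illustration: criu_state is a hypothetical helper that only interprets the stored snap option, so it says nothing about whether LXD itself has stopped using CRIU.

```shell
# Hypothetical helper: interpret the value printed by `snap get lxd criu.enable`.
# Anything other than "true" (including an empty value when the option is unset
# or the get command fails) is treated as disabled, since the option defaults to off.
criu_state() {
    case "$1" in
        true) echo "enabled" ;;
        *)    echo "disabled" ;;
    esac
}

# Example use on each host (commented out; requires the LXD snap):
#   sudo snap unset lxd criu.enable
#   sudo systemctl reload snap.lxd.daemon
#   criu_state "$(snap get lxd criu.enable 2>/dev/null)"
```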

BTW, I suspect CRIU was never working properly before as it doesn’t work with containers that have networking AFAIK.

I wonder if your snapd version is out of date (as you’re running Debian on the host?).

Anyway, I would suggest steering clear of CRIU as it’s not well supported.
There’s still a clear bug here but in general even without the bugs I wouldn’t expect CRIU to work in your situation.

OK, after executing snap unset lxd criu I was able to run snap set lxd criu.enable=false.

I tried both snap set lxd criu.enable=false and snap unset lxd criu.enable on source and target, each with a reload. The socket error persists.

I never used CRIU. I tried it last year, decided that it was not stable enough for our environment, and then forgot about it.

Debian 11.3 / Snap 2.55.5

I’ve reproduced it without CRIU and have logged this issue for you:

Remember to remove the lxd.debug file and reload, otherwise you won’t get automatic updates to LXD in the future.

Very nice, the error is gone without --mode push. I had experimented with push/pull prior to 5.x because --refresh had really bad performance. Now with 5.x, --refresh migrations are lightning fast. Thank you @tomp for solving this issue.

The actual transfers (in theory) are exactly the same with --mode pull (the default) or --mode push.
The only difference is how the connection is established. With pull, the target connects back to the source, and with push it’s the other way round.
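
To compare the two modes side by side, the copy command from earlier in this thread can be issued with only the --mode value changed. The sketch below just builds the command strings rather than running them; build_copy_cmd is a hypothetical helper, and c1 and v2:c2 are the instance and remote names used above.

```shell
# Hypothetical helper: print (but do not run) the copy command for a given mode.
build_copy_cmd() {
    mode="${1:-pull}"   # pull is the default transfer mode
    echo "lxc copy --mode $mode --refresh --stateless --config boot.autostart=false c1 v2:c2"
}

build_copy_cmd pull   # target connects back to the source
build_copy_cmd push   # source connects to the target
```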

What you may have observed as slowness earlier was the delay in establishing a connection when using pull mode, as LXC would try various different IP combinations to reach the source server from the target, whereas push mode may have connected more quickly.

However there is clearly some issue that is making --push behave differently now, which shouldn’t be there.

Glad using --pull is working for you though :slight_smile: