I’m copying a container from a remote host to my local host with this command:
lxc copy --stateless --refresh remote:c1 c1
The command runs fine, but then the container won’t start.
If I delete the container on the local host and just run
lxc copy --stateless remote:c1 c1
then everything works, but I’d like to include --refresh to get some incremental copies going…
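For reference, the workflow I’m after is just the two commands above run in sequence:

# initial full copy from the remote
lxc copy --stateless remote:c1 c1
# later: incremental re-syncs of the same container (this is the step that breaks startup)
lxc copy --stateless --refresh remote:c1 c1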
lxc info --show-log c1 shows the following; the key line seems to be:
lxc c1 20190329110109.406 ERROR dir - storage/dir.c:dir_mount:198 - No such file or directory - Failed to mount "/var/snap/lxd/common/lxd/containers/c1/rootfs" on "/var/snap/lxd/common/lxc/"
Here’s the whole output:
lxc c1 20190329110109.388 WARN conf - conf.c:lxc_map_ids:2970 - newuidmap binary is missing
lxc c1 20190329110109.388 WARN conf - conf.c:lxc_map_ids:2976 - newgidmap binary is missing
lxc c1 20190329110109.394 WARN conf - conf.c:lxc_map_ids:2970 - newuidmap binary is missing
lxc c1 20190329110109.394 WARN conf - conf.c:lxc_map_ids:2976 - newgidmap binary is missing
lxc c1 20190329110109.406 ERROR dir - storage/dir.c:dir_mount:198 - No such file or directory - Failed to mount "/var/snap/lxd/common/lxd/containers/c1/rootfs" on "/var/snap/lxd/common/lxc/"
lxc c1 20190329110109.406 ERROR conf - conf.c:lxc_mount_rootfs:1351 - Failed to mount rootfs "/var/snap/lxd/common/lxd/containers/c1/rootfs" onto "/var/snap/lxd/common/lxc/" with options "(null)"
lxc c1 20190329110109.406 ERROR conf - conf.c:lxc_setup_rootfs_prepare_root:3498 - Failed to setup rootfs for
lxc c1 20190329110109.406 ERROR conf - conf.c:lxc_setup:3551 - Failed to setup rootfs
lxc c1 20190329110109.406 ERROR start - start.c:do_start:1282 - Failed to setup container "c1"
lxc c1 20190329110109.406 ERROR sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 5)
lxc c1 20190329110109.406 WARN network - network.c:lxc_delete_network_priv:2589 - Operation not permitted - Failed to remove interface "eth0" with index 30
lxc c1 20190329110109.406 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:864 - Received container state "ABORTING" instead of "RUNNING"
lxc c1 20190329110109.407 ERROR start - start.c:__lxc_start:1975 - Failed to spawn container "c1"
lxc c1 20190329110109.408 WARN conf - conf.c:lxc_map_ids:2970 - newuidmap binary is missing
lxc c1 20190329110109.408 WARN conf - conf.c:lxc_map_ids:2976 - newgidmap binary is missing
lxc 20190329110109.414 WARN commands - commands.c:lxc_cmd_rsp_recv:132 - Connection reset by peer - Failed to receive response for command "get_state"
Ah, I wonder if the difficulty here is the storage pool being named differently on the source and the target. In a normal one-time migration this gets reshuffled for you, but when running --refresh the config is re-synced, and something may be going wrong there.
Any chance you can try a setup where the storage pools have the same names on both ends? That’d confirm the hypothesis and make the issue easier to reproduce.
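For example, to compare the pool names on both ends:

lxc storage list          # pools on the local host
lxc storage list remote:  # pools on the remote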
I’m not having any luck reproducing this issue with differently named storage pools; things work just fine here.
Hmm… the local host’s pool is called three-way-mirror and the remote’s is called one-way. I guess you’re saying that having different names shouldn’t be a problem?
Do you want me to try when both storage pools have the same name?
What storage backends are in use on source and destination?
They’re both zfs, but zfs was set up before lxd was initialised. Both zfs configurations should be the same, aside from the fact that three-way-mirror has 3 disks in it and one-way has just one…
FYI - I was able to test this. When I used zpools of the same name (source, target), the container copied successfully with the --refresh option, and it was separately refreshed (updated) successfully afterwards (with minor changes I made deliberately). I.e. it works as advertised.
HOWEVER: when I tried the same setup using a differently named zpool, it failed to copy at all with --refresh, giving the following error:
Error: Invalid devices: Device validation failed "root": The "default" storage pool doesn't exist
So I have a new rule based on this lesson: use the same name when creating zpools on servers that you want to use for lxc container copies. I have never made that a habit before now, but I will going forward.
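In case it helps anyone, a minimal sketch of what I mean (pool and device names here are illustrative, and this assumes the pools are created fresh before LXD is pointed at them):

# on the single-disk host:
zpool create tank /dev/sdb
# on the mirrored host, same zpool name even though the vdev layout differs:
zpool create tank mirror /dev/sdb /dev/sdc /dev/sdd
# then register the existing zpool with LXD on each host, under the same LXD pool name too:
lxc storage create default zfs source=tank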
THANK YOU. (And thank you for the --refresh option!!!)
You don’t have to use the same pool name, but if you don’t, you need to use a profile on each side which contains the root device, rather than having the container carry it directly.
That way the exact container config of the source will work on the target.
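A rough sketch of that, assuming the pool names mentioned earlier in this thread (the profile name rootdisk is arbitrary):

# on each host, put the root disk device in a profile pointing at that host’s pool:
lxc profile create rootdisk
lxc profile device add rootdisk root disk path=/ pool=three-way-mirror  # pool=one-way on the remote
# attach the profile to the container, and drop any root device set directly on the container:
lxc profile add c1 rootdisk
lxc config device remove c1 root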
Alternatively, passing --storage NAME as an argument to lxc copy may also work.
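That is, something along these lines (again using the pool names from earlier in the thread):

lxc copy --stateless --refresh --storage three-way-mirror remote:c1 c1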
The first copy worked (the zpool names were a match between source and destination). For the second copy I switched to a second pool I have available on the target, and it failed with this error:
Error: Failed instance creation:
https://10.30.70.1:48443: Error transferring instance data: Unable to connect to: 10.30.70.1:48443
https://192.168.1.15:48443: Error transferring instance data: Create instance: Find parent volume: No such object
So that seems to answer that. I will try the profile approach when I get more time and will post an update here. Personally I’m OK keeping pool names the same going forward to keep this simple (and, more importantly, functioning), which is probably a good standardization idea anyway. FYI I am running lxd 4.2.
My case is identical to the reports above: I am running ZFS across all hosts, and all hosts have a pool called ‘default’. Copying between those works. One host also has a pool called ‘backuplxd’; copying with snapshots to that pool fails with Error transferring instance data: Create instance: Find parent volume: No such object, despite passing --storage backuplxd, whenever I’m not using --instance-only.
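For reference, one way to check which pool an instance’s root device actually resolves to on the target, and which volumes exist on the pool:

lxc config show --expanded c1       # shows the root device and the pool it points at
lxc storage volume list backuplxd   # lists the volumes that actually exist on that pool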