Can't start a container after running lxc copy --stateless --refresh


(Bruce) #1

I’m copying a container from a remote to a local with this command

lxc copy --stateless --refresh remote:c1 c1

The command runs fine but then the container won’t start.

If I delete the container on the local host and just run

lxc copy --stateless remote:c1 c1

then everything works fine, but I’d like to include --refresh to get some incremental copies going…
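The workflow I’m aiming for is roughly this (just a sketch, using my own names): a first full copy with

lxc copy --stateless remote:c1 c1

and then subsequent runs that only transfer the differences with

lxc copy --stateless --refresh remote:c1 c1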

lxc info --show-log c1 shows the following; the key line seems to be

lxc c1 20190329110109.406 ERROR    dir - storage/dir.c:dir_mount:198 - No such file or directory - Failed to mount "/var/snap/lxd/common/lxd/containers/c1/rootfs" on "/var/snap/lxd/common/lxc/"

Here’s the whole output:

lxc c1 20190329110109.388 WARN     conf - conf.c:lxc_map_ids:2970 - newuidmap binary is missing
lxc c1 20190329110109.388 WARN     conf - conf.c:lxc_map_ids:2976 - newgidmap binary is missing
lxc c1 20190329110109.394 WARN     conf - conf.c:lxc_map_ids:2970 - newuidmap binary is missing
lxc c1 20190329110109.394 WARN     conf - conf.c:lxc_map_ids:2976 - newgidmap binary is missing
lxc c1 20190329110109.406 ERROR    dir - storage/dir.c:dir_mount:198 - No such file or directory - Failed to mount "/var/snap/lxd/common/lxd/containers/c1/rootfs" on "/var/snap/lxd/common/lxc/"
lxc c1 20190329110109.406 ERROR    conf - conf.c:lxc_mount_rootfs:1351 - Failed to mount rootfs "/var/snap/lxd/common/lxd/containers/c1/rootfs" onto "/var/snap/lxd/common/lxc/" with options "(null)"
lxc c1 20190329110109.406 ERROR    conf - conf.c:lxc_setup_rootfs_prepare_root:3498 - Failed to setup rootfs for
lxc c1 20190329110109.406 ERROR    conf - conf.c:lxc_setup:3551 - Failed to setup rootfs
lxc c1 20190329110109.406 ERROR    start - start.c:do_start:1282 - Failed to setup container "c1"
lxc c1 20190329110109.406 ERROR    sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 5)
lxc c1 20190329110109.406 WARN     network - network.c:lxc_delete_network_priv:2589 - Operation not permitted - Failed to remove interface "eth0" with index 30
lxc c1 20190329110109.406 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:864 - Received container state "ABORTING" instead of "RUNNING"
lxc c1 20190329110109.407 ERROR    start - start.c:__lxc_start:1975 - Failed to spawn container "c1"
lxc c1 20190329110109.408 WARN     conf - conf.c:lxc_map_ids:2970 - newuidmap binary is missing
lxc c1 20190329110109.408 WARN     conf - conf.c:lxc_map_ids:2976 - newgidmap binary is missing
lxc 20190329110109.414 WARN     commands - commands.c:lxc_cmd_rsp_recv:132 - Connection reset by peer - Failed to receive response for command "get_state"

Does anyone know what the problem might be?


(Stéphane Graber) #2

Can you show:

  • ls -lh /var/snap/lxd/common/mntns/var/snap/lxd/common/lxd/containers
  • ls -lh /var/snap/lxd/common/mntns/var/snap/lxd/common/lxd/storage-pools/default/containers/

(Bruce) #3

On the remote

This command

ls -lh /var/snap/lxd/common/mntns/var/snap/lxd/common/lxd/containers

Gives a bunch of output, one line per container, e.g.

lrwxrwxrwx 1 root root 59 Dec 10 10:07 haproxy -> /var/snap/lxd/common/lxd/storage-pools/2/containers/haproxy

While this command

ls -lh /var/snap/lxd/common/mntns/var/snap/lxd/common/lxd/storage-pools/default/containers/

Returns

ls: cannot access '/var/snap/lxd/common/mntns/var/snap/lxd/common/lxd/storage-pools/default/containers/': No such file or directory

On the host

ls -lh /var/snap/lxd/common/mntns/var/snap/lxd/common/lxd/containers

Returns

total 0

And

ls -lh /var/snap/lxd/common/mntns/var/snap/lxd/common/lxd/storage-pools/default/containers/

Also returns

total 0


(Stéphane Graber) #4

Ah, so I wonder if the difficulty here is the storage pool being named differently on source and target. In a normal one-time migration this gets re-shuffled a bit for you, but when running refresh, the config is re-synced and something may be going wrong there.

Any chance you can try in a setup where the storage pools have the same name on both ends? That’d confirm the hypothesis and make the issue easier to reproduce.
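In the meantime, it might be worth checking which pool each side’s copy of the container actually points at, with something along the lines of:

lxc storage list
lxc config show c1 --expanded

and looking at the pool set on the root device in the expanded config.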


(Stéphane Graber) #5

I’m not having any luck reproducing this issue with storage pools named differently on each end; things work just fine here.

What storage backends are in use on source and destination?


(Bruce) #6

I’m not having any luck reproducing this issue with storage pools named differently on each end; things work just fine here.

Hmm… the host pool is called three-way-mirror and the remote pool is called one-way. I guess you’re saying that having different names shouldn’t be a problem?

Do you want me to try when both storage pools have the same name?

What storage backends are in use on source and destination?

They’re both zfs, but zfs was set up before lxd was initialised. Both zfs configurations should be the same, aside from the fact that three-way-mirror has 3 disks in it and one-way has just one…
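If it helps, I can dump the pool configuration on each end with something like:

lxc storage list
lxc storage show <pool-name>

(using whatever the LXD pool is called on that machine).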


(Bruce) #7

Do you think this issue is related to the fact that both lxd instances were connected to existing zfs pools, rather than letting lxd set up zfs itself?

When initialising lxd, I answered these two questions as below…

Create a new ZFS pool? (yes/no) [default=yes]: no
Name of the existing ZFS pool or dataset: three-way-mirror
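
As far as I understand, that is roughly the same as attaching LXD to the existing dataset by hand, something along the lines of (assuming the pool ended up being called default, as in the paths above):

lxc storage create default zfs source=three-way-mirror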

Do you think this might play a role in the problem?


(Stéphane Graber) #8

That part shouldn’t matter, no, but I don’t think I tried with zfs on both ends, so I should try that.

Testing with the same name on both ends would be useful, yes.


(Bruce) #9

Ok, cool, I’ll try with both pools having the same name…


(Bruce) #10

Ok, I tried with both pools having the same name and everything seems to have worked… I wonder what the problem might have been?
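
For anyone who hits the same thing later: the gist of the workaround is simply to make sure the storage pool has the same name on both ends before the first copy, roughly (names are placeholders):

lxc storage create <pool-name> zfs source=<existing-dataset>

on the target, and then

lxc copy --stateless --refresh remote:c1 c1

works as expected.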