Does LXD support live migration?

Hi,
does LXD 4.5 support live migration? If so, please help me with how to do it.

LXD from the snap package supports live migration. It uses CRIU, which you need to enable first.
Run snap info lxd to see the configuration option.
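For reference, enabling CRIU on a snap-packaged LXD looks roughly like the following (a sketch assuming the criu.enable option the snap documents; check snap info lxd on your system for the exact name):

```shell
# Inspect the configuration options the LXD snap exposes
snap info lxd

# Enable CRIU support in the snap
snap set lxd criu.enable=true

# Reload the LXD daemon so the setting takes effect
systemctl reload snap.lxd.daemon
```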

But you mention LXD 4.5. Perhaps you are running LXD from another packaging? The snap package is now at version 4.6.

Also worth noting that CRIU, while supported, very rarely works.
We’re hoping to do a bit of work over the coming months to make it work with a much wider set of base container images.

We never expect it to just work for everyone, but we should at least make it easier to get working for your particular workload; some of today's limitations are simply deal breakers for most.


Also worth noting that this is for containers.

Live migration of virtual machines is comparatively much easier, though we do not support it yet. We have preliminary work planned for the coming months, starting with the ability to take stateful snapshots and perform stateful stops of VMs, which we'll then follow with full live migration.
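Once that lands, the VM workflow would presumably mirror what containers already support today (a sketch using the existing container-side flags; VM support for these is the planned work described above, not something available yet):

```shell
# Stateful snapshot: disk plus runtime (process/memory) state
lxc snapshot myinstance snap0 --stateful

# Stateful stop: preserve runtime state so the next start resumes execution
lxc stop myinstance --stateful
lxc start myinstance
```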


Hi, I am not using the snap version; I am using an LXD build compiled manually from source. I have installed CRIU and am trying lxc move, but I get the error below:
root@cpu-6225:~# lxc move test cpu-5228:
Error: Failed instance creation:

  • https://54.37.245.179:8448: Error transferring instance data: Failed to run: /root/go/bin/lxd forkmigrate test /var/lib/lxd/containers /var/log/lxd/test/lxc.conf /tmp/lxd_restore_974933999/final true:
  • https://10.145.151.1:8448: Error transferring instance data: Unable to connect to: 10.145.151.1:8448
  • https://[fd42:7420:5c84:7e2e::1]:8448: Error transferring instance data: Unable to connect to: [fd42:7420:5c84:7e2e::1]:8448

Do I need to change any config for this? It seems to be trying the private IPs too.

No, the first failure happened on the public IP. Look at /var/log/lxd/test/ for a CRIU log.
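That is, on the source server, something along these lines (assuming the /var/log/lxd path used by source builds; snap installs keep their logs under /var/snap/lxd/common/lxd/logs/ instead):

```shell
# List the per-container log directory, newest files first
ls -lt /var/log/lxd/test/

# Show the tail of the most recent CRIU migration log, if any exist
latest="$(ls -t /var/log/lxd/test/migration_* 2>/dev/null | head -n 1)"
[ -n "$latest" ] && tail -n 50 "$latest"
```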

lxd log from source server

t=2020-10-06T07:24:49+0000 lvl=info msg="Migrating container" actionscript=false created=2020-10-06T07:15:24+0000 ephemeral=false features=0 name=test predumpdir= project=default statedir=/tmp/lxd_checkpoint_589940025 stop=false used=2020-10-06T07:15:25+0000
t=2020-10-06T07:24:49+0000 lvl=info msg="Migrated container" actionscript=false created=2020-10-06T07:15:24+0000 ephemeral=false features=0 name=test predumpdir= project=default statedir=/tmp/lxd_checkpoint_589940025 stop=false used=2020-10-06T07:15:25+0000
t=2020-10-06T07:27:34+0000 lvl=info msg="Migrating container" actionscript=false created=2020-10-06T07:15:24+0000 ephemeral=false features=1 name=test predumpdir= project=default statedir= stop=false used=2020-10-06T07:15:25+0000
t=2020-10-06T07:27:36+0000 lvl=info msg="Migrating container" actionscript=false created=2020-10-06T07:15:24+0000 ephemeral=false features=0 name=test predumpdir= project=default statedir=/tmp/lxd_checkpoint_971198276 stop=false used=2020-10-06T07:15:25+0000
t=2020-10-06T07:27:36+0000 lvl=info msg="Migrated container" actionscript=false created=2020-10-06T07:15:24+0000 ephemeral=false features=0 name=test predumpdir= project=default statedir=/tmp/lxd_checkpoint_971198276 stop=false used=2020-10-06T07:15:25+0000

lxd log from destination server

t=2020-10-06T07:27:57+0000 lvl=info msg="Created container" ephemeral=false name=test project=default
t=2020-10-06T07:28:07+0000 lvl=info msg="Deleting container" created=2020-10-06T07:27:57+0000 ephemeral=false name=test project=default used=1970-01-01T00:00:00+0000
t=2020-10-06T07:28:07+0000 lvl=info msg="Deleted container" created=2020-10-06T07:27:57+0000 ephemeral=false name=test project=default used=1970-01-01T00:00:00+0000
t=2020-10-06T07:28:50+0000 lvl=info msg="Stopping container" action=stop created=2020-10-02T17:40:19+0000 ephemeral=false name=vm727321 project=default stateful=false used=2020-10-05T13:08:44+0000
t=2020-10-06T07:29:00+0000 lvl=eror msg="Failed to stop device 'vm727321': Failed to unmount '/var/lib/lxd/storage-pools/default/custom/default_vm727321': device or resource busy"
t=2020-10-06T07:29:00+0000 lvl=warn msg="Failed getting list of tables from \"/proc/self/net/ip6_tables_names\", assuming all requested tables exist"
t=2020-10-06T07:29:00+0000 lvl=warn msg="Failed getting list of tables from \"/proc/self/net/ip6_tables_names\", assuming all requested tables exist"
t=2020-10-06T07:29:01+0000 lvl=info msg="Stopped container" action=stop created=2020-10-02T17:40:19+0000 ephemeral=false name=vm727321 project=default stateful=false used=2020-10-05T13:08:44+0000
t=2020-10-06T07:29:05+0000 lvl=info msg="Starting container" action=start created=2020-10-02T17:40:19+0000 ephemeral=false name=vm727321 project=default stateful=false used=2020-10-05T13:08:44+0000
t=2020-10-06T07:29:05+0000 lvl=info msg="Started container" action=start created=2020-10-02T17:40:19+0000 ephemeral=false name=vm727321 project=default stateful=false used=2020-10-05T13:08:44+0000

Any help on this?

Error: Failed instance creation:
 - https://54.37.245.179:8448: Error transferring instance data: rsync failed to spawn after 10s (rsync: getcwd(): No such file or directory (2)
rsync error: errors selecting input/output files, dirs (code 3) at util.c(1221) [Receiver=3.1.3]
)
 - https://10.145.151.1:8448: Error transferring instance data: Unable to connect to: 10.145.151.1:8448
 - https://[fd42:7420:5c84:7e2e::1]:8448: Error transferring instance data: Unable to connect to: [fd42:7420:5c84:7e2e::1]:8448

Look at the path I gave you, not the main lxd.log.

lxc test 20201006071525.545 WARN     conf - conf.c:lxc_setup_devpts:1616 - Invalid argument - Failed to unmount old devpts instance

I can see that lxd.log is not present there; I only have an lxc.log file.

console.log    migration_pre-dump_2020-10-06T07:16:51Z.log  migration_pre-dump_2020-10-06T11:01:22Z.log
forkstart.log  migration_pre-dump_2020-10-06T07:24:49Z.log  migration_pre-dump_2020-10-06T11:03:34Z.log
lxc.conf       migration_pre-dump_2020-10-06T07:27:36Z.log  migration_pre-dump_2020-10-06T11:10:00Z.log
lxc.log        migration_pre-dump_2020-10-06T09:27:56Z.log  migration_pre-dump_2020-10-06T11:14:07Z.log
lxc.log.old    migration_pre-dump_2020-10-06T10:58:15Z.log  netcat.log

These are the available files under that path.

Look at the most recent pre-dump log.

(00.114550) irmap:      checking 19:5074
(00.114553) fsnotify: Opening fhandle 19:100005074...
(00.114556) fsnotify:   Handle 0x19:0x5074 is openable
(00.114559) fsnotify:           Trying via mntid 428 root /lxc/test ns_mountpoint @./sys/fs/cgroup/unified (11)
(00.114562) fsnotify:                   link as sys/fs/cgroup/unified/system.slice/dev-full.mount/cgroup.events
(00.114565) fsnotify:                   openable (inode match) as sys/fs/cgroup/unified/system.slice/dev-full.mount/cgroup.events
(00.114566) fsnotify:   Dumping /sys/fs/cgroup/unified/system.slice/dev-full.mount/cgroup.events as path for handle
(00.114567) irmap: Irmap cache 19:5074 -> /sys/fs/cgroup/unified/system.slice/dev-full.mount/cgroup.events
(00.114568) irmap:      checking 19:507f
(00.114569) fsnotify: Opening fhandle 19:10000507f...
(00.114572) fsnotify:   Handle 0x19:0x507f is openable
(00.114575) fsnotify:           Trying via mntid 428 root /lxc/test ns_mountpoint @./sys/fs/cgroup/unified (11)
(00.114579) fsnotify:                   link as sys/fs/cgroup/unified/system.slice/dev-net-tun.mount/cgroup.events
(00.114582) fsnotify:                   openable (inode match) as sys/fs/cgroup/unified/system.slice/dev-net-tun.mount/cgroup.events
(00.114582) fsnotify:   Dumping /sys/fs/cgroup/unified/system.slice/dev-net-tun.mount/cgroup.events as path for handle
(00.114583) irmap: Irmap cache 19:507f -> /sys/fs/cgroup/unified/system.slice/dev-net-tun.mount/cgroup.events
(00.114584) irmap:      checking 19:508a
(00.114585) fsnotify: Opening fhandle 19:10000508a...
(00.114589) fsnotify:   Handle 0x19:0x508a is openable
(00.114592) fsnotify:           Trying via mntid 428 root /lxc/test ns_mountpoint @./sys/fs/cgroup/unified (11)
(00.114595) fsnotify:                   link as sys/fs/cgroup/unified/system.slice/dev-random.mount/cgroup.events
(00.114598) fsnotify:                   openable (inode match) as sys/fs/cgroup/unified/system.slice/dev-random.mount/cgroup.events
(00.114599) fsnotify:   Dumping /sys/fs/cgroup/unified/system.slice/dev-random.mount/cgroup.events as path for handle
(00.114600) irmap: Irmap cache 19:508a -> /sys/fs/cgroup/unified/system.slice/dev-random.mount/cgroup.events
(00.114601) irmap:      checking 19:50a0
(00.114602) fsnotify: Opening fhandle 19:1000050a0...
(00.114605) fsnotify:   Handle 0x19:0x50a0 is openable
(00.114608) fsnotify:           Trying via mntid 428 root /lxc/test ns_mountpoint @./sys/fs/cgroup/unified (11)
(00.114611) fsnotify:                   link as sys/fs/cgroup/unified/system.slice/systemd-tmpfiles-setup.service/cgroup.events
(00.114614) fsnotify:                   openable (inode match) as sys/fs/cgroup/unified/system.slice/systemd-tmpfiles-setup.service/cgroup.events
(00.114615) fsnotify:   Dumping /sys/fs/cgroup/unified/system.slice/systemd-tmpfiles-setup.service/cgroup.events as path for handle
(00.114616) irmap: Irmap cache 19:50a0 -> /sys/fs/cgroup/unified/system.slice/systemd-tmpfiles-setup.service/cgroup.events
(00.114622) Writing image inventory (version 1)
(00.114633) Writing stats
(00.114640) Pre-dumping finished successfully

The log says pre-dumping finished successfully.

These are the steps I took:

  • installed lxd from source
  • installed criu from source
  • added remote server with lxc remote add
  • tested an offline migration, which seemed to work fine
  • and started live migration
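The steps above, roughly, as commands (remote name and address taken from this thread; the source builds of LXD and CRIU are assumed to already be installed on both servers):

```shell
# Add the destination server as a remote (its remote API must be enabled)
lxc remote add cpu-5228 https://54.37.245.179:8448

# Offline (cold) migration: stop the container first, then move it
lxc stop test
lxc move test cpu-5228:

# Live migration: move the container while it is running;
# LXD invokes CRIU to checkpoint and restore the process tree
lxc move test cpu-5228:
```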

Is anything wrong with these steps?

What storage backend are you using?

Using ZFS.

OK, try again with lxc move --mode=relay, but I suspect you'll get the same rsync error.

root@cpu-6225:~# lxc move --mode=relay test cpu-5228:
Error: Error transferring instance data: Failed to run: /root/go/bin/lxd forkmigrate test /var/lib/lxd/containers /var/log/lxd/test/lxc.conf /tmp/lxd_restore_714685427/final true: