LXD Live Migration Problem

Hi!

I have a 3-node cluster (Ubuntu 20.04.2 LTS) connected to a 4-node Ceph storage cluster (1 mon/mgr and 3 OSDs), also on Ubuntu 20.04.2 LTS.

This is the configuration:

+----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
|   NAME   |            URL            | DATABASE | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
| ubuntu01 | https://10.200.10.90:8443 | YES      | x86_64       | default        |             | ONLINE | Fully operational |
+----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
| ubuntu02 | https://10.200.10.91:8443 | YES      | x86_64       | default        |             | ONLINE | Fully operational |
+----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
| ubuntu03 | https://10.200.10.92:8443 | YES      | x86_64       | default        |             | ONLINE | Fully operational |
+----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+

Storage:
+--------+--------+-------------+---------+---------+
|  NAME  | DRIVER | DESCRIPTION | USED BY |  STATE  |
+--------+--------+-------------+---------+---------+
| remote | ceph   |             | 6       | CREATED |
+--------+--------+-------------+---------+---------+

This is the network configuration:

root@ubuntu03:~# lxc network list
+-------+----------+---------+------+------+-------------+---------+-------+
| NAME  |   TYPE   | MANAGED | IPV4 | IPV6 | DESCRIPTION | USED BY | STATE |
+-------+----------+---------+------+------+-------------+---------+-------+
| br0   | bridge   | NO      |      |      |             | 4       |       |
+-------+----------+---------+------+------+-------------+---------+-------+
| ens18 | physical | NO      |      |      |             | 0       |       |
+-------+----------+---------+------+------+-------------+---------+-------+

These are the test containers in the cluster:

+-----------+---------+----------------------+------+-----------+-----------+----------+
|   NAME    |  STATE  |         IPV4         | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+-----------+---------+----------------------+------+-----------+-----------+----------+
| c3        | RUNNING | 10.200.10.245 (eth0) |      | CONTAINER | 0         | ubuntu01 |
+-----------+---------+----------------------+------+-----------+-----------+----------+
| c5        | RUNNING | 10.200.10.141 (eth0) |      | CONTAINER | 1         | ubuntu03 |
+-----------+---------+----------------------+------+-----------+-----------+----------+
| lxdMosaic | RUNNING | 10.200.10.193 (eth0) |      | CONTAINER | 0         | ubuntu02 |
+-----------+---------+----------------------+------+-----------+-----------+----------+

The version of LXD is 4.16 on all nodes:

root@ubuntu03:~# lxd --version
4.16

From lxdMosaic, when I try to do a live migration of a running Ubuntu container, I get this error:

Server error: POST https://10.200.10.91:8443/1.0/containers?project=default resulted in a 500 Internal Server Error response: {"error":"Failed creating instance record: Add instance info to the database: This instance already exists","error_code" (truncated…) /var/www/LxdMosaic/vendor/php-http/guzzle6-adapter/src/Promise.php 127

Any ideas?

Thanks a lot.

I have not played around with clusters, but the API documentation mentions passing a target in the URL to select a specific node. It does not appear in the request shown in the error message, and maybe that is why you are getting an error: LXD does not know that you just want to move the instance between nodes.
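If I read the docs right, an in-cluster move is a POST against the instance itself with a target query parameter, roughly like this (untested sketch only; the member name ubuntu01 and the client certificate paths are placeholders for your setup):

curl -k --cert client.crt --key client.key \
  -X POST "https://10.200.10.91:8443/1.0/instances/c5?target=ubuntu01" \
  -d '{"migration": true}'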

This looks like the cause of the error: does an instance with the same name already exist on the target host?

Have you verified that the migration works with the lxc command? Live migration isn't particularly well supported.
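For example, something along these lines, using the names from your listings (a sketch only: moving between cluster members normally needs the instance stopped first, and a live move of a running container additionally needs CRIU set up on both members):

lxc stop c5
lxc move c5 --target ubuntu01
lxc list c5

Afterwards, lxc list c5 should show the new member in the LOCATION column.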

Thanks for the reply. This is an update on the case:

This is my lxc remote list:

root@ubuntu02:~# lxc remote list
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+
|      NAME       |                    URL                    |   PROTOCOL    |  AUTH TYPE  | PUBLIC | STATIC | GLOBAL |
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+
| images          | https://images.linuxcontainers.org       | simplestreams | none        | YES    | NO     | NO     |
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+
| local (current) | unix://                                   | lxd           | file access | NO     | YES    | NO     |
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+
| ubuntu          | https://cloud-images.ubuntu.com/releases  | simplestreams | none        | YES    | YES    | NO     |
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+
| ubuntu01        | https://10.200.10.90:8443                 | lxd           | tls         | NO     | NO     | NO     |
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+
| ubuntu02        | https://10.200.10.91:8443                 | lxd           | tls         | NO     | NO     | NO     |
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+
| ubuntu03        | https://10.200.10.92:8443                 | lxd           | tls         | NO     | NO     | NO     |
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+
| ubuntu-daily    | https://cloud-images.ubuntu.com/daily     | simplestreams | none        | YES    | YES    | NO     |
+-----------------+-------------------------------------------+---------------+-------------+--------+--------+--------+

The current state of the containers is:

root@ubuntu02:~# lxc list
+-----------+---------+----------------------+------+-----------+-----------+----------+
|   NAME    |  STATE  |         IPV4         | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+-----------+---------+----------------------+------+-----------+-----------+----------+
| c3        | RUNNING | 10.200.10.245 (eth0) |      | CONTAINER | 0         | ubuntu01 |
+-----------+---------+----------------------+------+-----------+-----------+----------+
| c5        | STOPPED |                      |      | CONTAINER | 1         | ubuntu03 |
+-----------+---------+----------------------+------+-----------+-----------+----------+
| lxdMosaic | RUNNING | 10.200.10.193 (eth0) |      | CONTAINER | 0         | ubuntu02 |
+-----------+---------+----------------------+------+-----------+-----------+----------+
root@ubuntu02:~#

I'm trying to migrate the "c5" instance while it is stopped:

root@ubuntu02:~# lxc info c5
Name: c5
Location: ubuntu03
Remote: unix://
Architecture: x86_64
Created: 2021/08/08 03:18 UTC
Status: Stopped
Type: container
Profiles: default
Snapshots:
snap0 (taken at 2021/08/08 08:26 UTC) (stateless)
root@ubuntu02:~#

And this is the output of the lxc copy command:

root@ubuntu02:~# lxc copy c5/snap0 ubuntu03:c5 --verbose
Error: Failed instance creation: Failed creating instance record: Add instance info to the database: This instance already exists
root@ubuntu02:~#

Apparently it reports that the container already exists on the target server, but it doesn't.

Did you check what @turtle0x1 asked, i.e. whether c5 exists on ubuntu03?

Hi @Jimbo.

I am new to handling lxc/lxd.
How can I know whether that container exists, apart from running lxc list?

The info of the container is:

root@ubuntu02:~# lxc info c5
Name: c5
Location: ubuntu03
Remote: unix://
Architecture: x86_64
Created: 2021/08/08 03:18 UTC
Status: Stopped
Type: container
Profiles: default
Snapshots:
snap0 (taken at 2021/08/08 08:26 UTC) (stateless)
root@ubuntu02:~#

I run the same command against each of the 3 cluster nodes:

root@ubuntu02:~# lxc copy c5/snap0 ubuntu03:c3 --verbose
Error: Failed instance creation: Failed creating instance record: Add instance info to the database: This instance already exists

root@ubuntu02:~# lxc copy c5/snap0 ubuntu01:c5 --verbose
Error: Failed instance creation: Failed creating instance record: Add instance info to the database: This instance already exists

root@ubuntu02:~# lxc copy c5/snap0 ubuntu02:c5 --verbose
Error: Failed instance creation: Failed creating instance record: Add instance info to the database: This instance already exists

Same error.

Thanks for your prompt response.

To list what is on a node, run lxc list. lxc copy --help shows that there is a --target option for a specific cluster node. Not sure how this is all supposed to work; I never got round to setting up a cluster to play with it.
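Something like this, maybe (untested sketch; c5-copy is just an example name, since as far as I know instance names have to be unique across the whole cluster, which would explain why copying it back as c5 fails):

lxc list                                  # every instance in the cluster, LOCATION shows the member
lxc copy c5 c5-copy --target ubuntu01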

This suggests the container already exists on ubuntu03:

What does lxc list ubuntu03: output?