Environment
LXD host OS (LXC cluster): Ubuntu 22.04 aarch64 (all three nodes use this OS and CPU architecture)
lxc version: 5.5
lxd version: 5.5
storage driver: ceph 17.2.0
My problem
I created a virtual machine from an image (ubuntu/focal/cloud).
When I start it, it goes into ERROR status; after a few seconds the status changes to RUNNING on its own. Starting the virtual machine took 1 minute.
Then I downloaded a new Ubuntu image without the cloud tag and created a virtual machine from it. When I start that one, it goes straight to RUNNING and the ERROR status never appears. Starting it took 15 seconds.
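For anyone who wants to repeat the timing comparison above, here is a rough sketch of how the start time could be measured. The helper name time_to_running is made up for this post, and it assumes that lxc list accepts a name regex together with --format csv and -c s (the state column), which is worth checking on your LXD version.

```shell
# Hypothetical helper: start an instance and print how many seconds it
# takes to reach RUNNING. Assumes `lxc list <regex> --format csv -c s`
# prints only the state of the matching instance.
time_to_running() {
    name="$1"
    limit="${2:-300}"                 # give up after this many seconds
    start=$(date +%s)
    lxc start "$name" || return 1
    while :; do
        state=$(lxc list "^${name}\$" --format csv -c s)
        now=$(date +%s)
        if [ "$state" = "RUNNING" ]; then
            echo "$((now - start))"
            return 0
        fi
        if [ "$((now - start))" -ge "$limit" ]; then
            echo "timeout, last state: $state" >&2
            return 1
        fi
        sleep 1
    done
}

# Usage:
#   time_to_running rapid-snail 300
```

Note that this only measures the time until LXD reports RUNNING, which is the same thing I timed by hand. It says nothing about when the guest has finished booting.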
Here is my lxc info --show-log result:
root@lxdserver1:~# lxc info --show-log rapid-snail
Name: rapid-snail
Status: ERROR
Type: virtual-machine
Architecture: aarch64
Location: lxdserver2
PID: 36977
Created: 2023/01/11 09:23 UTC
Last Used: 2023/01/11 09:24 UTC
Resources:
  Processes: 0
  Disk usage:
    root: 80.00MiB
Log:
warning: tap: open vhost char device failed: Permission denied
warning: tap: open vhost char device failed: Permission denied
qemu-system-aarch64: warning: 9p: degraded performance: a reasonable high msize should be chosen on client/guest side (chosen msize is <= 8192). See https://wiki.qemu.org/Documentation/9psetup#msize for details.
Here is my dmesg output. I found a DENIED entry, but I don’t know whether it is the cause of the ERROR status:
[166457.937437] device tapf83ab24b entered promiscuous mode
[166457.961209] audit: type=1400 audit(1673429326.360:144): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-rapid-snail_</var/snap/lxd/common/lxd>" pid=38280 comm="apparmor_parser"
[166458.010720] audit: type=1400 audit(1673429326.408:145): apparmor="DENIED" operation="open" profile="lxd-rapid-snail_</var/snap/lxd/common/lxd>" name="/proc/38281/cpuset" pid=38281 comm="lxd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[166458.751950] lxdbr0: port 1(tapf83ab24b) entered blocking state
[166458.751963] lxdbr0: port 1(tapf83ab24b) entered forwarding state
[166464.826112] device tapc3837c3d left promiscuous mode
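To pick out only the AppArmor denials for one instance from a long kernel log, a small filter like the one below can help. This is just a sketch: the function name apparmor_denials is made up, and it assumes the profile name always follows the lxd-<instance>_ pattern seen in the output above.

```shell
# Hypothetical helper: read kernel log text on stdin and keep only the
# AppArmor DENIED lines whose profile belongs to the given LXD instance.
# Assumes profiles are named lxd-<instance>_<...> as in the dmesg output.
apparmor_denials() {
    instance="$1"
    grep -F 'apparmor="DENIED"' | grep -F "profile=\"lxd-${instance}_<"
}

# Usage (reading the kernel log usually needs root):
#   dmesg | apparmor_denials rapid-snail
```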
I still want to use the first image (the one with the cloud tag). Can anyone tell me how to avoid the ERROR status?
I don’t think my image is damaged, because I created a new Ceph storage pool and used the same Ubuntu image with the cloud tag, and it still goes into ERROR.
Is it related to the LXD agent, cloud-init, or something else?
What is the difference between my two images?
Does the image with the cloud tag execute more steps than the image without it?
So are you saying both VMs start OK, it’s just that one takes longer and shows ERROR status for a while?
Are you able to reproduce this on LXD 5.10? LXD 5.5 isn’t supported anymore.
Yeah, that’s exactly what I mean.
I just upgraded to LXD 5.10 and started the VM again.
This time, my VM stayed stuck in ERROR status instead of going to RUNNING.
When using images:ubuntu/focal/cloud, the VM status becomes ERROR on lxc start. A few minutes later the VM starts up successfully and the status is RUNNING.
The problem is that the VM takes too long to start and shows ERROR status for a while.
While the error is showing, other requests to LXD (such as lxc ls) hang.
I have always used this image (images:ubuntu/focal/cloud). It seems that I need to switch to images:ubuntu/jammy/cloud.
This is the image I have the problem with:
root@lxdserver1:~# lxc image list "images:d1535"
+-----------------------------+--------------+--------+-------------------------------------+--------------+-----------------+----------+-------------------------------+
| ALIAS | FINGERPRINT | PUBLIC | DESCRIPTION | ARCHITECTURE | TYPE | SIZE | UPLOAD DATE |
+-----------------------------+--------------+--------+-------------------------------------+--------------+-----------------+----------+-------------------------------+
| ubuntu/focal/cloud (3 more) | d15350bf6ae8 | yes | Ubuntu focal arm64 (20230115_07:43) | aarch64 | VIRTUAL-MACHINE | 251.29MB | Jan 15, 2023 at 12:00am (UTC) |
+-----------------------------+--------------+--------+-------------------------------------+--------------+-----------------+----------+-------------------------------+
root@lxdserver1:~#
Here is my lxc image list result:
root@lxdserver1:~# lxc image list
+--------------------------------------+--------------+--------+--------------------------------------------------------------+--------------+-----------------+-----------+-------------------------------+
| ALIAS | FINGERPRINT | PUBLIC | DESCRIPTION | ARCHITECTURE | TYPE | SIZE | UPLOAD DATE |
+--------------------------------------+--------------+--------+--------------------------------------------------------------+--------------+-----------------+-----------+-------------------------------+
| Ubuntu_Focal | 541ac7fec8c0 | no | Ubuntu focal arm64 (20230115_07:44) | aarch64 | VIRTUAL-MACHINE | 234.52MB | Jan 17, 2023 at 11:42am (UTC) |
+--------------------------------------+--------------+--------+--------------------------------------------------------------+--------------+-----------------+-----------+-------------------------------+
| Ubuntu_Focal_Cloud | d15350bf6ae8 | no | Ubuntu focal arm64 (20230115_07:43) | aarch64 | VIRTUAL-MACHINE | 251.29MB | Jan 17, 2023 at 12:04pm (UTC) |
+--------------------------------------+--------------+--------+--------------------------------------------------------------+--------------+-----------------+-----------+-------------------------------+
| Ubuntu_Jammy | 4ed838f8a5c3 | no | Ubuntu jammy arm64 (20230116_08:50) | aarch64 | VIRTUAL-MACHINE | 254.91MB | Jan 17, 2023 at 10:28am (UTC) |
+--------------------------------------+--------------+--------+--------------------------------------------------------------+--------------+-----------------+-----------+-------------------------------+
| Ubuntu_Jammy_Cloud | a73dc51e9ea5 | no | Ubuntu jammy arm64 (20230115_07:43) | aarch64 | VIRTUAL-MACHINE | 280.17MB | Jan 17, 2023 at 11:08am (UTC) |
Can you run lxc monitor --type=logging --pretty in a separate window and then try to start the problem VM again? It will be interesting to see if that highlights where the delay is occurring.
WARNING[2023-01-18T00:48:06Z] Unable to use virtio-fs for config drive, using 9p as a fallback err="Architecture unsupported" instance=intimate-jackal instanceType=virtual-machine project=default
I watched the VM console and lxc monitor --type=logging --pretty at the same time.
About one minute after I executed lxc start xxxx --console, the following log appeared in the console
WARNING[2023-01-18T00:48:06Z] Unable to use virtio-fs for config drive, using 9p as a fallback err="Architecture unsupported" instance=intimate-jackal instanceType=virtual-machine project=default
appears in the monitor terminal. It is highlighted.
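On a cluster with several instances the monitor output gets noisy, so it can help to keep only the lines for the VM being started. A minimal sketch, assuming every relevant log line carries an instance=<name> field as in the warning above (the function name monitor_for_instance is made up):

```shell
# Hypothetical helper: keep only the monitor lines that mention the given
# instance, matching the instance=<name> field exactly.
monitor_for_instance() {
    grep -E "instance=${1}( |\$)"
}

# Usage:
#   lxc monitor --type=logging --pretty | monitor_for_instance intimate-jackal
```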
The disk-kvm.img images, which are to be preferred when run under virtualization, currently completely fail to boot under UEFI.
A workaround was put in place such that LXD will instead pull generic-based images until this is resolved. This does, however, come with a much longer boot time (as the kernel panics, reboots and then boots) and reduced functionality from cloud-init, so we’d still like this fixed in the near future.