Failed adding NIC netdev: Monitor is disconnected

Hi,

I just upgraded qemu to 7.2.0. Since then, lxd cannot create VMs with networking; the error is "Monitor is disconnected". Searching for earlier reports suggests a QMP-related issue.

Below, trying to launch a VM attached to the lxdbr0 network.

# lxc launch --vm images:alpine/edge -c security.secureboot=false -n lxdbr0
Creating the instance
Instance name is: loyal-midge
Starting loyal-midge
Error: Failed setting up device via monitor: Failed setting up device "eth0": Failed adding NIC netdev: Monitor is disconnected
Try `lxc info --show-log local:loyal-midge` for more info

With a default profile without networking:

# lxc launch --vm images:alpine/edge -c security.secureboot=false
Creating the instance
Instance name is: thankful-gannet

The instance you are starting doesn't have any network attached to it.
  To create a new network, use: lxc network create
  To attach a network to an instance, use: lxc network attach

Starting thankful-gannet

Restarting the VM with a NIC attached:

# lxc stop thankful-gannet
# lxc network attach lxdbr0 thankful-gannet
# lxc start thankful-gannet
Error: Failed setting up device via monitor: Failed setting up device "lxdbr0": Failed adding NIC netdev: Monitor is disconnected
Try `lxc info --show-log thankful-gannet` for more info

lxd debug output

DEBUG  [2022-12-17T18:15:37Z] Skipping lxd-agent install as unchanged       installPath=/var/lib/lxd/virtual-machines/thankful-gannet/config/lxd-agent instance=thankful-gannet instanceType=virtual-machine project=default srcPath=/opt/lxd-5.9/bin/lxd-agent
DEBUG  [2022-12-17T18:15:37Z] Starting device                               device=lxdbr0 instance=thankful-gannet instanceType=virtual-machine project=default type=nic
DEBUG  [2022-12-17T18:15:37Z] Starting device                               device=root instance=thankful-gannet instanceType=virtual-machine project=default type=disk
DEBUG  [2022-12-17T18:15:37Z] UpdateInstanceBackupFile started              instance=thankful-gannet project=default
DEBUG  [2022-12-17T18:15:37Z] Skipping unmount as in use                    driver=dir pool=default refCount=1 volName=thankful-gannet
DEBUG  [2022-12-17T18:15:37Z] UpdateInstanceBackupFile finished             instance=thankful-gannet project=default
DEBUG  [2022-12-17T18:15:37Z] Instance operation lock finished              action=start err="Failed setting up device \"lxdbr0\": Failed adding NIC netdev: Monitor is disconnected" instance=thankful-gannet project=default reusable=false
WARNING[2022-12-17T18:15:37Z] Failed to collect VM process exit status      instance=thankful-gannet instanceType=virtual-machine pid=1214572 project=default
DEBUG  [2022-12-17T18:15:37Z] Stopping device                               device=root instance=thankful-gannet instanceType=virtual-machine project=default type=disk
DEBUG  [2022-12-17T18:15:37Z] Stopping device                               device=lxdbr0 instance=thankful-gannet instanceType=virtual-machine project=default type=nic
DEBUG  [2022-12-17T18:15:37Z] UnmountInstance started                       instance=thankful-gannet project=default
DEBUG  [2022-12-17T18:15:37Z] UnmountInstance finished                      instance=thankful-gannet project=default
DEBUG  [2022-12-17T18:15:37Z] Start finished                                instance=thankful-gannet instanceType=virtual-machine project=default stateful=false
DEBUG  [2022-12-17T18:15:37Z] Failure for operation                         class=task description="Starting instance" err="Failed setting up device via monitor: Failed setting up device \"lxdbr0\": Failed adding NIC netdev: Monitor is disconnected" operation=9fa644a7-4e92-4dbf-a639-548e638e69d4 project=default

Other information

# qemu-system-x86_64 --version
QEMU emulator version 7.2.0 (Debian 1:7.2+dfsg-1)
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

What does
lxc info --show-log local:loyal-midge
show?

What lxd version?

Log:

warning: tap: open vhost char device failed: Permission denied
warning: tap: open vhost char device failed: Permission denied
qemu-system-x86_64: ../../net/net.c:1106: net_client_init1: Assertion `nc' failed.
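
For anyone comparing their own logs against this one, a minimal sketch of a check for the same two-line signature (the vhost permission warning followed by the `net_client_init1` assertion). The function name `qemu_log_matches` and the `/tmp/vm.log` path are made up for illustration; only the two log strings come from the output above.

```shell
# Hypothetical helper: succeed if a saved log contains both the vhost
# permission warning and the net_client_init1 assertion seen in this thread.
qemu_log_matches() {
    log_file="$1"
    grep -q 'open vhost char device failed: Permission denied' "$log_file" &&
        grep -q 'net_client_init1: Assertion' "$log_file"
}

# Usage (assumes the log was saved to a file first):
#   lxc info --show-log thankful-gannet > /tmp/vm.log
#   qemu_log_matches /tmp/vm.log && echo "matches the vhost assertion failure"
```

A match only shows that the log looks like the one in this thread; it does not confirm the underlying cause.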

The lxd snap comes with qemu 7.1.0 so lxd 5.9 should be fine.

Indeed. It works on a Debian bullseye machine with qemu 7.1.0 (backports) and a non-snap build of lxd 5.9. The problematic machine runs Debian sid with lxd 5.9 also. If it matters, the lxd builds are at GitHub - antifob/lxd-ci: LXD builds

edit: 7.2.0 → 7.1.0

Following up on this. I guess I can either patch qemu myself or wait for it to be picked up. :slight_smile: Thanks @tomp

https://www.mail-archive.com/qemu-devel@nongnu.org/msg924611.html

mmh, Debian bullseye’s backports ships qemu 7.1.0, not 7.2.0.

@tomp Are you certain that the snap pkg ships 7.2.0 ?

I thought it was 7.2.0, but I was clearly mistaken. I must have misremembered. Anyway, it sounds like a bug in qemu that will hopefully get fixed.

The lxd snap will likely have to carry that patch if it doesn't get fixed by the time we move to >7.1.0

Upstream issue https://gitlab.com/qemu-project/qemu/-/issues/1486

Issue also posted here https://github.com/lxc/lxd/issues/11482

@pgregoire, strangely, I have only now started to see the issue. Have you come up with any workaround?
@tomp did some recent commits in the latest/edge channel change the qemu version to 7.2?

Thanks

This is because the latest/edge channel has been updated to QEMU 8.0 which still has the issue.
We have a workaround at the moment (disable the vhost-dev accelerator option), but we are investigating a proper solution. This should solve the QEMU 7.2 issue too.

Are you planning to apply the workaround to the edge snap before the weekend, or should one use this in the meantime?

sudo usermod -a -G kvm lxd

as described in https://github.com/lxc/lxd/issues/11482#issuecomment-1511561446
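
A rough way to check the state the `usermod` command above is meant to change. This is a sketch under assumptions: that QEMU is run as a user named `lxd` and that `/dev/vhost-net` is accessible to the `kvm` group, both of which the workaround implies but this thread does not state outright. `user_in_group` is a made-up helper name.

```shell
# Hypothetical helper: succeed if user $1 is a member of group $2.
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# Show owner, group and mode of the vhost device, if it exists.
ls -l /dev/vhost-net 2>/dev/null || echo "/dev/vhost-net not found"

# Assumption: the relevant user is "lxd" and the relevant group is "kvm".
if user_in_group lxd kvm; then
    echo "lxd is in the kvm group"
else
    echo "lxd is not in the kvm group (or the user does not exist)"
fi
```

Group changes generally only apply to newly started processes, so the LXD daemon probably needs a restart before the new membership takes effect.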

This PR, which we plan to merge shortly, works around the issue temporarily by disabling the vhost accelerator:

Hi,

Sorry for the question, but is there any chance you will include the commit from this PR in the snap edge channel anytime soon?

Unfortunately there is no option to temporarily downgrade to the stable channel due to differences in the db backend :neutral_face: and VMs in the edge channel have been unusable since Friday noon.

Thanks

It will be included in the next snap latest/edge rebuild automatically.
This should be imminent, but it depends on the queue for snap rebuilds.

If you’re relying on LXD to function then we wouldn’t recommend running on the latest/edge channel, as it’s inherently unstable (although it is gated via our normal test suite, this doesn’t include VM operations).

See Managing the LXD snap

By the way, is there any documentation on how one can build/test the lxd snap, to possibly contribute :slight_smile: ?
I assume there must be an example snapcraft.yaml somewhere, or am I wrong?

You actually don’t need to build the snap, just the lxd binary and sideload it into the snap, see How to create LXD binaries from the source code and side load them in an existing snap installation

Thank you for the information. I am now using the hold option, but I guess that’s crying over spilt milk.

Is there any option to temporarily move the existing environment to the stable channel and work there?
I found an old post saying that downgrading to stable is impossible due to db backend differences. Has anything changed in this regard?

Thanks