Incus agent fails to start

l33tname · February 7, 2024, 10:19am

incus launch images:debian/11 --vm

incus console keen-mongoose

Booting `Debian GNU/Linux'

Loading Linux 5.10.0-27-amd64 ...
Loading initial ramdisk ...
[FAILED] Failed to start Incus - agent.

Is there a way to see why the agent fails?

I tested a random ubuntu image which seems to work:

incus launch images:ubuntu/jammy/cloud ubuntu-jammy-vm --vm

stgraber · February 7, 2024, 11:17am

Hmm, that exact same VM works fine here, what’s your host?

It’s possible to do some debugging by stopping the VM, mapping the block device of the VM, mounting the second partition from it, setting a root password, unmounting, and starting the VM again, at which point you can log in locally.

l33tname · February 7, 2024, 12:16pm

We used incus exec NAME -- sh -c 'systemctl isolate multi-user.target' as a check for whether the VM is ready, which seems to be the issue. But good to know that I could set a password.

simos · February 7, 2024, 1:23pm

To check whether a VM is ready, you can also poll systemctl is-system-running:

$ incus start vm1
$ incus exec vm1 -- systemctl is-system-running
starting
$ incus exec vm1 -- systemctl is-system-running
running
$ incus exec vm1 -- systemctl is-system-running
running
$

l33tname · February 7, 2024, 3:54pm

incus exec "$CONTAINER_ID" -- sh -c "systemctl is-system-running | grep -qE 'running|degraded'"

That is almost what I did, but we added degraded as an OK state, since a surprising number of images ship non-essential services that are started and fail.
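A minimal sketch of that readiness check as a retry loop, assuming a POSIX shell on the host. The function name wait_vm_ready, the POLL_INTERVAL variable and the default of 120 tries are made up for illustration; only the incus exec and systemctl is-system-running calls come from the posts above.

```shell
#!/bin/sh
# wait_vm_ready NAME [MAX_TRIES]
# Polls the VM until systemd reports "running" or "degraded".
# MAX_TRIES and POLL_INTERVAL (seconds between tries) are illustrative knobs.
wait_vm_ready() {
    name=$1
    max_tries=${2:-120}
    tries=0
    state=
    while [ "$tries" -lt "$max_tries" ]; do
        # incus exec fails while the agent is not up yet, and
        # is-system-running exits non-zero for anything but "running",
        # so the exit status is ignored and only the printed state is used.
        state=$(incus exec "$name" -- systemctl is-system-running 2>/dev/null) || true
        case $state in
            running|degraded) return 0 ;;
        esac
        tries=$((tries + 1))
        sleep "${POLL_INTERVAL:-1}"
    done
    echo "timed out waiting for $name (last state: ${state:-unknown})" >&2
    return 1
}

# Example: wait_vm_ready vm1 60 && incus exec vm1 -- uname -a
```

Matching the state with case rather than grep avoids accepting a longer string that merely contains "running".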

cjwatson · March 1, 2024, 1:06pm

I have the same issue here - incus launch --vm --console images:debian/bookworm/cloud shows [FAILED] Failed to start incus-agent.service - Incus - agent. on the console. Ubuntu noble host, incus 0.6-1 from noble, non-VMs work fine.

I’d be happy to try to map the block device of the VM as you suggest, but I can’t even find it (perhaps because my storage pool is on ZFS?). Is there any documentation of doing this sort of thing? At the moment I’m a bit stuck with no obvious debugging options.

stgraber · March 1, 2024, 1:41pm

Got your environment reproduced thanks to nested virtualization, and managed to reproduce the same agent startup failure too.

The main trick to debug this is:

root@noble:~# incus stop -f foo
root@noble:~# zfs set volmode=full default/virtual-machines/foo.block
root@noble:~# mount /dev/zvol/default/virtual-machines/foo.block-part2 /mnt/
root@noble:~# chroot /mnt passwd root
New password: 
Retype new password: 
passwd: password updated successfully
root@noble:~# umount /mnt
root@noble:~# incus start foo

At which point you can do a normal console login and see what’s going on.
In this case the error shows /run/incus_agent/incus-agent no such file or directory.

Mounting the config drive directly with mount -t 9p config /mnt inside the VM sure enough shows no incus-agent in there.

The incus log also confirms it:

time="2024-03-01T13:35:58Z" level=warning msg="incus-agent not found, skipping its inclusion in the VM config drive" err="<nil>" instance=foo instanceType=virtual-machine project=default

That means that the incusd process couldn’t find incus-agent in its PATH and INCUS_AGENT_PATH also isn’t set to a directory containing all the agent builds.

In the Debian package, incus-agent is located in /usr/libexec/incus/incus-agent.
The PATH variable for the incusd binary is PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin.

So that explains that.

What’s odd though is:

root@noble:~# systemctl cat incus
# /usr/lib/systemd/system/incus.service
[Unit]
Description=Incus - Main daemon
After=network-online.target openvswitch-switch.service lxcfs.service incus.socket
Requires=network-online.target lxcfs.service incus.socket
Documentation=man:incusd(1)

[Service]
EnvironmentFile=-/etc/environment
Environment=PATH=/usr/libexec/incus:/usr/sbin:/usr/bin:/sbin:/bin
ExecStartPre=/usr/libexec/incus/incus-apparmor-load
ExecStartPre=/bin/mkdir -p /var/log/incus/
ExecStartPre=/bin/chown -R root:incus-admin /var/log/incus/
ExecStart=/usr/libexec/incus/incusd --group incus-admin --logfile=/var/log/incus/incus.log
ExecStartPost=/usr/bin/incus admin waitready --timeout=600
KillMode=process
TimeoutStartSec=600s
TimeoutStopSec=30s
Restart=on-failure
LimitNOFILE=1048576
LimitNPROC=infinity
TasksMax=infinity

[Install]
Also=incus-startup.service incus.socket

As that certainly does have the needed PATH entry.
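To compare what the unit file declares with what the running daemon actually received, the process environment can be read from /proc. The following is a rough sketch, not something from the posts above: the helper names are made up, and it assumes a Linux host with /proc, root access, and pidof being available.

```shell
#!/bin/sh
# Print the PATH a running process was started with (root needed for incusd).
proc_path() {
    tr '\0' '\n' < "/proc/$1/environ" | sed -n 's/^PATH=//p'
}

# Succeed if DIR is one of the colon-separated entries of PATH_VALUE.
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if the daemon log contains the "agent not found" warning.
agent_warning_in_log() {
    grep -q 'incus-agent not found' "$1"
}

# Example, as root on the host (pidof is an assumption; the log path is
# the one given by --logfile in the unit above):
#   path_has_dir "$(proc_path "$(pidof incusd)")" /usr/libexec/incus \
#       || echo "agent directory is not in the daemon's PATH"
#   agent_warning_in_log /var/log/incus/incus.log \
#       && echo "agent was left out of the config drive"
```

The comparison is exact per entry, so a directory listed with a trailing slash would not match the same directory written without one.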

stgraber · March 1, 2024, 1:44pm

A dirty workaround until someone figures out what’s up with systemd:

root@noble:~# ln -s /usr/libexec/incus/incus-agent /usr/sbin/
root@noble:~# incus restart foo
root@noble:~# incus exec foo bash
root@distrobuilder-34fb80b9-0939-40ea-ae13-2455f4e611cf:~# 
exit
root@noble:~#

cjwatson · March 1, 2024, 2:09pm

From systemd.exec(5) under EnvironmentFile=: “Settings from these files override settings made with Environment=.” And /etc/environment sets PATH.

stgraber · March 1, 2024, 2:10pm

Ah, then that’s what’s going on…

That’s a bit annoying, they don’t happen to have a syntax to set Env variables AFTER sourcing the file?

cjwatson · March 1, 2024, 2:11pm

I was just trying to figure that out. Maybe the only option is to change ExecStart to something involving /usr/bin/env?

stgraber · March 1, 2024, 2:12pm

Yeah, in my own packages, I actually use an intermediary script for that which does:

#!/bin/sh
set -e

export INCUS_DOCUMENTATION=/opt/incus/doc/
export INCUS_LXC_HOOK=/opt/incus/share/lxc/hooks/
export INCUS_LXC_TEMPLATE_CONFIG=/opt/incus/share/lxc/config/
export INCUS_AGENT_PATH=/opt/incus/agent/
export INCUS_OVMF_PATH=/opt/incus/share/qemu/
export INCUS_UI=/opt/incus/ui/
export LD_LIBRARY_PATH=/opt/incus/lib/
export PATH=/opt/incus/bin/:${PATH}

exec incusd "$@"

Which then turns the systemd unit into:

stgraber@dakara:~$ systemctl cat incus
# /etc/systemd/system/incus.service
[Unit]
Description=Incus - Daemon
After=network-online.target openvswitch-switch.service incus-lxcfs.service incus.socket
Requires=network-online.target incus-lxcfs.service incus.socket

[Service]
EnvironmentFile=-/etc/environment
EnvironmentFile=-/etc/default/incus
ExecStart=/opt/incus/lib/systemd/incusd --group incus-admin $INCUS_OPTS --logfile /var/log/incus/incusd.log
ExecStartPost=/opt/incus/lib/systemd/incusd waitready --timeout=600
KillMode=process
TimeoutStartSec=600s
TimeoutStopSec=30s
Restart=on-failure
Delegate=yes
LimitNOFILE=1048576
LimitNPROC=infinity
TasksMax=infinity

[Install]
Also=incus-startup.service incus.socket

cjwatson · March 1, 2024, 2:16pm

Yeah, I was just thinking along the same lines. As long as you can inject the environment using EnvironmentFile= rather than Environment=, then it all works fine.

In that case I’ll take this to the Debian maintainers, since it seems that the problem is in their packaging.
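Until the packaging changes, one local alternative to the symlink workaround might be a systemd drop-in that clears the EnvironmentFile= list, so that the unit's own Environment=PATH=... is no longer overridden. This is an untested sketch, not what the Debian maintainers or the Incus packages do. The function name is made up, the drop-in directory follows the usual systemd convention, and the approach relies on an empty assignment resetting the list as described in systemd.exec(5).

```shell
#!/bin/sh
# write_incus_dropin [DIR]
# Writes a drop-in that resets EnvironmentFile= for incus.service.
# DIR defaults to the conventional drop-in location; pass another
# directory to inspect the result without touching the system.
write_incus_dropin() {
    dir=${1:-/etc/systemd/system/incus.service.d}
    mkdir -p "$dir"
    cat > "$dir/override.conf" <<'EOF'
[Service]
# An empty assignment resets the list of environment files, so
# /etc/environment no longer replaces the PATH set with Environment=.
EnvironmentFile=
EOF
}

# After writing to the real location (as root):
#   systemctl daemon-reload
#   systemctl restart incus
```

The trade-off is that any other variables in /etc/environment, such as proxy settings, would no longer reach incusd either.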

cjwatson · March 1, 2024, 2:33pm

https://bugs.debian.org/1065174

dashohoxha · March 12, 2024, 10:07am

I am installing a VM from a Debian12 iso (actually it is DebianEdu 12).
After this, I am trying to install incus-agent inside the VM.
I am using the script below, which is based on the instructions linked in its header comment (renaming lxd to incus).

The script: install-incus-agent-in-debian.sh

#!/bin/bash -x

# See: https://discuss.linuxcontainers.org/t/install-lxd-agent-manually-on-custom-os/11826

cat > /lib/systemd/system/incus-agent-9p.service << EOF
[Unit]
Description=Incus - agent - 9p mount
Documentation=https://linuxcontainers.org/incus
ConditionPathExists=/dev/virtio-ports/org.linuxcontainers.incus
After=local-fs.target
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=-/sbin/modprobe 9pnet_virtio
ExecStartPre=/bin/mkdir -p /run/incus_config/9p
ExecStartPre=/bin/chmod 0700 /run/incus_config/
ExecStart=/bin/mount -t 9p config /run/incus_config/9p -o access=0,trans=virtio

[Install]
WantedBy=multi-user.target
EOF

cat > /lib/systemd/system/incus-agent.service << EOF
[Unit]
Description=Incus - agent
Documentation=https://linuxcontainers.org/incus
ConditionPathExists=/dev/virtio-ports/org.linuxcontainers.incus
Requires=incus-agent-9p.service
After=incus-agent-9p.service
Before=cloud-init.target cloud-init.service cloud-init-local.service
DefaultDependencies=no

[Service]
Type=simple
WorkingDirectory=/run/incus_config/9p
ExecStart=/run/incus_config/9p/incus-agent
Restart=on-failure
RestartSec=5s
StartLimitInterval=60
StartLimitBurst=10

[Install]
WantedBy=multi-user.target
EOF

systemctl enable incus-agent-9p.service
systemctl enable incus-agent.service

systemctl start incus-agent-9p.service
systemctl start incus-agent.service

For convenience, the script can also be downloaded with:

wget https://t.ly/EOxxX -O install-incus-agent.sh

The problem is that the incus-agent service fails to start (the incus-agent-9p service is running). When I run the command /run/incus_config/9p/incus-agent manually, I get the error message:

bash: /run/incus_agent/incus-agent: Permission denied
# or
/run/incus_config/9p/incus-agent: 30:  exec: /run/incus_agent/incus-agent: Permission denied

This is a different issue (I guess) from what is being discussed here, but maybe they are related. Anyway, I am not a systemd expert and need some help on fixing this.
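One thing that may be worth ruling out for a Permission denied on exec is a noexec mount option on the filesystem holding /run/incus_agent. This is only a guess at a possible cause, not a confirmed diagnosis for this case. The helper name below is made up, and the example assumes findmnt from util-linux is present in the VM.

```shell
#!/bin/sh
# opts_have_noexec "rw,nosuid,nodev,noexec,relatime"
# Succeeds if the comma-separated mount option string contains noexec.
opts_have_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example, inside the VM (findmnt availability is an assumption):
#   opts=$(findmnt -n -o OPTIONS --target /run/incus_agent)
#   opts_have_noexec "$opts" && echo "filesystem is mounted noexec"
#   ls -l /run/incus_agent/incus-agent   # also check the file mode
```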

stgraber · March 12, 2024, 5:19pm

The proper way to install the agent in a VM that doesn’t have it is:

mount -t 9p config /mnt
cd /mnt
./install.sh
reboot

stgraber · March 12, 2024, 6:45pm

Doh, typo in the instructions. I’ve updated the post now; the drive you want is config, not agent.

dashohoxha · March 12, 2024, 6:49pm

Thanks a lot! It worked!

There was an error like this, but it worked:

./install.sh: 20: semanage: not found

stgraber · March 12, 2024, 6:57pm

Ah, interesting, you’re on a system with getenforce but no semanage, I’ll send a tweak to silence it when that happens.
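The kind of guard that would avoid this error can be sketched as below. This is not the actual install.sh code, which is not shown in this thread; the helper names are hypothetical and the sketch only illustrates checking for both tools before doing any SELinux handling.

```shell
#!/bin/sh
# Succeed if the named command can be found by the shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeed only when both SELinux tools are present, so a system that
# has getenforce but lacks semanage skips the SELinux step quietly.
selinux_tools_available() {
    have_cmd getenforce && have_cmd semanage
}

# Example:
#   if selinux_tools_available; then
#       : # SELinux-specific setup would go here
#   fi
```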

Brendan_Jackman · April 11, 2025, 8:35am

I still have a similar issue with Bookworm (image fingerprint d53e971cc5d7, uploaded 2025/04/11). I’m running a Debian Testing host with Incus version 6.0.3. I’m a bit stumped as to how to diagnose the issue: where is the guest’s disk? incus storage list only shows the default volume, which I think is the config item with the install.sh on it. (I am using the default incus admin init, so IIUC it would be a BTRFS volume, but I’m totally ignorant about this topic.)

I also tried starting a NixOS guest, the Incus agent doesn’t start there either. In that case there’s a passwordless root login available on the console and this shows up:

[    5.325962] (us-agent)[428]: incus-agent.service: Failed at step EXEC spawning /run/incus_agent/incus-agent: No such file or directory

Could that be a similar problem? (install.sh doesn’t work, but dunno if it would be expected to work on NixOS anyway:

[root@nixos:~]# mount -t 9p config /mnt
[root@nixos:~]# cd /mnt
[root@nixos:/mnt]# ./install.sh 
This script must be run from within the 9p mount

)

I also tried fedora/11, but that doesn’t boot; something goes wrong in PXE. I can send a report for that separately if that’s helpful.