Incus agent fails to start

incus launch images:debian/11 --vm
incus console keen-mongoose
Booting `Debian GNU/Linux'

Loading Linux 5.10.0-27-amd64 ...
Loading initial ramdisk ...
[FAILED] Failed to start Incus - agent.

Is there a way to see why the agent fails?

I tested a random Ubuntu image, which seems to work:

incus launch images:ubuntu/jammy/cloud ubuntu-jammy-vm --vm

Hmm, that exact same VM works fine here, what’s your host?

It’s possible to do some debugging by stopping the VM, mapping its block device, mounting the second partition from it, setting a root password, unmounting, and starting the VM again, at which point you can log in locally.

We used incus exec NAME -- sh -c 'systemctl isolate multi-user.target' to check whether the VM is ready, which seems to be the issue.
But it's good to know that I could set a password.

To check if a VM is ready, you can also poll systemctl is-system-running until it reports running:

$ incus start vm1
$ incus exec vm1 -- systemctl is-system-running
starting
$ incus exec vm1 -- systemctl is-system-running
running
$ incus exec vm1 -- systemctl is-system-running
running
$ 
incus exec "$CONTAINER_ID" -- sh -c "systemctl is-system-running | grep -qE 'running|degraded'"

That is almost what I did, but we added degraded as an OK state, since a surprising number of images ship non-essential services that get started and fail. :sweat_smile:
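For scripting, the two checks above can be wrapped in a polling loop with a timeout. This is only a minimal sketch: the function name wait_vm_ready and the default timeout and interval are my own choices, not anything Incus provides. It treats a failing incus exec (agent not up yet) the same as "not ready".

```shell
#!/bin/sh
# Poll a VM until systemd reports "running" or "degraded", or give up.
# usage: wait_vm_ready NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# (hypothetical helper; name and defaults are illustrative)
wait_vm_ready() (
    name=$1
    timeout=${2:-120}
    interval=${3:-2}
    elapsed=0
    state=
    while [ "$elapsed" -lt "$timeout" ]; do
        # incus exec fails while the agent is unreachable, and
        # is-system-running exits non-zero for anything but "running",
        # so ignore the exit status and look at the printed state only.
        state=$(incus exec "$name" -- systemctl is-system-running 2>/dev/null) || true
        case "$state" in
            running|degraded) return 0 ;;
        esac
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "timed out waiting for $name (last state: ${state:-unknown})" >&2
    return 1
)
```

Something like wait_vm_ready vm1 300 && incus exec vm1 -- bash would then only open the shell once the VM has finished booting. Note that if the agent never starts (as in this thread), the loop simply times out.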

I have the same issue here - incus launch --vm --console images:debian/bookworm/cloud shows [FAILED] Failed to start incus-agent.service - Incus - agent. on the console. Ubuntu noble host, incus 0.6-1 from noble, non-VMs work fine.

I’d be happy to try to map the block device of the VM as you suggest, but I can’t even find it (perhaps because my storage pool is on ZFS?). Is there any documentation of doing this sort of thing? At the moment I’m a bit stuck with no obvious debugging options.

Got your environment reproduced thanks to nested virtualization :slight_smile:
Managed to reproduce the same agent startup failure too.

The main trick to debug this is:

root@noble:~# incus stop -f foo
root@noble:~# zfs set volmode=full default/virtual-machines/foo.block
root@noble:~# mount /dev/zvol/default/virtual-machines/foo.block-part2 /mnt/
root@noble:~# chroot /mnt passwd root
New password: 
Retype new password: 
passwd: password updated successfully
root@noble:~# umount /mnt
root@noble:~# incus start foo

At which point you can do a normal console login and see what’s going on.
In this case the error shows /run/incus_agent/incus-agent no such file or directory.

Mounting the config drive directly with mount -t 9p config /mnt inside the VM sure enough shows no incus-agent in there.
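If you want to script that check inside the VM, something like the following could work. It is a sketch under assumptions: the helper name check_config_drive is made up, and the only file names it looks for are the two mentioned in this thread (incus-agent and install.sh).

```shell
#!/bin/sh
# Report whether the files this thread relies on are present in a
# mounted config drive.
# usage: check_config_drive DIR
# (hypothetical helper; checks only incus-agent and install.sh)
check_config_drive() (
    dir=$1
    missing=0
    for f in incus-agent install.sh; do
        if [ -f "$dir/$f" ]; then
            echo "present: $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
)

# Example, run as root inside the VM:
#   mount -t 9p config /mnt
#   check_config_drive /mnt
```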

The incus log also confirms it:

time="2024-03-01T13:35:58Z" level=warning msg="incus-agent not found, skipping its inclusion in the VM config drive" err="<nil>" instance=foo instanceType=virtual-machine project=default

That means that the incusd process couldn’t find incus-agent in its PATH and INCUS_AGENT_PATH also isn’t set to a directory containing all the agent builds.
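To find out which instances were started without an agent, the daemon log can be searched for that warning. A rough sketch, assuming the log lines look like the one quoted above; the helper name agent_skipped_instances is made up for illustration.

```shell
#!/bin/sh
# Print the names of instances for which incusd logged that it could
# not find incus-agent (one name per line, de-duplicated).
# usage: agent_skipped_instances LOGFILE
# (hypothetical helper; assumes the key=value log format shown above)
agent_skipped_instances() {
    grep -F 'incus-agent not found' "$1" |
        sed -n 's/.* instance=\([^ ]*\).*/\1/p' |
        sort -u
}

# Example, using the log path from the Debian unit file:
#   agent_skipped_instances /var/log/incus/incus.log
```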

In the Debian package, incus-agent is located in /usr/libexec/incus/incus-agent.
The PATH variable for the incusd binary is PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin.

So that explains that.
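The same lookup can be reproduced by hand. Below is a small sketch that walks a PATH-style string and reports where (if anywhere) an executable would be found; find_in_path is a made-up helper name, and it only approximates what a real PATH search does (empty PATH entries are not treated as the current directory).

```shell
#!/bin/sh
# Search a colon-separated PATH string for an executable file.
# Prints the first match and returns 0, or returns 1 if none is found.
# usage: find_in_path NAME PATH_STRING
# (hypothetical helper, runs in a subshell so IFS changes do not leak)
find_in_path() (
    name=$1
    set -f          # no globbing while splitting the PATH string
    IFS=:
    for dir in $2; do
        if [ -n "$dir" ] && [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
)

# Example, as root on the host: read the PATH the daemon was started with
# and check whether incus-agent is reachable through it.
#   daemon_path=$(tr '\0' '\n' < "/proc/$(pidof incusd)/environ" | sed -n 's/^PATH=//p')
#   find_in_path incus-agent "$daemon_path" || echo "incus-agent not on incusd's PATH"
```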

What’s odd though is:

root@noble:~# systemctl cat incus
# /usr/lib/systemd/system/incus.service
[Unit]
Description=Incus - Main daemon
After=network-online.target openvswitch-switch.service lxcfs.service incus.socket
Requires=network-online.target lxcfs.service incus.socket
Documentation=man:incusd(1)

[Service]
EnvironmentFile=-/etc/environment
Environment=PATH=/usr/libexec/incus:/usr/sbin:/usr/bin:/sbin:/bin
ExecStartPre=/usr/libexec/incus/incus-apparmor-load
ExecStartPre=/bin/mkdir -p /var/log/incus/
ExecStartPre=/bin/chown -R root:incus-admin /var/log/incus/
ExecStart=/usr/libexec/incus/incusd --group incus-admin --logfile=/var/log/incus/incus.log
ExecStartPost=/usr/bin/incus admin waitready --timeout=600
KillMode=process
TimeoutStartSec=600s
TimeoutStopSec=30s
Restart=on-failure
LimitNOFILE=1048576
LimitNPROC=infinity
TasksMax=infinity

[Install]
Also=incus-startup.service incus.socket

The unit certainly does have the needed PATH entry (/usr/libexec/incus).

A dirty workaround until someone figures out what’s up with systemd:

root@noble:~# ln -s /usr/libexec/incus/incus-agent /usr/sbin/
root@noble:~# incus restart foo
root@noble:~# incus exec foo bash
root@distrobuilder-34fb80b9-0939-40ea-ae13-2455f4e611cf:~# 
exit
root@noble:~# 

From systemd.exec(5) under EnvironmentFile=: “Settings from these files override settings made with Environment=.” And /etc/environment sets PATH.
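Given that rule, a quick way to spot this class of problem is to check whether the environment file assigns the variable that the unit also sets with Environment=. A minimal sketch; the helper name env_file_overrides is hypothetical and it only matches plain VAR=value lines.

```shell
#!/bin/sh
# Print the lines in an environment file that assign VAR.
# Returns 0 if at least one such line exists, 1 otherwise.
# usage: env_file_overrides VAR FILE
# (hypothetical helper; matches simple "VAR=..." assignments only)
env_file_overrides() {
    grep -E "^[[:space:]]*$1=" "$2"
}

# Example on the host:
#   if env_file_overrides PATH /etc/environment; then
#       echo "Environment=PATH=... in the unit will be overridden"
#   fi
```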

Ah, then that’s what’s going on…

That’s a bit annoying. They don’t happen to have a syntax to set environment variables AFTER sourcing the file?

I was just trying to figure that out. Maybe the only option is to change ExecStart to something involving /usr/bin/env?

Yeah, in my own packages, I actually use an intermediary script for that which does:

#!/bin/sh
set -e

export INCUS_DOCUMENTATION=/opt/incus/doc/
export INCUS_LXC_HOOK=/opt/incus/share/lxc/hooks/
export INCUS_LXC_TEMPLATE_CONFIG=/opt/incus/share/lxc/config/
export INCUS_AGENT_PATH=/opt/incus/agent/
export INCUS_OVMF_PATH=/opt/incus/share/qemu/
export INCUS_UI=/opt/incus/ui/
export LD_LIBRARY_PATH=/opt/incus/lib/
export PATH=/opt/incus/bin/:${PATH}

exec incusd "$@"

Which then turns the systemd unit into:

stgraber@dakara:~$ systemctl cat incus
# /etc/systemd/system/incus.service
[Unit]
Description=Incus - Daemon
After=network-online.target openvswitch-switch.service incus-lxcfs.service incus.socket
Requires=network-online.target incus-lxcfs.service incus.socket

[Service]
EnvironmentFile=-/etc/environment
EnvironmentFile=-/etc/default/incus
ExecStart=/opt/incus/lib/systemd/incusd --group incus-admin $INCUS_OPTS --logfile /var/log/incus/incusd.log
ExecStartPost=/opt/incus/lib/systemd/incusd waitready --timeout=600
KillMode=process
TimeoutStartSec=600s
TimeoutStopSec=30s
Restart=on-failure
Delegate=yes
LimitNOFILE=1048576
LimitNPROC=infinity
TasksMax=infinity

[Install]
Also=incus-startup.service incus.socket

Yeah, I was just thinking along the same lines. As long as you can inject the environment using EnvironmentFile= rather than Environment=, then it all works fine.
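As a stopgap on an affected host, the same idea could be applied with a drop-in instead of editing the packaged unit. This is an untested sketch, not what the Debian package does: the file names /etc/default/incus-path and 10-agent-path.conf are made up, and it relies on environment files being read in the order they are listed, so that a later file wins over /etc/environment. The PATH is written as a literal value rather than relying on ${PATH} expansion.

```shell
#!/bin/sh
# Write an extra environment file plus a systemd drop-in that loads it
# after the unit's own EnvironmentFile= line.
# usage: write_incus_path_override [ROOT]
# ROOT is an optional prefix (useful for trying this in a scratch directory).
# (hypothetical helper and hypothetical file names)
write_incus_path_override() (
    root=${1:-}
    dropin_dir=$root/etc/systemd/system/incus.service.d
    mkdir -p "$root/etc/default" "$dropin_dir"
    # Same PATH value as the Environment= line in the Debian unit.
    cat > "$root/etc/default/incus-path" <<'EOF'
PATH=/usr/libexec/incus:/usr/sbin:/usr/bin:/sbin:/bin
EOF
    cat > "$dropin_dir/10-agent-path.conf" <<'EOF'
[Service]
EnvironmentFile=-/etc/default/incus-path
EOF
)

# Example, as root on the host:
#   write_incus_path_override
#   systemctl daemon-reload
#   systemctl restart incus
```

Affected VMs would still need a restart afterwards so that the config drive gets regenerated with the agent in it.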

In that case I’ll take this to the Debian maintainers, since it seems that the problem is in their packaging.

https://bugs.debian.org/1065174

I am installing a VM from a Debian 12 ISO (actually it is Debian Edu 12).
After this, I am trying to install incus-agent inside the VM.
I am using the script from this snippet:

which is based on these instructions (renaming lxd to incus):

The script: install-incus-agent-in-debian.sh
#!/bin/bash -x

# See: https://discuss.linuxcontainers.org/t/install-lxd-agent-manually-on-custom-os/11826

cat > /lib/systemd/system/incus-agent-9p.service << EOF
[Unit]
Description=Incus - agent - 9p mount
Documentation=https://linuxcontainers.org/incus
ConditionPathExists=/dev/virtio-ports/org.linuxcontainers.incus
After=local-fs.target
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=-/sbin/modprobe 9pnet_virtio
ExecStartPre=/bin/mkdir -p /run/incus_config/9p
ExecStartPre=/bin/chmod 0700 /run/incus_config/
ExecStart=/bin/mount -t 9p config /run/incus_config/9p -o access=0,trans=virtio

[Install]
WantedBy=multi-user.target
EOF

cat > /lib/systemd/system/incus-agent.service << EOF
[Unit]
Description=Incus - agent
Documentation=https://linuxcontainers.org/incus
ConditionPathExists=/dev/virtio-ports/org.linuxcontainers.incus
Requires=incus-agent-9p.service
After=incus-agent-9p.service
Before=cloud-init.target cloud-init.service cloud-init-local.service
DefaultDependencies=no

[Service]
Type=simple
WorkingDirectory=/run/incus_config/9p
ExecStart=/run/incus_config/9p/incus-agent
Restart=on-failure
RestartSec=5s
StartLimitInterval=60
StartLimitBurst=10

[Install]
WantedBy=multi-user.target
EOF

systemctl enable incus-agent-9p.service
systemctl enable incus-agent.service

systemctl start incus-agent-9p.service
systemctl start incus-agent.service

For convenience, the script can also be downloaded with:

wget https://t.ly/EOxxX -O install-incus-agent.sh

The problem is that the incus-agent service fails to start (the incus-agent-9p service is running). When I run /run/incus_config/9p/incus-agent manually, I get this error message:

bash: /run/incus_agent/incus-agent: Permission denied
# or
/run/incus_config/9p/incus-agent: 30:  exec: /run/incus_agent/incus-agent: Permission denied

This is a different issue (I guess) from what is being discussed here, but maybe they are related. Anyway, I am not a systemd expert and need some help on fixing this.

The proper way to install the agent in a VM that doesn’t have it is:

mount -t 9p config /mnt
cd /mnt
./install.sh
reboot

Doh, typo in the instructions. I’ve updated the post now; the drive you want is config, not agent.

Thanks a lot! It worked!

There was an error like this, but it worked:

./install.sh: 20: semanage: not found

Ah, interesting, you’re on a system with getenforce but no semanage, I’ll send a tweak to silence it when that happens.