Instance Startup Hangs Silently

Hello nice people of the forums,

I have a single-server incus deployment running on debian-testing, hosting around 64 containers and 10 virtual machines.
I have noticed that the incus daemon sometimes enters a broken state where containers fail to start, and attempts to start a container fail silently.

I used the container mding-rocky8-test as an example for the notes below. The issue is reliably reproducible with all containers.

The issue:

  • Instances start up just fine when the server has recently been rebooted
  • After some time, when I attempt to restart any container, the restart command hangs forever and the container gets stuck in the “Stopped” state
  • Further attempts to start the container also hang indefinitely
  • There are no logs I can find that suggest any errors
  • This happens to every container I attempt to restart; once the symptoms appear, they do not go away until a reboot

If I reboot the server, instance startup works again for some time, until it breaks after some unknown event. Symptoms do not show up until I attempt to restart a container, which is not often.
So far this has happened 3 times over the course of a few weeks.

Container setup:

  • Instances use the ZFS storage backend and have their root filesystem on the local ZFS pool (the pool consists of 3 vdevs of mirrored U.2 SSDs)
  • The host mounts some NFS shares from a file server, and each instance also mounts those folders
  • Containers are snapshotted daily, and the snapshots are sent to another ZFS server with syncoid, so some extra ZFS snapshots do get created (a rough sketch of the replication step is below)
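
The replication step is along these lines (dataset names and the target host are illustrative, not the exact command I run):

# Illustrative syncoid invocation; real dataset and host names differ
syncoid --recursive mainzfs/containers root@backup-host:backuppool/containers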

Here is the config of the example container, though I am not certain the config is the source of the problem, since all containers fail.

$ incus config show mding-rocky8-test --expanded
architecture: x86_64
config:
  cloud-init.user-data: |
    # Some cloud init scripts
  image.architecture: amd64
  image.description: Rockylinux 8 amd64 (20240822_02:06)
  image.os: Rockylinux
  image.release: "8"
  image.requirements.cdrom_agent: "true"
  image.serial: "20240822_02:06"
  image.type: squashfs
  image.variant: cloud
  raw.idmap: |
    uid 1012 1012
    gid 500 500
  volatile.base_image: d98a289af8fa46a231f02a870d18e4f8788c4fd434be1cd900a7daf54549c9a0
  volatile.cloud-init.instance-id: 1d9961f0-3f6a-49f6-8e0e-01cc0ce17af0
  volatile.eth0.hwaddr: 00:16:3e:3a:8a:c6
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1012},{"Isuid":true,"Isgid":false,"Hostid":1012,"Nsid":1012,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":1001013,"Nsid":1013,"Maprange":999998987},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":500},{"Isuid":false,"Isgid":true,"Hostid":500,"Nsid":500,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":1000501,"Nsid":501,"Maprange":999999499}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1012},{"Isuid":true,"Isgid":false,"Hostid":1012,"Nsid":1012,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":1001013,"Nsid":1013,"Maprange":999998987},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":500},{"Isuid":false,"Isgid":true,"Hostid":500,"Nsid":500,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":1000501,"Nsid":501,"Maprange":999999499}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
  volatile.last_state.ready: "false"
  volatile.uuid: 0a3c4797-e51b-4cb5-8899-d31bfb03c850
  volatile.uuid.generation: 0a3c4797-e51b-4cb5-8899-d31bfb03c850
devices:
  eth0:
    name: eth0
    network: macvlan1g
    type: nic
  nfs_folder_1:
    path: /work/nfs_folder_1
    source: /mnt/nfs_folder_1
    type: disk
  nfs_folder_2:
    path: /work/nfs_folder_2
    source: /mnt/nfs_folder_2
    type: disk
  root:
    path: /
    pool: mainzfs
    type: disk
ephemeral: false
profiles:
- default
- rocky-cloud
- uid-mding
stateful: false
description: ""

Network Config for the macvlan1g interface:

incus network show macvlan1g
config:
  parent: enp33s0f0np0
description: ""
name: macvlan1g
type: macvlan
used_by:
  - many containers and VMs, ~64
managed: true
status: Created
locations:
- none

Incus and lxc versions:

Packages are installed from apt; the list below is from apt list --installed.

liblxc-common/now 1:5.0.3-2+b1 amd64 [installed,upgradable to: 1:6.0.1-1]
liblxc1/now 1:5.0.3-2+b1 amd64 [installed,local]
lxcfs/testing,now 6.0.1-1 amd64 [installed,automatic]
incus-agent/testing,now 6.0.1-3 amd64 [installed,automatic]
incus-client/now 0.6-1 amd64 [installed,upgradable to: 6.0.1-3]
incus/now 0.6-1 amd64 [installed,upgradable to: 6.0.1-3]

Machine Info:

OS: Debian GNU/Linux trixie/sid x86_64 
Host: AS -2025HS-TNR 0123456789 
Kernel: 6.7.12-amd64 
Uptime: 7 days, 13 hours, 48 mins 
Packages: 1011 (dpkg) 
Shell: bash 5.2.32 
Resolution: 1024x768 
Terminal: /dev/pts/5 
CPU: AMD EPYC 9654 (384) @ 2.400GHz 
GPU: 6a:00.0 ASPEED Technology, Inc. ASPEED Graphics Family 
Memory: 230584MiB / 1547673MiB 

What I have tried:
Following the incus documentation pages, I have tried to get the logs for the container in question, but there is nothing:

$ incus info --show-log mding-rocky8-test 
Name: mding-rocky8-test
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2024/08/22 13:54 EDT
Last Used: 2024/08/22 13:54 EDT

Log:

I started the incus daemon with the -v flag; these are the outputs in the incusd logs:

time="2024-08-29T09:55:08-04:00" level=info msg="Restarting instance" action=shutdown created="2024-08-22 17:54:26.172343342 +0000 UTC" ephemeral=false instance=mding-rocky8-test instanceType=container project=default timeout=10m0s used="2024-08-22 17:54:53.618093707 +0000 UTC"
time="2024-08-29T10:12:50-04:00" level=info msg="Updating images"
time="2024-08-29T10:12:50-04:00" level=info msg="Pruning expired backups"
time="2024-08-29T10:12:50-04:00" level=info msg="Done pruning expired backups"
time="2024-08-29T10:12:50-04:00" level=info msg="Done updating images"
time="2024-08-29T10:13:17-04:00" level=info msg="Starting instance" action=start created="2024-08-22 17:54:26.172343342 +0000 UTC" ephemeral=false instance=mding-rocky8-test instanceType=container project=default stateful=false used="2024-08-22 17:54:53.618093707 +0000 UTC"

This shows my attempt to restart the container (which eventually timed out) and my later attempts to start that same container (which also hang forever / time out).

dmesg output since the restart attempt:

[Thu Aug 29 09:55:06 2024] physsOl4i3: renamed from eth0
[Thu Aug 29 09:55:06 2024] mac7c27d770: renamed from physsOl4i3
[Thu Aug 29 10:00:37 2024] veth9f9886e: renamed from eth0
[Thu Aug 29 10:00:38 2024] br-072ced171cf1: port 1(vethaadb323) entered disabled state
[Thu Aug 29 10:00:38 2024] br-072ced171cf1: port 1(vethaadb323) entered disabled state
[Thu Aug 29 10:00:38 2024] vethaadb323 (unregistering): left allmulticast mode
[Thu Aug 29 10:00:38 2024] vethaadb323 (unregistering): left promiscuous mode
[Thu Aug 29 10:00:38 2024] br-072ced171cf1: port 1(vethaadb323) entered disabled state
[Thu Aug 29 10:01:00 2024] br-fb7ed19e0779: port 1(vetha1804d3) entered blocking state
[Thu Aug 29 10:01:00 2024] br-fb7ed19e0779: port 1(vetha1804d3) entered disabled state
[Thu Aug 29 10:01:00 2024] vetha1804d3: entered allmulticast mode
[Thu Aug 29 10:01:00 2024] vetha1804d3: entered promiscuous mode
[Thu Aug 29 10:01:00 2024] br-fb7ed19e0779: port 1(vetha1804d3) entered blocking state
[Thu Aug 29 10:01:00 2024] br-fb7ed19e0779: port 1(vetha1804d3) entered forwarding state
[Thu Aug 29 10:01:00 2024] br-fb7ed19e0779: port 1(vetha1804d3) entered disabled state
[Thu Aug 29 10:01:01 2024] eth0: renamed from veth554bdd5
[Thu Aug 29 10:01:01 2024] br-fb7ed19e0779: port 1(vetha1804d3) entered blocking state
[Thu Aug 29 10:01:01 2024] br-fb7ed19e0779: port 1(vetha1804d3) entered forwarding state

Attempting to start my broken container with the --debug flag yields the output below. This reads like the issue is not with the incus client, but with the daemon side eating the command and getting stuck.
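
The command was simply a start with the client's global --debug flag, i.e. roughly:

$ incus start mding-rocky8-test --debug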

There is more output above this point that got clipped off by my terminal; piping the output into a file breaks the formatting.
...

DEBUG  [2024-08-29T10:41:26-04:00] Sending request to Incus                      etag= method=GET url="http://unix.socket/1.0/instances/mding-rocky8-test"
DEBUG  [2024-08-29T10:41:26-04:00] Got response struct from Incus               
DEBUG  [2024-08-29T10:41:26-04:00] 
	{
		# Just incus config output, removed for length
	} 
DEBUG  [2024-08-29T10:41:26-04:00] Connected to the websocket: ws://unix.socket/1.0/events 
DEBUG  [2024-08-29T10:41:26-04:00] Sending request to Incus                      etag= method=PUT url="http://unix.socket/1.0/instances/mding-rocky8-test/state"
DEBUG  [2024-08-29T10:41:26-04:00] 
	{
		"action": "start",
		"timeout": 0,
		"force": false,
		"stateful": false
	} 
DEBUG  [2024-08-29T10:41:26-04:00] Got operation from Incus                     
DEBUG  [2024-08-29T10:41:26-04:00] 
	{
		"id": "337ede79-25a9-4ec8-bef4-a24b1edda3d1",
		"class": "task",
		"description": "Starting instance",
		"created_at": "2024-08-29T10:41:26.317638517-04:00",
		"updated_at": "2024-08-29T10:41:26.317638517-04:00",
		"status": "Running",
		"status_code": 103,
		"resources": {
			"instances": [
				"/1.0/instances/mding-rocky8-test"
			]
		},
		"metadata": null,
		"may_cancel": false,
		"err": "",
		"location": "none"
	} 
DEBUG  [2024-08-29T10:41:26-04:00] Sending request to Incus                      etag= method=GET url="http://unix.socket/1.0/operations/337ede79-25a9-4ec8-bef4-a24b1edda3d1"
DEBUG  [2024-08-29T10:41:26-04:00] Got response struct from Incus               
DEBUG  [2024-08-29T10:41:26-04:00] 
	{
		"id": "337ede79-25a9-4ec8-bef4-a24b1edda3d1",
		"class": "task",
		"description": "Starting instance",
		"created_at": "2024-08-29T10:41:26.317638517-04:00",
		"updated_at": "2024-08-29T10:41:26.317638517-04:00",
		"status": "Running",
		"status_code": 103,
		"resources": {
			"instances": [
				"/1.0/instances/mding-rocky8-test"
			]
		},
		"metadata": null,
		"may_cancel": false,
		"err": "",
		"location": "none"
	} 

	
No more output appears after this; it hangs forever.
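
Side note: while the client hangs, the stuck operation should also be visible through the operations API from another shell. I believe something like this, using the operation UUID from the debug output above:

$ incus operation list
$ incus operation show 337ede79-25a9-4ec8-bef4-a24b1edda3d1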

I have also tried restarting the incus daemon; it typically takes a long time (it times out when trying to stop).
After restarting incusd, all the VMs show up in the ERROR state (although they are still running), and instance startup still does not work.

Logs from journalctl -u incus when attempting to restart incus:

Aug 28 17:10:46 lxc01 systemd[1]: Stopping incus.service - Incus - Main daemon...
Aug 28 17:11:16 lxc01 systemd[1]: incus.service: State 'stop-sigterm' timed out. Killing.
Aug 28 17:11:16 lxc01 systemd[1]: incus.service: Killing process 601244 (incusd) with signal SIGKILL.
Aug 28 17:11:47 lxc01 systemd[1]: incus.service: Processes still around after SIGKILL. Ignoring.
Aug 28 17:12:17 lxc01 systemd[1]: incus.service: State 'final-sigterm' timed out. Killing.
Aug 28 17:12:17 lxc01 systemd[1]: incus.service: Killing process 601244 (incusd) with signal SIGKILL.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Processes still around after final SIGKILL. Entering failed mode.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Failed with result 'timeout'.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10046 (virtfs-proxy-he) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10051 (virtfs-proxy-he) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10055 (virtfs-proxy-he) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10060 (virtfs-proxy-he) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10207 (qemu-system-x86) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10480 (virtfs-proxy-he) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10484 (virtfs-proxy-he) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10486 (virtfs-proxy-he) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10488 (virtfs-proxy-he) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10510 (qemu-system-x86) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10659 (qemu-system-x86) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 10930 (qemu-system-x86) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 11336 (qemu-system-x86) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 11725 (qemu-system-x86) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 11927 (qemu-system-x86) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 1360171 (incusd) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 1372136 (incusd) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 753218 (incusd) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 3753555 (qemu-system-x86) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 949012 (incusd) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 5017 (incusd) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Unit process 601244 (incusd) remains running after unit stopped.
Aug 28 17:12:47 lxc01 systemd[1]: Stopped incus.service - Incus - Main daemon.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Consumed 23h 14min 19.645s CPU time, 14.9G memory peak.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10046 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10051 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10055 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10060 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10207 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10480 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10484 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10486 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10488 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10510 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10659 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10930 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 11336 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 11725 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 11927 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 1360171 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 1372136 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 753218 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 3753555 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 949012 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 5017 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 601244 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: Starting incus.service - Incus - Main daemon...
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10046 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10051 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10055 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10060 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10207 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10480 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10484 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10486 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10488 (virtfs-proxy-he) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10510 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10659 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 10930 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 11336 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 11725 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 11927 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 1360171 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 1372136 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 753218 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 3753555 (qemu-system-x86) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 949012 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 5017 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: Found left-over process 601244 (incusd) in control group while starting unit. Ignoring.
Aug 28 17:12:47 lxc01 systemd[1]: incus.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="Starting up" mode=normal path=/var/lib/incus version=0.6
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="System idmap (root user):"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1000 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1001 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1002 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1003 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1004 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1005 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1006 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1007 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1008 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1009 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1010 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1011 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1012 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1013 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1014 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1015 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1016 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1017 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1018 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1019 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1020 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1021 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1022 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1023 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1024 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1025 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1026 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1027 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1028 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1029 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1030 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1031 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1032 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1033 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1034 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1035 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1036 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1037 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1038 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1039 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1040 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1041 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1042 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1043 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1044 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1045 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1046 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1047 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1048 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1049 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1050 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1051 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1052 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1053 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1054 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1055 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1056 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1057 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1058 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1059 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1060 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 3000 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 5000 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 5001 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1000000 1000000000"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - g 0 500 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - g 0 1000 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - g 0 3000 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - g 0 5001 1"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - g 0 1000000 1000000000"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="Selected idmap:"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - u 0 1000000 1000000000"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - g 0 1000000 1000000000"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="Kernel features:"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - closing multiple file descriptors efficiently: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - netnsid-based network retrieval: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - pidfds: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - core scheduling: no"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - uevent injection: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - seccomp listener: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - seccomp listener continue syscalls: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - seccomp listener add file descriptors: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - attach to namespaces via pidfds: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - safe native terminal allocation: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - unprivileged binfmt_misc: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - unprivileged file capabilities: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - cgroup layout: cgroup2"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg=" - idmapped mounts kernel support: yes"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="Instance type operational" driver=lxc features="map[]" type=container
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="Instance type operational" driver=qemu features="map[cpu_hotplug:{} io_uring:{} vhost_net:{}]" type=virtual-machine
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="Initializing local database"
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="Set client certificate to server certificate" fingerprint=b570b8fc3578b669654ba86598a6095d133ebc9c18d5b4a14d745489a507a320
Aug 28 17:12:47 lxc01 incusd[861872]: time="2024-08-28T17:12:47-04:00" level=info msg="Starting database node" id=1 local=1 role=voter
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Loading daemon configuration"
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Binding socket" inherited=true socket=/var/lib/incus/unix.socket type="REST API Unix socket"
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Binding socket" socket=/var/lib/incus/guestapi/sock type="devIncus socket"
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Binding socket" socket="host(2):2936" type="VM socket"
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Initializing global database"
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Connecting to global database"
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Connected to global database"
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Initialized global database"
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Firewall loaded driver" driver=nftables
Aug 28 17:12:48 lxc01 incusd[861872]: time="2024-08-28T17:12:48-04:00" level=info msg="Initializing storage pools"
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Initialized storage pool" pool=mainzfs
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="All storage pools initialized"
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Initializing daemon storage mounts"
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Initializing networks"
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Initialized network" name=net_auth project=default
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Initialized network" name=macvlan1g project=default
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="All networks initialized"
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Cleaning up leftover image files"
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Done cleaning up leftover image files"
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Starting device monitor"
Aug 28 17:12:49 lxc01 incusd[861872]: time="2024-08-28T17:12:49-04:00" level=info msg="Initialized filesystem monitor" driver=fanotify path=/dev
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Started seccomp handler" path=/run/incus/seccomp.socket
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Pruning expired images"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Done pruning expired images"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Pruning expired backups"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Done pruning expired backups"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Expiring log files"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=warning msg="Failed auto start instance attempt" attempt=1 err="The instance cannot be started as in Error status" instance=freeipa-1 maxAttempts=3 project=default
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Updating images"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Updating instance types"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Done expiring log files"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Pruning resolved warnings"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Done pruning resolved warnings"
Aug 28 17:12:50 lxc01 incusd[861872]: time="2024-08-28T17:12:50-04:00" level=info msg="Done updating images"
Aug 28 17:12:51 lxc01 incusd[861872]: time="2024-08-28T17:12:51-04:00" level=info msg="Done updating instance types"
Aug 28 17:12:55 lxc01 incusd[861872]: time="2024-08-28T17:12:55-04:00" level=warning msg="Failed auto start instance attempt" attempt=2 err="The instance cannot be started as in Error status" instance=freeipa-1 maxAttempts=3 project=default
Aug 28 17:13:00 lxc01 incusd[861872]: time="2024-08-28T17:13:00-04:00" level=warning msg="Failed auto start instance attempt" attempt=3 err="The instance cannot be started as in Error status" instance=freeipa-1 maxAttempts=3 project=default
Aug 28 17:13:00 lxc01 incusd[861872]: time="2024-08-28T17:13:00-04:00" level=error msg="Failed to auto start instance" err="The instance cannot be started as in Error status" instance=freeipa-1 project=default
Aug 28 17:13:00 lxc01 incusd[861872]: time="2024-08-28T17:13:00-04:00" level=warning msg="Failed auto start instance attempt" attempt=1 err="The instance cannot be started as in Error status" instance=k8s-deb-1 maxAttempts=3 project=default
Aug 28 17:13:05 lxc01 incusd[861872]: time="2024-08-28T17:13:05-04:00" level=warning msg="Failed auto start instance attempt" attempt=2 err="The instance cannot be started as in Error status" instance=k8s-deb-1 maxAttempts=3 project=default
Aug 28 17:13:10 lxc01 incusd[861872]: time="2024-08-28T17:13:10-04:00" level=warning msg="Failed auto start instance attempt" attempt=3 err="The instance cannot be started as in Error status" instance=k8s-deb-1 maxAttempts=3 project=default
Aug 28 17:13:10 lxc01 incusd[861872]: time="2024-08-28T17:13:10-04:00" level=error msg="Failed to auto start instance" err="The instance cannot be started as in Error status" instance=k8s-deb-1 project=default
Aug 28 17:13:10 lxc01 incusd[861872]: time="2024-08-28T17:13:10-04:00" level=warning msg="Failed auto start instance attempt" attempt=1 err="The instance cannot be started as in Error status" instance=k8s-deb-2 maxAttempts=3 project=default
Aug 28 17:13:15 lxc01 incusd[861872]: time="2024-08-28T17:13:15-04:00" level=warning msg="Failed auto start instance attempt" attempt=2 err="The instance cannot be started as in Error status" instance=k8s-deb-2 maxAttempts=3 project=default
Aug 28 17:13:20 lxc01 incusd[861872]: time="2024-08-28T17:13:20-04:00" level=warning msg="Failed auto start instance attempt" attempt=3 err="The instance cannot be started as in Error status" instance=k8s-deb-2 maxAttempts=3 project=default
Aug 28 17:13:20 lxc01 incusd[861872]: time="2024-08-28T17:13:20-04:00" level=error msg="Failed to auto start instance" err="The instance cannot be started as in Error status" instance=k8s-deb-2 project=default
Aug 28 17:13:20 lxc01 incusd[861872]: time="2024-08-28T17:13:20-04:00" level=warning msg="Failed auto start instance attempt" attempt=1 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-1 maxAttempts=3 project=default
Aug 28 17:13:25 lxc01 incusd[861872]: time="2024-08-28T17:13:25-04:00" level=warning msg="Failed auto start instance attempt" attempt=2 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-1 maxAttempts=3 project=default
Aug 28 17:13:30 lxc01 incusd[861872]: time="2024-08-28T17:13:30-04:00" level=warning msg="Failed auto start instance attempt" attempt=3 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-1 maxAttempts=3 project=default
Aug 28 17:13:30 lxc01 incusd[861872]: time="2024-08-28T17:13:30-04:00" level=error msg="Failed to auto start instance" err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-1 project=default
Aug 28 17:13:30 lxc01 incusd[861872]: time="2024-08-28T17:13:30-04:00" level=warning msg="Failed auto start instance attempt" attempt=1 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-2 maxAttempts=3 project=default
Aug 28 17:13:35 lxc01 incusd[861872]: time="2024-08-28T17:13:35-04:00" level=warning msg="Failed auto start instance attempt" attempt=2 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-2 maxAttempts=3 project=default
Aug 28 17:13:40 lxc01 incusd[861872]: time="2024-08-28T17:13:40-04:00" level=warning msg="Failed auto start instance attempt" attempt=3 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-2 maxAttempts=3 project=default
Aug 28 17:13:40 lxc01 incusd[861872]: time="2024-08-28T17:13:40-04:00" level=error msg="Failed to auto start instance" err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-2 project=default
Aug 28 17:13:40 lxc01 incusd[861872]: time="2024-08-28T17:13:40-04:00" level=warning msg="Failed auto start instance attempt" attempt=1 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-3 maxAttempts=3 project=default
Aug 28 17:13:45 lxc01 incusd[861872]: time="2024-08-28T17:13:45-04:00" level=warning msg="Failed auto start instance attempt" attempt=2 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-3 maxAttempts=3 project=default
Aug 28 17:13:50 lxc01 incusd[861872]: time="2024-08-28T17:13:50-04:00" level=warning msg="Failed auto start instance attempt" attempt=3 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-3 maxAttempts=3 project=default
Aug 28 17:13:50 lxc01 incusd[861872]: time="2024-08-28T17:13:50-04:00" level=error msg="Failed to auto start instance" err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-3 project=default
Aug 28 17:13:50 lxc01 incusd[861872]: time="2024-08-28T17:13:50-04:00" level=warning msg="Failed auto start instance attempt" attempt=1 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-4 maxAttempts=3 project=default
Aug 28 17:13:55 lxc01 incusd[861872]: time="2024-08-28T17:13:55-04:00" level=warning msg="Failed auto start instance attempt" attempt=2 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-4 maxAttempts=3 project=default
Aug 28 17:14:00 lxc01 incusd[861872]: time="2024-08-28T17:14:00-04:00" level=warning msg="Failed auto start instance attempt" attempt=3 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-4 maxAttempts=3 project=default
Aug 28 17:14:00 lxc01 incusd[861872]: time="2024-08-28T17:14:00-04:00" level=error msg="Failed to auto start instance" err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-4 project=default
Aug 28 17:14:00 lxc01 incusd[861872]: time="2024-08-28T17:14:00-04:00" level=warning msg="Failed auto start instance attempt" attempt=1 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-5 maxAttempts=3 project=default
Aug 28 17:14:05 lxc01 incusd[861872]: time="2024-08-28T17:14:05-04:00" level=warning msg="Failed auto start instance attempt" attempt=2 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-5 maxAttempts=3 project=default
Aug 28 17:14:10 lxc01 incusd[861872]: time="2024-08-28T17:14:10-04:00" level=warning msg="Failed auto start instance attempt" attempt=3 err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-5 maxAttempts=3 project=default
Aug 28 17:14:10 lxc01 incusd[861872]: time="2024-08-28T17:14:10-04:00" level=error msg="Failed to auto start instance" err="The instance cannot be started as in Error status" instance=k8s-rocky-vm-5 project=default
Aug 28 17:14:10 lxc01 incusd[861872]: time="2024-08-28T17:14:10-04:00" level=info msg="Daemon started"
Aug 28 17:14:10 lxc01 systemd[1]: Started incus.service - Incus - Main daemon.

I am a bit stuck given the lack of error messages in the logs. I know rebooting the server fixes this temporarily, but rebooting often is disruptive to the users, and I would really like to get to the root cause of this hanging issue.

I would greatly appreciate suggestions on how to investigate further, or any insight into whether this is a known issue.
I searched the forums and can’t seem to find anyone who has reported the same issue.

Thanks,

Hi!

When Incus launches (or restarts) a container or VM, it spawns the container process tree or launches qemu, respectively. In a way, Incus gets out of the way fairly soon and the instance then does its own thing. The big question is what could happen that would make Incus get stuck while launching a container or VM.
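
As a quick sanity check of that hand-off, each running container normally has its own liblxc monitor process on the host, separate from incusd. From memory, something like this should list them:

$ ps -ef | grep "lxc monitor"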

  1. Can you verify whether you can launch containers with the simplest possible profile? You just need to create a new Incus profile that specifies your storage pool and then launch with incus launch images:debian/12/cloud myinstance --profile mysimpleprofile (a rough sketch follows after this list). There is no need to even specify anything for networking. By doing this, you can narrow down whether it is an Incus issue or a system environment issue.
  2. When you launch an instance, you can use the --console parameter to view the startup progression while the runtime is doing its work, as in incus launch images:debian/12/cloud myinstance --console. If you do not see any output from the runtime, then it is an Incus issue.
  3. You mention NFS. Are these containers configured with NFS shares in /etc/fstab? That is, do they require a fully working NFS mount for the container init to perform its startup?
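
For step 1, a minimal sketch could look like the following (it assumes your existing mainzfs pool; adjust to taste):

$ incus profile create mysimpleprofile
$ incus profile device add mysimpleprofile root disk pool=mainzfs path=/
$ incus launch images:debian/12/cloud myinstance --profile mysimpleprofile --console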

It is likely that you have found either a bug in Incus or something that goes wrong with NFS + macvlan after long use.

Hi simos,
Thanks for the help. I did end up rebooting the host again to get it back to a functional state. I can answer your questions here anyway:

  1. I did not try the simplest possible profile. I will try this the next time the system hangs. I have created a canary container that gets restarted every hour; I’m hoping that will give me an exact time of when things break the next time it happens.

  2. Nice, I didn’t know about the --console flag on start. I have tried viewing the console separately with incus console <instance_name> immediately after issuing the start command, but I get an “instance is not running” error. I will certainly try this next time.

  3. The containers themselves are not configured to mount NFS within the container. I have the NFS mounts on the host (with fstab entries), and the containers mount those NFS folders with a disk device, like:

devices:
...
  nfs_folder_2:
    path: /work/nfs_folder_2
    source: /mnt/nfs_folder_2
    type: disk
...
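
For reference, that device is the equivalent of roughly this CLI call (names taken from the config above):

$ incus config device add mding-rocky8-test nfs_folder_2 disk source=/mnt/nfs_folder_2 path=/work/nfs_folder_2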

We have a uid map set up so that containers only have access to certain UIDs on the NFS share.

Is this setup a good idea?
Are there advantages to mounting NFS from within the container instead? (That requires some security settings which I still need to figure out.)
I am seeing some performance issues with NFS, so maybe mounting it individually within each container could help.

Mounting NFS on the host and using the shares in the container works as long as the NFS share is stable. For example, if the host exporting the share reboots or has any kind of network failure, things start to act funny. You should check your NFS mount parameters, such as using TCP, actimeo=0, noac, bg, etc. Choosing the right parameters will improve performance, but also how clients are informed about issues.
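
Purely as an illustration, a host-side fstab entry with explicit options might look like this (server, export path and the option mix are placeholders, not a recommendation):

# /etc/fstab on the host, illustrative values only
fileserver:/export/nfs_folder_2  /mnt/nfs_folder_2  nfs  proto=tcp,bg,actimeo=0,noac  0  0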

Mounting NFS at the instance level might be better and can offer more flexibility. This way there is no dependency on the host-side mount during instance start.
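
For an unprivileged container that usually means allowing the mount syscall for nfs. From memory (please double-check the exact keys in the Incus documentation), roughly:

$ incus config set mycontainer security.syscalls.intercept.mount=true
$ incus config set mycontainer security.syscalls.intercept.mount.allowed=nfs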

Anyway, as @simos says, there is no single best practice, and testing might be the best way to nail it down.


We do have an occasionally flaky NFS exporter (TrueNAS SCALE).
Interesting point about NFS acting funny; that could possibly be related.
I have quickly switched NFS off and on before without causing any fatal damage, but I have no idea what happens if NFS stays broken past the 600-second timeout.

NFS parameter tuning was already on the table, and it is good to know that mounting NFS per instance is not discouraged. In that case I’m eager to try it and see if performance improves.

Thanks!