LXD 5.7 - After system update and reboot, some containers do not start (ZFS and shiftfs, with 5.15 kernel)

On an Ubuntu 20.04.5 LTS system, after some routine updates and a subsequent reboot, 3 out of 10 containers will not start. They had started fine through every prior reboot of this system.

The logs don’t show anything obvious as to the cause.

Container 1 log:

$ lxc info --show-log dev
Name: dev
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2021/06/19 13:05 EDT
Last Used: 2022/10/25 19:19 EDT

Snapshots:
+----------+----------------------+------------+----------+
|   NAME   |       TAKEN AT       | EXPIRES AT | STATEFUL |
+----------+----------------------+------------+----------+
| b4getssl | 2022/10/23 10:18 EDT |            | NO       |
+----------+----------------------+------------+----------+

Log:

lxc dev 20221025231908.829 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc dev 20221025231908.829 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc dev 20221025231908.829 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc dev 20221025231908.829 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc dev 20221025231908.829 WARN     cgfsng - ../src/src/lxc/cgroups/cgfsng.c:fchowmodat:1611 - No such file or directory - Failed to fchownat(42, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc dev 20221025231909.328 ERROR    conf - ../src/src/lxc/conf.c:run_buffer:321 - Script exited with status 1
lxc dev 20221025231909.329 ERROR    conf - ../src/src/lxc/conf.c:lxc_setup:4400 - Failed to run mount hooks
lxc dev 20221025231909.329 ERROR    start - ../src/src/lxc/start.c:do_start:1272 - Failed to setup container "dev"
lxc dev 20221025231909.330 ERROR    sync - ../src/src/lxc/sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 4)
lxc dev 20221025231909.549 WARN     network - ../src/src/lxc/network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth0" to its initial name "vethacabc4b9"
lxc dev 20221025231909.550 ERROR    lxccontainer - ../src/src/lxc/lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc dev 20221025231909.551 ERROR    start - ../src/src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "dev"
lxc dev 20221025231909.552 WARN     start - ../src/src/lxc/start.c:lxc_abort:1036 - No such process - Failed to send SIGKILL via pidfd 43 for process 154882
lxc dev 20221025231914.373 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc dev 20221025231914.373 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20221025231914.403 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221025231914.403 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221025231914.403 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"
lxc 20221025231914.403 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"

Container 2 log:

$ lxc info --show-log junk
Name: junk
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2021/11/20 21:40 EST
Last Used: 2022/10/25 18:10 EDT

Log:

lxc junk 20221025221042.966 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc junk 20221025221042.966 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc junk 20221025221042.967 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc junk 20221025221042.967 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc junk 20221025221042.967 WARN     cgfsng - ../src/src/lxc/cgroups/cgfsng.c:fchowmodat:1611 - No such file or directory - Failed to fchownat(42, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc junk 20221025221043.201 ERROR    conf - ../src/src/lxc/conf.c:run_buffer:321 - Script exited with status 1
lxc junk 20221025221043.201 ERROR    conf - ../src/src/lxc/conf.c:lxc_setup:4400 - Failed to run mount hooks
lxc junk 20221025221043.201 ERROR    start - ../src/src/lxc/start.c:do_start:1272 - Failed to setup container "junk"
lxc junk 20221025221043.201 ERROR    sync - ../src/src/lxc/sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 4)
lxc junk 20221025221043.223 WARN     network - ../src/src/lxc/network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth0" to its initial name "vethe2ba5162"
lxc junk 20221025221043.223 ERROR    lxccontainer - ../src/src/lxc/lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc junk 20221025221043.223 ERROR    start - ../src/src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "junk"
lxc junk 20221025221043.223 WARN     start - ../src/src/lxc/start.c:lxc_abort:1036 - No such process - Failed to send SIGKILL via pidfd 43 for process 54158
lxc junk 20221025221048.304 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc junk 20221025221048.304 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20221025221048.324 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221025221048.324 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"

Container 3 log:

$ lxc info --show-log torrent
Name: torrent
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2021/11/16 17:47 EST
Last Used: 2022/10/25 19:17 EDT

Log:

lxc torrent 20221025231720.922 ERROR    conf - ../src/src/lxc/conf.c:run_buffer:321 - Script exited with status 32
lxc torrent 20221025231720.922 ERROR    start - ../src/src/lxc/start.c:lxc_init:844 - Failed to run lxc.hook.pre-start for container "torrent"
lxc torrent 20221025231720.922 ERROR    start - ../src/src/lxc/start.c:__lxc_start:2027 - Failed to initialize container "torrent"
lxc torrent 20221025231751.116 ERROR    lxccontainer - ../src/src/lxc/lxccontainer.c:wait_on_daemonized_start:869 - No such file or directory - Failed to receive the container state
lxc 20221025231751.116 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221025231751.116 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"
lxc 20221025231751.116 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221025231751.116 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"
lxc 20221025231751.119 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221025231751.119 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"

Trying to start a copy (created with ‘lxc copy …’) results in the same failure, as does starting the container after restoring a snapshot. As a test, I was able to launch a couple of new Ubuntu Focal and Jammy containers.
I really don’t know how to proceed from here to get these containers resuscitated.
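For reference, the attempts looked roughly like this (the copy name is just an example):

$ lxc copy dev dev-copy
$ lxc start dev-copy          # fails with the same error
$ lxc restore dev b4getssl    # restore the snapshot listed above
$ lxc start dev               # still fails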

This may not be relevant, but it appears that snapd refreshed LXD around the time of the reboot. I downgraded back to LXD 5.6, but it made no difference.
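A downgrade like that can be done with snap in either of these ways (the channel name is an assumption; snap info lxd lists what is actually available):

$ sudo snap revert lxd
$ sudo snap refresh lxd --channel=5.6/stable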

$ snap tasks 125
Status  Spawn               Ready               Summary
Done    today at 04:43 EDT  today at 04:43 EDT  Ensure prerequisites for "lxd" are available
Done    today at 04:43 EDT  today at 04:43 EDT  Download snap "lxd" (23853) from channel "latest/stable"
Done    today at 04:43 EDT  today at 04:43 EDT  Fetch and check assertions for snap "lxd" (23853)
Done    today at 04:43 EDT  today at 04:43 EDT  Mount snap "lxd" (23853)
Done    today at 04:43 EDT  today at 04:43 EDT  Run pre-refresh hook of "lxd" snap if present
Done    today at 04:43 EDT  today at 04:43 EDT  Stop snap "lxd" services
Done    today at 04:43 EDT  today at 04:43 EDT  Remove aliases for snap "lxd"
Done    today at 04:43 EDT  today at 04:43 EDT  Make current revision for snap "lxd" unavailable
Done    today at 04:43 EDT  today at 04:43 EDT  Copy snap "lxd" data
Done    today at 04:43 EDT  today at 04:43 EDT  Setup snap "lxd" (23853) security profiles
Done    today at 04:43 EDT  today at 04:43 EDT  Make snap "lxd" (23853) available to the system
Done    today at 04:43 EDT  today at 04:43 EDT  Automatically connect eligible plugs and slots of snap "lxd"
Done    today at 04:43 EDT  today at 04:43 EDT  Set automatic aliases for snap "lxd"
Done    today at 04:43 EDT  today at 04:43 EDT  Setup snap "lxd" aliases
Done    today at 04:43 EDT  today at 04:43 EDT  Run post-refresh hook of "lxd" snap if present
Done    today at 04:43 EDT  today at 04:44 EDT  Start snap "lxd" (23853) services
Done    today at 04:43 EDT  today at 04:44 EDT  Remove data for snap "lxd" (23537)
Done    today at 04:43 EDT  today at 04:44 EDT  Remove snap "lxd" (23537) from the system
Done    today at 04:43 EDT  today at 04:44 EDT  Clean up "lxd" (23853) install
Done    today at 04:43 EDT  today at 04:44 EDT  Run configure hook of "lxd" snap if present
Done    today at 04:43 EDT  today at 04:44 EDT  Run health check of "lxd" snap
Done    today at 04:43 EDT  today at 04:44 EDT  Handling re-refresh of "lxd" as needed

......................................................................
Stop snap "lxd" services

2022-10-25T04:43:33-04:00 INFO Waiting for "snap.lxd.daemon.service" to stop.

......................................................................
Handling re-refresh of "lxd" as needed

2022-10-25T04:44:21-04:00 INFO No re-refreshes found.

$ snap list
Name               Version           Rev    Tracking         Publisher   Notes
bare               1.0               5      latest/stable    canonical✓  base
core18             20220831          2566   latest/stable    canonical✓  base
core20             20220826          1623   latest/stable    canonical✓  base
gnome-3-34-1804    0+git.3556cb3     77     latest/stable/…  canonical✓  -
gnome-3-38-2004    0+git.6f39565     119    latest/stable    canonical✓  -
gtk-common-themes  0.1-81-g442e511   1535   latest/stable/…  canonical✓  -
lxd                5.7-749a602       23853  5.7/candidate    canonical✓  -
snap-store         41.3-64-g512c0ff  599    latest/stable/…  canonical✓  -
snapd              2.57.4            17336  latest/stable    canonical✓  snapd
$ lxc info
config:
  core.https_address: '[::]:8443'
  core.trust_password: true
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses:
  - 192.168.0.115:8443
  - 192.168.0.14:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
[...]
    -----END CERTIFICATE-----
  certificate_fingerprint: [...]
  driver: lxc | qemu
  driver_version: 5.0.1 | 7.1.0
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    shiftfs: "true"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 5.15.0-52-generic
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Ubuntu
  os_version: "20.04"
  project: default
  server: lxd
  server_clustered: false
  server_event_mode: full-mesh
  server_name: prodesk
  server_pid: 149012
  server_version: "5.7"
  storage: zfs
  storage_version: 2.1.6-0york1~20.04
  storage_supported_drivers:
  - name: cephfs
    version: 15.2.16
    remote: true
  - name: cephobject
    version: 15.2.16
    remote: true
  - name: dir
    version: "1"
    remote: false
  - name: lvm
    version: 2.03.07(2) (2019-11-30) / 1.02.167 (2019-11-30) / 4.45.0
    remote: false
  - name: zfs
    version: 2.1.6-0york1~20.04
    remote: false
  - name: btrfs
    version: 5.4.1
    remote: false
  - name: ceph
    version: 15.2.16
    remote: true

Container 1 config:

$ lxc config show --expanded dev
architecture: x86_64
config:
  boot.autostart: "false"
  image.architecture: amd64
  image.description: ubuntu 20.04 LTS amd64 (release) (20210610)
  image.label: release
  image.os: ubuntu
  image.release: focal
  image.serial: "20210610"
  image.type: squashfs
  image.version: "20.04"
  security.nesting: "true"
  volatile.base_image: 9ba1aa2f5ddea5f6b239cb1c05af8e4482c7c252e2d95dafc32686e80af5e884
  volatile.eth0.hwaddr: 00:16:3e:53:75:d9
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
  volatile.last_state.ready: "false"
  volatile.uuid: 921ef088-38d1-44c0-b1a8-aaa42176cc4e
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Container 2 config:

$ lxc config show --expanded junk
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 20.04 LTS amd64 (release) (20211118)
  image.label: release
  image.os: ubuntu
  image.release: focal
  image.serial: "20211118"
  image.type: squashfs
  image.version: "20.04"
  volatile.base_image: 39bdbf191acd49807930a11b46f76d3d3b31f01efa1af5f26c40402f33b11426
  volatile.eth0.hwaddr: 00:16:3e:48:bf:28
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
  volatile.uuid: aaf144bf-7edb-40cf-9b9c-f588a3d4ccab
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Container 3 config:

$ lxc config show --expanded torrent
architecture: x86_64
config:
  boot.autostart: "true"
  image.architecture: amd64
  image.description: ubuntu 20.04 LTS amd64 (release) (20211108)
  image.label: release
  image.os: ubuntu
  image.release: focal
  image.serial: "20211108"
  image.type: squashfs
  image.version: "20.04"
  security.privileged: "false"
  volatile.base_image: bd2ffb937c95633a28091e6efc42d6c7b1474ad8eea80d6ed8df800e44c6bfdd
  volatile.eth0.host_name: vethf5e67955
  volatile.eth0.hwaddr: 00:16:3e:2a:90:1a
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
  volatile.last_state.ready: "false"
  volatile.uuid: 0a3bbf85-1c34-4d60-9201-5a5dbfe5e001
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br0
    type: nic
  loot:
    path: /mnt/data-1/
    shift: "true"
    source: /mnt/data-1/
    type: disk
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

What error, if any, do you get from lxc start?

What does /var/snap/lxd/common/lxd/logs/lxd.log show?

$ lxc start dev
Error: Failed to run: /snap/lxd/current/bin/lxd forkstart dev /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/dev/lxc.conf: exit status 1
Try `lxc info --show-log dev` for more info

$ sudo cat  /var/snap/lxd/common/lxd/logs/lxd.log
time="2022-10-25T23:28:34-04:00" level=warning msg=" - Couldn't find the CGroup blkio.weight, disk priority will be ignored"
time="2022-10-25T23:31:31-04:00" level=error msg="Failed starting container" action=start created="2021-06-19 13:05:01.959470674 -0400 -0400" ephemeral=false instance=dev instanceType=container project=default stateful=false used="2022-10-25 23:19:08.799428412 +0000 UTC"
time="2022-10-25T23:42:02-04:00" level=error msg="Failed starting container" action=start created="2021-06-19 13:05:01.959470674 -0400 -0400" ephemeral=false instance=dev instanceType=container project=default stateful=false used="2022-10-26 03:31:26.242780647 +0000 UTC"
time="2022-10-26T08:03:08-04:00" level=error msg="Failed starting container" action=start created="2021-06-19 13:05:01.959470674 -0400 -0400" ephemeral=false instance=dev instanceType=container project=default stateful=false used="2022-10-26 03:42:22.292703895 +0000 UTC"

The issue seems to be related to permissions/ID mapping. The containers will start if I make them privileged (security.privileged: "true").
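For reference, that was just the standard toggle, along these lines:

$ lxc config set dev security.privileged true
$ lxc start dev
# and to undo it later:
$ lxc config unset dev security.privileged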

When I set this system up about a year ago, I had the subuid and subgid files in /etc. While digging into the issue yesterday, I also copied them to /var/snap/lxd/common/etc/, based on a post in this forum.
This morning, I can no longer launch new containers either, which I was able to do before copying those files and restarting the snap.lxd.daemon service.
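The copy and restart were roughly:

$ sudo cp /etc/subuid /etc/subgid /var/snap/lxd/common/etc/
$ sudo systemctl restart snap.lxd.daemon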

$ lxc launch ubuntu:jammy jammy-test
Creating jammy-test
Starting jammy-test
Error: Failed to run: /snap/lxd/current/bin/lxd forkstart jammy-test /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/jammy-test/lxc.conf: exit status 1
Try `lxc info --show-log local:jammy-test` for more info

[lxd.log]
time="2022-10-26T08:21:52-04:00" level=error msg="Failed starting container" action=start created="2022-10-26 12:16:11.106152178 +0000 UTC" ephemeral=false instance=jammy-test instanceType=container project=default stateful=false used="2022-10-26 12:16:35.664400725 +0000 UTC"

root:/var/snap/lxd/common/lxd/logs/jammy-test# ll
total 26
drwxr-xr-x  2 root root    6 Oct 26 08:42 ./
drwx------ 17 root root   25 Oct 26 08:42 ../
-rw-r--r--  1 root root    0 Oct 26 08:42 forkstart.log
-rw-r-----  1 root root 2812 Oct 26 08:42 lxc.conf
-rw-r-----  1 root root 2434 Oct 26 08:42 lxc.log
-rw-r-----  1 root root  313 Oct 26 08:42 lxc.log.old

root:/var/snap/lxd/common/lxd/logs/jammy-test# cat lxc.conf
lxc.log.file = /var/snap/lxd/common/lxd/logs/jammy-test/lxc.log
lxc.log.level = warn
lxc.console.buffer.size = auto
lxc.console.size = auto
lxc.console.logfile = /var/snap/lxd/common/lxd/logs/jammy-test/console.log
lxc.sched.core = 1
lxc.mount.auto = proc:rw sys:rw cgroup:mixed
lxc.autodev = 1
lxc.pty.max = 1024
lxc.mount.entry = /dev/fuse dev/fuse none bind,create=file,optional 0 0
lxc.mount.entry = /dev/net/tun dev/net/tun none bind,create=file,optional 0 0
lxc.mount.entry = /proc/sys/fs/binfmt_misc proc/sys/fs/binfmt_misc none rbind,create=dir,optional 0 0
lxc.mount.entry = /sys/firmware/efi/efivars sys/firmware/efi/efivars none rbind,create=dir,optional 0 0
lxc.mount.entry = /sys/fs/fuse/connections sys/fs/fuse/connections none rbind,create=dir,optional 0 0
lxc.mount.entry = /sys/fs/pstore sys/fs/pstore none rbind,create=dir,optional 0 0
lxc.mount.entry = /sys/kernel/config sys/kernel/config none rbind,create=dir,optional 0 0
lxc.mount.entry = /sys/kernel/debug sys/kernel/debug none rbind,create=dir,optional 0 0
lxc.mount.entry = /sys/kernel/security sys/kernel/security none rbind,create=dir,optional 0 0
lxc.mount.entry = /sys/kernel/tracing sys/kernel/tracing none rbind,create=dir,optional 0 0
lxc.mount.entry = /dev/mqueue dev/mqueue none rbind,create=dir,optional 0 0
lxc.include = /snap/lxd/current/lxc/config//common.conf.d/
lxc.arch = linux64
lxc.hook.version = 1
lxc.hook.pre-start = /proc/674723/exe callhook /var/snap/lxd/common/lxd "default" "jammy-test" start
lxc.hook.stop = /snap/lxd/current/bin/lxd callhook /var/snap/lxd/common/lxd "default" "jammy-test" stopns
lxc.hook.post-stop = /snap/lxd/current/bin/lxd callhook /var/snap/lxd/common/lxd "default" "jammy-test" stop
lxc.tty.max = 0
lxc.uts.name = jammy-test
lxc.mount.entry = /var/snap/lxd/common/lxd/devlxd dev/lxd none bind,create=dir 0 0
lxc.apparmor.profile = lxd-jammy-test_</var/snap/lxd/common/lxd>//&:lxd-jammy-test_<var-snap-lxd-common-lxd>:
lxc.seccomp.profile = /var/snap/lxd/common/lxd/security/seccomp/jammy-test
lxc.idmap = u 0 1000000 1000000000
lxc.idmap = g 0 1000000 1000000000
lxc.mount.auto = shmounts:/var/snap/lxd/common/lxd/shmounts/jammy-test:/dev/.lxd-mounts
lxc.net.0.type = phys
lxc.net.0.name = eth0
lxc.net.0.flags = up
lxc.net.0.link = veth4fff3df6
lxc.rootfs.path = dir:/var/snap/lxd/common/lxd/storage-pools/default/containers/jammy-test/rootfs
lxc.hook.pre-start = /bin/mount -t shiftfs -o mark,passthrough=3 "/var/snap/lxd/common/lxd/containers/jammy-test/rootfs" "/var/snap/lxd/common/lxd/containers/jammy-test/rootfs"
lxc.hook.pre-mount = /bin/mount -t shiftfs -o passthrough=3 "/var/snap/lxd/common/lxd/containers/jammy-test/rootfs" "/var/snap/lxd/common/lxd/containers/jammy-test/rootfs"
lxc.hook.start-host = /bin/umount -l "/var/snap/lxd/common/lxd/containers/jammy-test/rootfs"

root:/var/snap/lxd/common/lxd/logs/jammy-test# cat lxc.log
lxc jammy-test 20221026124251.595 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc jammy-test 20221026124251.595 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc jammy-test 20221026124251.597 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc jammy-test 20221026124251.597 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc jammy-test 20221026124251.597 WARN     cgfsng - ../src/src/lxc/cgroups/cgfsng.c:fchowmodat:1611 - No such file or directory - Failed to fchownat(42, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc jammy-test 20221026124251.822 ERROR    conf - ../src/src/lxc/conf.c:run_buffer:321 - Script exited with status 1
lxc jammy-test 20221026124251.822 ERROR    conf - ../src/src/lxc/conf.c:lxc_setup:4400 - Failed to run mount hooks
lxc jammy-test 20221026124251.822 ERROR    start - ../src/src/lxc/start.c:do_start:1272 - Failed to setup container "jammy-test"
lxc jammy-test 20221026124251.822 ERROR    sync - ../src/src/lxc/sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 4)
lxc jammy-test 20221026124251.835 WARN     network - ../src/src/lxc/network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth0" to its initial name "veth4fff3df6"
lxc jammy-test 20221026124251.835 ERROR    lxccontainer - ../src/src/lxc/lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc jammy-test 20221026124251.835 ERROR    start - ../src/src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "jammy-test"
lxc jammy-test 20221026124251.835 WARN     start - ../src/src/lxc/start.c:lxc_abort:1036 - No such process - Failed to send SIGKILL via pidfd 43 for process 2114561
lxc jammy-test 20221026124256.902 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc jammy-test 20221026124256.902 WARN     conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20221026124256.937 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221026124256.937 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"

root:/var/snap/lxd/common/lxd/logs/jammy-test# cat lxc.log.old
lxc 20221026124256.937 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221026124256.937 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"

The new container will launch if I make it privileged.

What do lxc storage show <pool> and sudo snap get lxd shiftfs.enable show?

root@prodesk:/var/snap/lxd/common/lxd/logs/jammy-test# lxc storage show default
config:
  source: rpool/lxd
  volatile.initial_source: rpool/lxd
  zfs.pool_name: rpool/lxd
description: ""
name: default
driver: zfs
used_by:
- /1.0/images/4ca869945db4265e397925faf31fc93dac5cbbde0435aa6d3d12b15e2d85fe69
- /1.0/images/85e44c519b2a1e81f76a388ef48ab7f3308acc5fe10dd61ab7f394179d746bf6
- /1.0/images/c10ff29615c4bad875ba43079c149435a6061946c52f3b5a19cf54779618248c
- /1.0/images/c765eaeaa7fb0c095c39d2176566e7abcda1fdf6bf7bac58843f5f0abbbeed82
- /1.0/instances/adsb
- /1.0/instances/adsb/snapshots/v2
- /1.0/instances/ar
- /1.0/instances/ar/snapshots/b4-mono-upgrade
- /1.0/instances/biblio
- /1.0/instances/biblio/snapshots/snapshot-20221023-0
- /1.0/instances/dev
- /1.0/instances/dev/snapshots/b4getssl
- /1.0/instances/focal-test
- /1.0/instances/grafana
- /1.0/instances/jammy-test
- /1.0/instances/junk
- /1.0/instances/silk
- /1.0/instances/silk/snapshots/Silk_OK
- /1.0/instances/silk/snapshots/b4silk
- /1.0/instances/silk/snapshots/silk_fv_ok_5
- /1.0/instances/silk/snapshots/silk_fv_ok_6
- /1.0/instances/silk/snapshots/silk_ok_3
- /1.0/instances/silk/snapshots/silk_ok_4
- /1.0/instances/t-p
- /1.0/instances/t-r
- /1.0/instances/t-r/snapshots/t-r
- /1.0/instances/torrent
- /1.0/profiles/default
status: Created
locations:
- none

root@prodesk:/var/snap/lxd/common/lxd/logs/jammy-test# snap get lxd shiftfs.enable
true

Does a new instance launch OK on the same storage pool?

A new instance does not launch. LXD on this system has a single storage pool.

Can you run sudo snap unset lxd shiftfs.enable followed by sudo systemctl reload snap.lxd.daemon and see if a new instance will launch?

I don’t think this will fix your existing instance, as you’re using shift=true, but let’s try to narrow the problem down to shiftfs.
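In other words, something like this (the test instance name below is just an example):

$ sudo snap unset lxd shiftfs.enable
$ sudo systemctl reload snap.lxd.daemon
$ lxc launch ubuntu:jammy jammy-test2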

After disabling shiftfs, a new instance will launch.

I don’t have this defined; do I need it?

/etc/default/lxc:

lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536

Per this post: https://discuss.linuxcontainers.org/t/solved-arch-linux-containers-only-run-when-security-privileged-true/4006/5?u=blurry

No, that file is for LXC, not LXD.

What is the output of uname -a?

$ uname -a
Linux prodesk 5.15.0-52-generic #58~20.04.1-Ubuntu SMP Thu Oct 13 13:09:46 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

It looks like you’re affected by a kernel bug (a shiftfs regression in the 5.15.0-52 Ubuntu kernel), which isn’t related to LXD 5.7.

Thanks so much for pinpointing the cause! I will roll back to a previous kernel.
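The rollback plan is roughly this (the older kernel version shown is just an example; use whichever previous 5.15 kernel is still installed):

$ dpkg --list 'linux-image-5.15*' | grep ^ii   # list installed kernels
# then pick the previous kernel from the GRUB "Advanced options for Ubuntu" menu at boot,
# or, if GRUB_DEFAULT=saved is set in /etc/default/grub:
$ sudo grub-reboot "Advanced options for Ubuntu>Ubuntu, with Linux 5.15.0-50-generic"
$ sudo reboot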
