LXC Arch Linux froze PC with raw.lxc: | lxc.mount.auto=sys:rw

Issue description

I’m trying to install Kubernetes on a few Ubuntu LXC containers, which need some raw.lxc properties set in their profile. The profile contains this setting:

  raw.lxc: |
    lxc.mount.auto=sys:rw

which froze my PC once the container started. Note that lxc.mount.auto=proc:rw on its own worked fine. I also tried all three storage types, with the same result. I tried waiting it out in case the container just needed a longer start-up time, but after 30 minutes to 1 hour it was still hung. I also tried removing my storage and network, but that didn’t help either. Below is my full profile config:

config:
  limits.cpu: "2"
  limits.memory: 2GB
  limits.memory.swap: "false"
  linux.kernel_modules: ip_tables,ip6_tables,nf_nat,overlay,br_netfilter
  raw.lxc: |
    lxc.apparmor.profile=unconfined
    lxc.cap.drop=
    lxc.cgroup.devices.allow=a
    lxc.mount.auto=proc:rw sys:rw
  security.privileged: "true"
  security.nesting: "true"
description: LXD profile for Kubernetes
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  kmsg:
    path: /dev/kmsg
    source: /dev/kmsg
    type: unix-char
  root:
    path: /
    pool: default
    type: disk
name: k8s
used_by: []

Required information

  • Distribution:
    arch linux

  • Distribution version:
    6.6.10-arch1-1

  • The output of “lxc info” or if that fails:

    config: {}
    api_extensions:
    - storage_zfs_remove_snapshots
    - container_host_shutdown_timeout
    - container_stop_priority
    - container_syscall_filtering
    - auth_pki
    - container_last_used_at
    - etag
    - patch
    - usb_devices
    - https_allowed_credentials
    - image_compression_algorithm
    - directory_manipulation
    - container_cpu_time
    - storage_zfs_use_refquota
    - storage_lvm_mount_options
    - network
    - profile_usedby
    - container_push
    - container_exec_recording
    - certificate_update
    - container_exec_signal_handling
    - gpu_devices
    - container_image_properties
    - migration_progress
    - id_map
    - network_firewall_filtering
    - network_routes
    - storage
    - file_delete
    - file_append
    - network_dhcp_expiry
    - storage_lvm_vg_rename
    - storage_lvm_thinpool_rename
    - network_vlan
    - image_create_aliases
    - container_stateless_copy
    - container_only_migration
    - storage_zfs_clone_copy
    - unix_device_rename
    - storage_lvm_use_thinpool
    - storage_rsync_bwlimit
    - network_vxlan_interface
    - storage_btrfs_mount_options
    - entity_description
    - image_force_refresh
    - storage_lvm_lv_resizing
    - id_map_base
    - file_symlinks
    - container_push_target
    - network_vlan_physical
    - storage_images_delete
    - container_edit_metadata
    - container_snapshot_stateful_migration
    - storage_driver_ceph
    - storage_ceph_user_name
    - resource_limits
    - storage_volatile_initial_source
    - storage_ceph_force_osd_reuse
    - storage_block_filesystem_btrfs
    - resources
    - kernel_limits
    - storage_api_volume_rename
    - macaroon_authentication
    - network_sriov
    - console
    - restrict_devlxd
    - migration_pre_copy
    - infiniband
    - maas_network
    - devlxd_events
    - proxy
    - network_dhcp_gateway
    - file_get_symlink
    - network_leases
    - unix_device_hotplug
    - storage_api_local_volume_handling
    - operation_description
    - clustering
    - event_lifecycle
    - storage_api_remote_volume_handling
    - nvidia_runtime
    - container_mount_propagation
    - container_backup
    - devlxd_images
    - container_local_cross_pool_handling
    - proxy_unix
    - proxy_udp
    - clustering_join
    - proxy_tcp_udp_multi_port_handling
    - network_state
    - proxy_unix_dac_properties
    - container_protection_delete
    - unix_priv_drop
    - pprof_http
    - proxy_haproxy_protocol
    - network_hwaddr
    - proxy_nat
    - network_nat_order
    - container_full
    - candid_authentication
    - backup_compression
    - candid_config
    - nvidia_runtime_config
    - storage_api_volume_snapshots
    - storage_unmapped
    - projects
    - candid_config_key
    - network_vxlan_ttl
    - container_incremental_copy
    - usb_optional_vendorid
    - snapshot_scheduling
    - snapshot_schedule_aliases
    - container_copy_project
    - clustering_server_address
    - clustering_image_replication
    - container_protection_shift
    - snapshot_expiry
    - container_backup_override_pool
    - snapshot_expiry_creation
    - network_leases_location
    - resources_cpu_socket
    - resources_gpu
    - resources_numa
    - kernel_features
    - id_map_current
    - event_location
    - storage_api_remote_volume_snapshots
    - network_nat_address
    - container_nic_routes
    - rbac
    - cluster_internal_copy
    - seccomp_notify
    - lxc_features
    - container_nic_ipvlan
    - network_vlan_sriov
    - storage_cephfs
    - container_nic_ipfilter
    - resources_v2
    - container_exec_user_group_cwd
    - container_syscall_intercept
    - container_disk_shift
    - storage_shifted
    - resources_infiniband
    - daemon_storage
    - instances
    - image_types
    - resources_disk_sata
    - clustering_roles
    - images_expiry
    - resources_network_firmware
    - backup_compression_algorithm
    - ceph_data_pool_name
    - container_syscall_intercept_mount
    - compression_squashfs
    - container_raw_mount
    - container_nic_routed
    - container_syscall_intercept_mount_fuse
    - container_disk_ceph
    - virtual-machines
    - image_profiles
    - clustering_architecture
    - resources_disk_id
    - storage_lvm_stripes
    - vm_boot_priority
    - unix_hotplug_devices
    - api_filtering
    - instance_nic_network
    - clustering_sizing
    - firewall_driver
    - projects_limits
    - container_syscall_intercept_hugetlbfs
    - limits_hugepages
    - container_nic_routed_gateway
    - projects_restrictions
    - custom_volume_snapshot_expiry
    - volume_snapshot_scheduling
    - trust_ca_certificates
    - snapshot_disk_usage
    - clustering_edit_roles
    - container_nic_routed_host_address
    - container_nic_ipvlan_gateway
    - resources_usb_pci
    - resources_cpu_threads_numa
    - resources_cpu_core_die
    - api_os
    - container_nic_routed_host_table
    - container_nic_ipvlan_host_table
    - container_nic_ipvlan_mode
    - resources_system
    - images_push_relay
    - network_dns_search
    - container_nic_routed_limits
    - instance_nic_bridged_vlan
    - network_state_bond_bridge
    - usedby_consistency
    - custom_block_volumes
    - clustering_failure_domains
    - resources_gpu_mdev
    - console_vga_type
    - projects_limits_disk
    - network_type_macvlan
    - network_type_sriov
    - container_syscall_intercept_bpf_devices
    - network_type_ovn
    - projects_networks
    - projects_networks_restricted_uplinks
    - custom_volume_backup
    - backup_override_name
    - storage_rsync_compression
    - network_type_physical
    - network_ovn_external_subnets
    - network_ovn_nat
    - network_ovn_external_routes_remove
    - tpm_device_type
    - storage_zfs_clone_copy_rebase
    - gpu_mdev
    - resources_pci_iommu
    - resources_network_usb
    - resources_disk_address
    - network_physical_ovn_ingress_mode
    - network_ovn_dhcp
    - network_physical_routes_anycast
    - projects_limits_instances
    - network_state_vlan
    - instance_nic_bridged_port_isolation
    - instance_bulk_state_change
    - network_gvrp
    - instance_pool_move
    - gpu_sriov
    - pci_device_type
    - storage_volume_state
    - network_acl
    - migration_stateful
    - disk_state_quota
    - storage_ceph_features
    - projects_compression
    - projects_images_remote_cache_expiry
    - certificate_project
    - network_ovn_acl
    - projects_images_auto_update
    - projects_restricted_cluster_target
    - images_default_architecture
    - network_ovn_acl_defaults
    - gpu_mig
    - project_usage
    - network_bridge_acl
    - warnings
    - projects_restricted_backups_and_snapshots
    - clustering_join_token
    - clustering_description
    - server_trusted_proxy
    - clustering_update_cert
    - storage_api_project
    - server_instance_driver_operational
    - server_supported_storage_drivers
    - event_lifecycle_requestor_address
    - resources_gpu_usb
    - clustering_evacuation
    - network_ovn_nat_address
    - network_bgp
    - network_forward
    - custom_volume_refresh
    - network_counters_errors_dropped
    - metrics
    - image_source_project
    - clustering_config
    - network_peer
    - linux_sysctl
    - network_dns
    - ovn_nic_acceleration
    - certificate_self_renewal
    - instance_project_move
    - storage_volume_project_move
    - cloud_init
    - network_dns_nat
    - database_leader
    - instance_all_projects
    - clustering_groups
    - ceph_rbd_du
    - instance_get_full
    - qemu_metrics
    - gpu_mig_uuid
    - event_project
    - clustering_evacuation_live
    - instance_allow_inconsistent_copy
    - network_state_ovn
    - storage_volume_api_filtering
    - image_restrictions
    - storage_zfs_export
    - network_dns_records
    - storage_zfs_reserve_space
    - network_acl_log
    - storage_zfs_blocksize
    - metrics_cpu_seconds
    - instance_snapshot_never
    - certificate_token
    - instance_nic_routed_neighbor_probe
    - event_hub
    - agent_nic_config
    - projects_restricted_intercept
    - metrics_authentication
    - images_target_project
    - cluster_migration_inconsistent_copy
    - cluster_ovn_chassis
    - container_syscall_intercept_sched_setscheduler
    - storage_lvm_thinpool_metadata_size
    - storage_volume_state_total
    - instance_file_head
    - instances_nic_host_name
    - image_copy_profile
    - container_syscall_intercept_sysinfo
    - clustering_evacuation_mode
    - resources_pci_vpd
    - qemu_raw_conf
    - storage_cephfs_fscache
    - network_load_balancer
    - vsock_api
    - instance_ready_state
    - network_bgp_holdtime
    - storage_volumes_all_projects
    - metrics_memory_oom_total
    - storage_buckets
    - storage_buckets_create_credentials
    - metrics_cpu_effective_total
    - projects_networks_restricted_access
    - storage_buckets_local
    - loki
    - acme
    - internal_metrics
    - cluster_join_token_expiry
    - remote_token_expiry
    - init_preseed
    - storage_volumes_created_at
    - cpu_hotplug
    - projects_networks_zones
    - network_txqueuelen
    - cluster_member_state
    - instances_placement_scriptlet
    - storage_pool_source_wipe
    - zfs_block_mode
    - instance_generation_id
    - disk_io_cache
    - amd_sev
    - storage_pool_loop_resize
    - migration_vm_live
    - ovn_nic_nesting
    - oidc
    - network_ovn_l3only
    - ovn_nic_acceleration_vdpa
    - cluster_healing
    - instances_state_total
    - auth_user
    - security_csm
    - instances_rebuild
    - numa_cpu_placement
    - custom_volume_iso
    - network_allocations
    - storage_api_remote_volume_snapshot_copy
    - zfs_delegate
    - operations_get_query_all_projects
    - metadata_configuration
    - syslog_socket
    - event_lifecycle_name_and_project
    - instances_nic_limits_priority
    - disk_initial_volume_configuration
    - operation_wait
    - cluster_internal_custom_volume_copy
    - disk_io_bus
    - storage_cephfs_create_missing
    - instance_move_config
    api_status: stable
    api_version: "1.0"
    auth: trusted
    public: false
    auth_methods:
    - tls
    auth_user_name: wpham
    auth_user_method: unix
    environment:
      addresses: []
      architectures:
      - x86_64
      - i686
      certificate: |
        -----BEGIN CERTIFICATE-----
        MIIB7TCCAXOgAwIBAgIRAN+TwLBtKAL0WpGZ20Q/kYowCgYIKoZIzj0EAwMwJzEM
        MAoGA1UEChMDTFhEMRcwFQYDVQQDDA5yb290QGFyY2hsaW51eDAeFw0yNDAxMjQw
        NDMzMzRaFw0zNDAxMjEwNDMzMzRaMCcxDDAKBgNVBAoTA0xYRDEXMBUGA1UEAwwO
        cm9vdEBhcmNobGludXgwdjAQBgcqhkjOPQIBBgUrgQQAIgNiAATPM4qR3WnQmpqq
        eIzd0QtreB0hf+04QAl1+zA/7NGQGy/yvjd+ceguyq5sTJx0KZrodt74QOsIuv2n
        AFB5cNzLhQmx7m0cA9H8Pz+78TvaBNKAqVBN5J/WcKR48yeSULejYzBhMA4GA1Ud
        DwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEFBQcDATAMBgNVHRMBAf8EAjAAMCwG
        A1UdEQQlMCOCCWFyY2hsaW51eIcEfwAAAYcQAAAAAAAAAAAAAAAAAAAAATAKBggq
        hkjOPQQDAwNoADBlAjBM6ra5rtrOwffAlk4YDlOgTSGZa4kQE6XuMsLYECTZ7uSF
        yo32pnHZXHjZ9qOBANECMQCPSxKxGUtI5ogM2pK9RXfIhtumvTZOPXIn+e9alFj+
        SYHQQldICcjnEXI//RRoHBs=
        -----END CERTIFICATE-----
      certificate_fingerprint: c977efc0340356f5f7f5b1099599565681862c27890f91c8e2df1873b573f07e
      driver: qemu | lxc
      driver_version: 8.2.0 | 5.0.3
      firewall: nftables
      kernel: Linux
      kernel_architecture: x86_64
      kernel_features:
        idmapped_mounts: "true"
        netnsid_getifaddrs: "true"
        seccomp_listener: "true"
        seccomp_listener_continue: "true"
        uevent_injection: "true"
        unpriv_fscaps: "true"
      kernel_version: 6.6.10-arch1-1
      lxc_features:
        cgroup2: "true"
        core_scheduling: "true"
        devpts_fd: "true"
        idmapped_mounts_v2: "true"
        mount_injection_file: "true"
        network_gateway_device_route: "true"
        network_ipvlan: "true"
        network_l2proxy: "true"
        network_phys_macvlan_mtu: "true"
        network_veth_router: "true"
        pidfd: "true"
        seccomp_allow_deny_syntax: "true"
        seccomp_notify: "true"
        seccomp_proxy_send_notify_fd: "true"
      os_name: Arch Linux
      os_version: ""
      project: default
      server: lxd
      server_clustered: false
      server_event_mode: full-mesh
      server_name: archlinux
      server_pid: 21685
      server_version: "5.20"
      storage: dir
      storage_version: "1"
      storage_supported_drivers:
      - name: dir
        version: "1"
        remote: false
      - name: lvm
        version: 2.03.22(2) (2023-08-02) / 1.02.196 (2023-08-02) / 4.48.0
        remote: false
      - name: btrfs
        version: 6.6.3
        remote: false
  • nproc
    32

  • free -h
    total used free shared buff/cache available
    Mem: 62Gi 10Gi 48Gi 3.8Gi 7.6Gi 51Gi
    Swap: 4.0Gi 0B 4.0Gi

  • cat /proc/self/cgroup
    0::/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.kde.konsole-f5b5472244e14e3bbe022785a9b9cb9b.scope

  • cat /proc/1/mounts
    /dev/sdd2 / ext4 rw,relatime 0 0
    devtmpfs /dev devtmpfs rw,nosuid,size=4096k,nr_inodes=8217104,mode=755,inode64 0 0
    tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
    devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
    sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
    securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
    cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
    pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
    efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
    bpf /sys/fs/bpf bpf rw,nosuid,nodev,noexec,relatime,mode=700 0 0
    proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
    tmpfs /run tmpfs rw,nosuid,nodev,size=13150976k,nr_inodes=819200,mode=755,inode64 0 0
    systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=36,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=5167 0 0
    hugetlbfs /dev/hugepages hugetlbfs rw,nosuid,nodev,relatime,pagesize=2M 0 0
    mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
    debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
    tracefs /sys/kernel/tracing tracefs rw,nosuid,nodev,noexec,relatime 0 0
    configfs /sys/kernel/config configfs rw,nosuid,nodev,noexec,relatime 0 0
    fusectl /sys/fs/fuse/connections fusectl rw,nosuid,nodev,noexec,relatime 0 0
    /dev/sdd1 /boot vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro 0 0
    tmpfs /tmp tmpfs rw,nosuid,nodev,nr_inodes=1048576,inode64 0 0
    binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec,relatime 0 0
    tmpfs /run/user/1000 tmpfs rw,nosuid,nodev,relatime,size=6575484k,nr_inodes=1643871,mode=700,uid=1000,gid=1000,inode64 0 0
    portal /run/user/1000/doc fuse.portal rw,nosuid,nodev,relatime,user_id=1000,group_id=1000 0 0
    lxcfs /var/lib/lxcfs fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
    tmpfs /var/lib/lxd/shmounts tmpfs rw,relatime,size=100k,mode=711,inode64 0 0
    tmpfs /var/lib/lxd/devlxd tmpfs rw,relatime,size=100k,mode=755,inode64 0 0

Information to attach

  • subgid and subuid
    user:100000:65536
    root:1000000:1000000000

  • /etc/lxc/default.conf
    lxc.net.0.type = veth
    lxc.net.0.link = lxcbr0
    lxc.net.0.flags = up
    lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx

  • /etc/default/lxc-net
    USE_LXC_BRIDGE="true"
    LXC_BRIDGE="lxcbr0"
    LXC_ADDR="10.0.3.1"
    LXC_NETMASK="255.255.255.0"
    LXC_NETWORK="10.0.3.0/24"
    LXC_DHCP_RANGE="10.0.3.2,10.0.3.254"
    LXC_DHCP_MAX="253"

  • Any relevant kernel output (dmesg)
    [ 709.848010] lxdbr0: port 1(veth29f91374) entered blocking state
    [ 709.848025] lxdbr0: port 1(veth29f91374) entered disabled state
    [ 709.848035] veth29f91374: entered allmulticast mode
    [ 709.848085] veth29f91374: entered promiscuous mode
    [ 709.848135] lxdbr0: port 1(veth29f91374) entered blocking state
    [ 709.848138] lxdbr0: port 1(veth29f91374) entered forwarding state
    [ 709.848383] lxdbr0: port 1(veth29f91374) entered disabled state
    [ 709.857968] veth29f91374: left allmulticast mode
    [ 709.857971] veth29f91374: left promiscuous mode
    [ 709.857992] lxdbr0: port 1(veth29f91374) entered disabled state
    [ 714.977598] lxdbr0: port 1(veth3bc69d9b) entered blocking state
    [ 714.977607] lxdbr0: port 1(veth3bc69d9b) entered disabled state
    [ 714.977619] veth3bc69d9b: entered allmulticast mode
    [ 714.977664] veth3bc69d9b: entered promiscuous mode
    [ 714.984166] veth3bc69d9b: left allmulticast mode
    [ 714.984169] veth3bc69d9b: left promiscuous mode
    [ 714.984191] lxdbr0: port 1(veth3bc69d9b) entered disabled state
    [ 720.103524] lxdbr0: port 1(veth80f7c88e) entered blocking state
    [ 720.103528] lxdbr0: port 1(veth80f7c88e) entered disabled state
    [ 720.103535] veth80f7c88e: entered allmulticast mode
    [ 720.103577] veth80f7c88e: entered promiscuous mode
    [ 720.111064] veth80f7c88e: left allmulticast mode
    [ 720.111067] veth80f7c88e: left promiscuous mode
    [ 720.111090] lxdbr0: port 1(veth80f7c88e) entered disabled state

  • Container log (lxc info NAME --show-log)
    Name: kmaster
    Status: STOPPED
    Type: container
    Architecture: x86_64
    Created: 2024/01/25 09:45 AEDT
    Last Used: 2024/01/25 09:45 AEDT

Log:

  • Container configuration (lxc config show NAME --expanded)
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 22.04 LTS amd64 (release) (20240123)
  image.label: release
  image.os: ubuntu
  image.release: jammy
  image.serial: "20240123"
  image.type: squashfs
  image.version: "22.04"
  limits.cpu: "2"
  limits.memory: 2GB
  limits.memory.swap: "false"
  linux.kernel_modules: ip_tables,ip6_tables,nf_nat,overlay,br_netfilter
  raw.lxc: |
    lxc.apparmor.profile=unconfined
    lxc.cap.drop=
    lxc.cgroup.devices.allow=a
    lxc.mount.auto=proc:rw sys:rw
  security.nesting: "true"
  security.privileged: "true"
  volatile.base_image: 1d32d0a8227079e9175b0127ae357c7f4890b5b25f1112ff195bee28338a3ed1
  volatile.cloud-init.instance-id: 0da60c5b-1deb-4bd8-9267-9916529ee5e3
  volatile.eth0.hwaddr: 00:16:3e:f1:f4:bb
  volatile.idmap.base: "0"
  volatile.idmap.current: '[]'
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 49e984b1-cc53-4fe8-b6ce-09a707695ebb
  volatile.uuid.generation: 49e984b1-cc53-4fe8-b6ce-09a707695ebb
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  kmsg:
    path: /dev/kmsg
    source: /dev/kmsg
    type: unix-char
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- k8s
stateful: false
description: ""
  • Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
    Nothing related at the time I started the container
  • Output of the daemon with --debug (alternatively output of lxc monitor while reproducing the issue)
    The machine froze while lxc monitor was running, so I couldn’t retrieve the logs. From what I saw before the freeze, nothing stood out.

I also tried switching it to sys:mixed, which works fine, so I reckon it’s something related to permissions?
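
For anyone wanting to reproduce the sys:mixed variant, here is a minimal sketch of how the profile could be switched. It assumes a POSIX shell with sed; the helper name to_sys_mixed is made up for illustration, and it only touches lines that contain lxc.mount.auto:

```shell
# Hypothetical helper: rewrite sys:rw to sys:mixed on lxc.mount.auto lines.
# Reads profile YAML on stdin and writes the modified YAML to stdout.
to_sys_mixed() {
    sed '/lxc\.mount\.auto/ s/sys:rw/sys:mixed/'
}

# Possible usage against the k8s profile from this report (not run here):
#   lxc profile show k8s | to_sys_mixed | lxc profile edit k8s
```

Every other line of the profile passes through unchanged, so the remaining raw.lxc keys stay as they are.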

I also tested sys:rw with --vm, and it worked. I wonder what the difference is.
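
One way to look into the difference might be to compare how sysfs ends up mounted in each case. This is a rough sketch, assuming a POSIX shell with awk; the helper name sysfs_mounts and the file name kmaster.mounts are made up for illustration:

```shell
# Hypothetical helper: print the mount point and options of every sysfs
# entry in a mounts-format file (defaults to /proc/self/mounts).
sysfs_mounts() {
    awk '$3 == "sysfs" { print $2, $4 }' "${1:-/proc/self/mounts}"
}

# Possible usage (not run here): capture the mount table of an instance
# that does start, for example with sys:mixed or --vm, then inspect it:
#   lxc exec kmaster -- cat /proc/self/mounts > kmaster.mounts
#   sysfs_mounts kmaster.mounts
```

Run against the host mount table quoted above, this would print the single line for /sys with its rw,nosuid,nodev,noexec,relatime options.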