LXD 4.0 cluster frequently fails to launch instances?

I used 5 machines to create an LXD cluster for testing. I launch instances one after another.

lxc launch ubuntu:18.04 test1
lxc launch ubuntu:18.04 test2
lxc launch ubuntu:18.04 test3
lxc launch ubuntu:18.04 test4
...

However, after several tries, maybe 3, LXD shows:

Creating test4
Error: Failed instance creation: Locate image 2cfc5a5567b8d74c0986f3d8a77a2a78e58fe22ea9abd2693112031f85afa1a1 in the cluster: image not available on any online node

But the status of the cluster is OK.


The image seems OK:

The storage:

I don’t know how to fix it. Is this a bug, or am I doing something wrong?
Thanks in advance.

That’s odd. Could you show lxd sql global "SELECT * FROM images;" and lxd sql global "SELECT * FROM images_nodes;"?
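(For reference, a minimal sketch of how those two tables could be cross-checked, assuming the usual LXD global schema in which images_nodes links images.id to nodes.id; the exact column names are an assumption here:)

# Hedged sketch: list which cluster member holds a local copy of each image.
lxd sql global "SELECT images.fingerprint, nodes.name FROM images JOIN images_nodes ON images_nodes.image_id = images.id JOIN nodes ON nodes.id = images_nodes.node_id;"

If the fingerprint from the error has no matching row, the database believes no member holds a local copy of that image.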


I filed an issue report on GitHub. Hope this info helps. Thank you.

Base information

  • Distribution: Ubuntu
  • Distribution version: 18.04.4 LTS (Bionic Beaver)
  • Kernel version: Linux machine1 4.15.0-88-generic #88-Ubuntu SMP Tue Feb 11 20:11:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • LXC version: 4.0.0
  • LXD version: 4.0.0
  • Snap revision: 14663

Detailed snap information

name:      lxd
summary:   System container manager and API
publisher: Canonical✓
store-url: https://snapcraft.io/lxd
contact:   https://github.com/lxc/lxd/issues
license:   unset
description: |
  **LXD is a system container manager**

  With LXD you can run hundreds of containers of a variety of Linux
  distributions, apply resource limits, pass in directories, USB devices
  or GPUs and setup any network and storage you want.

  LXD containers are lightweight, secure by default and a great
  alternative to running Linux virtual machines.


  **Run any Linux distribution you want**

  Pre-made images are available for Ubuntu, Alpine Linux, ArchLinux,
  CentOS, Debian, Fedora, Gentoo, OpenSUSE and more.

  A full list of available images can be found here: https://images.linuxcontainers.org

  Can't find the distribution you want? It's easy to make your own images too, either using our
  `distrobuilder` tool or by assembling your own image tarball by hand.


  **Containers at scale**

  LXD is network aware and all interactions go through a simple REST API,
  making it possible to remotely interact with containers on remote
  systems, copying and moving them as you wish.

  Want to go big? LXD also has built-in clustering support,
  letting you turn dozens of servers into one big LXD server.


  **Configuration options**

  Supported options for the LXD snap (`snap set lxd KEY=VALUE`):
   - criu.enable: Enable experimental live-migration support [default=false]
   - daemon.debug: Increases logging to debug level [default=false]
   - daemon.group: Group of users that can interact with LXD [default=lxd]
   - ceph.builtin: Use snap-specific ceph configuration [default=false]
   - openvswitch.builtin: Run a snap-specific OVS daemon [default=false]

  Documentation: https://lxd.readthedocs.io
commands:
  - lxd.benchmark
  - lxd.buginfo
  - lxd.check-kernel
  - lxd.lxc
  - lxd
  - lxd.migrate
services:
  lxd.activate: oneshot, enabled, inactive
  lxd.daemon:   simple, enabled, active
snap-id:      J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking:     latest/stable
refresh-date: today at 15:56 CST
channels:
  latest/stable:    4.0.0       2020-04-17 (14663) 62MB -
  latest/candidate: 4.0.0       2020-04-17 (14663) 62MB -
  latest/beta:      ↑
  latest/edge:      git-0c71615 2020-04-17 (14671) 62MB -
  4.0/stable:       4.0.0       2020-04-17 (14663) 62MB -
  4.0/candidate:    4.0.0       2020-04-17 (14663) 62MB -
  4.0/beta:         ↑
  4.0/edge:         ↑
  3.23/stable:      3.23        2020-03-30 (14133) 70MB -
  3.23/candidate:   3.23        2020-03-30 (14133) 70MB -
  3.23/beta:        ↑
  3.23/edge:        ↑
  3.22/stable:      3.22        2020-03-18 (13901) 70MB -
  3.22/candidate:   3.22        2020-03-19 (13911) 70MB -
  3.22/beta:        ↑
  3.22/edge:        ↑
  3.21/stable:      3.21        2020-02-24 (13522) 69MB -
  3.21/candidate:   3.21        2020-03-04 (13588) 69MB -
  3.21/beta:        ↑
  3.21/edge:        ↑
  3.20/stable:      3.20        2020-02-06 (13300) 69MB -
  3.20/candidate:   3.20        2020-02-06 (13300) 69MB -
  3.20/beta:        ↑
  3.20/edge:        ↑
  3.19/stable:      3.19        2020-01-27 (13162) 67MB -
  3.19/candidate:   3.19        2020-01-27 (13162) 67MB -
  3.19/beta:        ↑
  3.19/edge:        ↑
  3.18/stable:      3.18        2019-12-02 (12631) 57MB -
  3.18/candidate:   3.18        2019-12-02 (12631) 57MB -
  3.18/beta:        ↑
  3.18/edge:        ↑
  3.0/stable:       3.0.4       2019-10-10 (11348) 55MB -
  3.0/candidate:    3.0.4       2019-10-10 (11348) 55MB -
  3.0/beta:         ↑
  3.0/edge:         git-81b81b9 2019-10-10 (11362) 55MB -
  2.0/stable:       2.0.11      2019-10-10  (8023) 28MB -
  2.0/candidate:    2.0.11      2019-10-10  (8023) 28MB -
  2.0/beta:         ↑
  2.0/edge:         git-160221d 2020-01-13 (12854) 27MB -
installed:          4.0.0                  (14663) 62MB -

Detailed LXD information

Daemon configuration

config:
  cluster.https_address: 172.26.140.101:8443
  core.https_address: 172.26.140.101:8443
  core.trust_password: true
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- resources_system
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses:
  - 172.26.140.101:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIICCjCCAZCgAwIBAgIRAOC8PbDgCl1kfkP5Xt8ImKgwCgYIKoZIzj0EAwMwNjEc
    MBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzEWMBQGA1UEAwwNcm9vdEBtYWNo
    aW5lMTAeFw0yMDAzMzExMzAyMDVaFw0zMDAzMjkxMzAyMDVaMDYxHDAaBgNVBAoT
    E2xpbnV4Y29udGFpbmVycy5vcmcxFjAUBgNVBAMMDXJvb3RAbWFjaGluZTEwdjAQ
    BgcqhkjOPQIBBgUrgQQAIgNiAATsEFeixQD1UvD54y2VEaf2ssxbcf07U07ptK3n
    1064CoxMQn+mnynybkXbRShSRTihWuQGfTuDbsLlwZcb3YNmi+o8vIbOMmMGWewi
    BNmu6P/YWvkyvZNciCGfrm4FY1ajYjBgMA4GA1UdDwEB/wQEAwIFoDATBgNVHSUE
    DDAKBggrBgEFBQcDATAMBgNVHRMBAf8EAjAAMCsGA1UdEQQkMCKCCG1hY2hpbmUx
    hwR/AAABhxAAAAAAAAAAAAAAAAAAAAABMAoGCCqGSM49BAMDA2gAMGUCMAlF/tlz
    taJCVmIUXk17wWR4s0aPSaTxM/sC+BLIHvX6tZ/4/ZM6clBEMQs9FiUFGQIxAJ3n
    YefP+WzW62uVmYqHXxvIjrWDnAN+uMH+MHSABk9iqscO+sR8rMjNyF327Eg3hw==
    -----END CERTIFICATE-----
  certificate_fingerprint: 950458db1cfd70b1c1ebe827718d5a18cc1b65eb4d32d2d0f7f76c8e45cbdbeb
  driver: lxc
  driver_version: 4.0.2
  firewall: xtables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    netnsid_getifaddrs: "false"
    seccomp_listener: "false"
    seccomp_listener_continue: "false"
    shiftfs: "false"
    uevent_injection: "false"
    unpriv_fscaps: "true"
  kernel_version: 4.15.0-88-generic
  lxc_features:
    cgroup2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    seccomp_notify: "true"
  os_name: Ubuntu
  os_version: "18.04"
  project: default
  server: lxd
  server_clustered: true
  server_name: machine1
  server_pid: 18677
  server_version: 4.0.0
  storage: dir
  storage_version: "1"

Instances

+-------+---------+----------------------+------+-----------+-----------+----------+
| NAME  |  STATE  |         IPV4         | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+-------+---------+----------------------+------+-----------+-----------+----------+
| test1 | RUNNING | 240.101.0.38 (eth0)  |      | CONTAINER | 0         | machine1 |
+-------+---------+----------------------+------+-----------+-----------+----------+
| test2 | RUNNING | 240.102.0.205 (eth0) |      | CONTAINER | 0         | machine2 |
+-------+---------+----------------------+------+-----------+-----------+----------+
| test3 | RUNNING | 240.103.0.179 (eth0) |      | CONTAINER | 0         | machine3 |
+-------+---------+----------------------+------+-----------+-----------+----------+
| test4 | RUNNING | 240.104.0.95 (eth0)  |      | CONTAINER | 0         | machine4 |
+-------+---------+----------------------+------+-----------+-----------+----------+
| test5 | RUNNING | 240.101.0.175 (eth0) |      | CONTAINER | 0         | machine1 |
+-------+---------+----------------------+------+-----------+-----------+----------+
| test6 | RUNNING | 240.102.0.131 (eth0) |      | CONTAINER | 0         | machine2 |
+-------+---------+----------------------+------+-----------+-----------+----------+

Images

+--------------+--------------+--------+---------------------------------------------+--------------+-----------+----------+------------------------------+
|    ALIAS     | FINGERPRINT  | PUBLIC |                 DESCRIPTION                 | ARCHITECTURE |   TYPE    |   SIZE   |         UPLOAD DATE          |
+--------------+--------------+--------+---------------------------------------------+--------------+-----------+----------+------------------------------+
| ubuntu-18.04 | 2cfc5a5567b8 | no     | ubuntu 18.04 LTS amd64 (release) (20200407) | x86_64       | CONTAINER | 179.02MB | Apr 17, 2020 at 2:26pm (UTC) |
+--------------+--------------+--------+---------------------------------------------+--------------+-----------+----------+------------------------------+

Storage pools

+-------+-------------+--------+---------+---------+
| NAME  | DESCRIPTION | DRIVER |  STATE  | USED BY |
+-------+-------------+--------+---------+---------+
| local |             | dir    | CREATED | 3       |
+-------+-------------+--------+---------+---------+

Networks

+---------+----------+---------+-------------+---------+---------+
|  NAME   |   TYPE   | MANAGED | DESCRIPTION | USED BY |  STATE  |
+---------+----------+---------+-------------+---------+---------+
| eth0    | physical | NO      |             | 0       |         |
+---------+----------+---------+-------------+---------+---------+
| lxdfan0 | bridge   | YES     |             | 6       | CREATED |
+---------+----------+---------+-------------+---------+---------+

Projects

+-------------------+--------+----------+-----------------+---------+
|       NAME        | IMAGES | PROFILES | STORAGE VOLUMES | USED BY |
+-------------------+--------+----------+-----------------+---------+
| default (current) | YES    | YES      | YES             | 8       |
+-------------------+--------+----------+-----------------+---------+

Profiles

+---------+---------+
|  NAME   | USED BY |
+---------+---------+
| default | 6       |
+---------+---------+

Default profile

config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    network: lxdfan0
    type: nic
  root:
    path: /
    pool: local
    type: disk
name: default
used_by:
- /1.0/instances/test1
- /1.0/instances/test2
- /1.0/instances/test3
- /1.0/instances/test4
- /1.0/instances/test5
- /1.0/instances/test6

Cluster

+----------+-----------------------------+----------+--------+-------------------+--------------+
|   NAME   |             URL             | DATABASE | STATE  |      MESSAGE      | ARCHITECTURE |
+----------+-----------------------------+----------+--------+-------------------+--------------+
| machine1 | https://172.26.140.101:8443 | YES      | ONLINE | fully operational | x86_64       |
+----------+-----------------------------+----------+--------+-------------------+--------------+
| machine2 | https://172.26.140.102:8443 | YES      | ONLINE | fully operational | x86_64       |
+----------+-----------------------------+----------+--------+-------------------+--------------+
| machine3 | https://172.26.140.103:8443 | YES      | ONLINE | fully operational | x86_64       |
+----------+-----------------------------+----------+--------+-------------------+--------------+
| machine4 | https://172.26.140.104:8443 | NO       | ONLINE | fully operational | x86_64       |
+----------+-----------------------------+----------+--------+-------------------+--------------+
| machine5 | https://172.26.140.105:8443 | NO       | ONLINE | fully operational | x86_64       |
+----------+-----------------------------+----------+--------+-------------------+--------------+

Kernel log (last 50 lines)

[1473066.443291] audit: type=1400 audit(1587134600.306:212): apparmor="STATUS" operation="profile_load" label="lxd-test5_</var/snap/lxd/common/lxd>//&:lxd-test5_<var-snap-lxd-common-lxd>:unconfined" name="lxc-container-default-with-nesting" pid=16234 comm="apparmor_parser"
[1475682.752033] lxdfan0: port 3(veth565cdb83) entered disabled state
[1475682.835457] lxdfan0: port 4(vethfe49667d) entered disabled state
[1475682.954640] lxdfan0: port 3(veth565cdb83) entered disabled state
[1475682.960042] device veth565cdb83 left promiscuous mode
[1475682.960046] lxdfan0: port 3(veth565cdb83) entered disabled state
[1475683.027134] lxdfan0: port 4(vethfe49667d) entered disabled state
[1475683.032376] device vethfe49667d left promiscuous mode
[1475683.032380] lxdfan0: port 4(vethfe49667d) entered disabled state
[1475683.930248] kauditd_printk_skb: 6 callbacks suppressed
[1475683.930249] audit: type=1400 audit(1587137217.876:219): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="lxd-test1_</var/snap/lxd/common/lxd>" pid=18482 comm="apparmor_parser"
[1475683.979657] audit: type=1400 audit(1587137217.924:220): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="lxd-test5_</var/snap/lxd/common/lxd>" pid=18498 comm="apparmor_parser"
[1475683.988806] device lxdfan0-fan left promiscuous mode
[1475683.988846] lxdfan0: port 2(lxdfan0-fan) entered disabled state
[1475684.000260] device lxdfan0-mtu left promiscuous mode
[1475684.000284] lxdfan0: port 1(lxdfan0-mtu) entered disabled state
[1475685.948019] new mount options do not match the existing superblock, will be ignored
[1475687.708346] lxdfan0: port 1(lxdfan0-mtu) entered blocking state
[1475687.708348] lxdfan0: port 1(lxdfan0-mtu) entered disabled state
[1475687.708519] device lxdfan0-mtu entered promiscuous mode
[1475687.716170] lxdfan0: port 1(lxdfan0-mtu) entered blocking state
[1475687.716172] lxdfan0: port 1(lxdfan0-mtu) entered forwarding state
[1475687.790000] lxdfan0: port 2(lxdfan0-fan) entered blocking state
[1475687.790003] lxdfan0: port 2(lxdfan0-fan) entered disabled state
[1475687.790078] device lxdfan0-fan entered promiscuous mode
[1475687.792635] lxdfan0: port 2(lxdfan0-fan) entered blocking state
[1475687.792637] lxdfan0: port 2(lxdfan0-fan) entered forwarding state
[1475688.352773] lxdfan0: port 3(vethea777b54) entered blocking state
[1475688.352775] lxdfan0: port 3(vethea777b54) entered disabled state
[1475688.352868] device vethea777b54 entered promiscuous mode
[1475688.352892] lxdfan0: port 3(vethea777b54) entered blocking state
[1475688.352894] lxdfan0: port 3(vethea777b54) entered forwarding state
[1475688.540983] audit: type=1400 audit(1587137222.488:221): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-test1_</var/snap/lxd/common/lxd>" pid=18820 comm="apparmor_parser"
[1475688.619793] eth0: renamed from vethe4a5edc6
[1475688.777725] lxdfan0: port 4(veth6d37bf55) entered blocking state
[1475688.777728] lxdfan0: port 4(veth6d37bf55) entered disabled state
[1475688.777814] device veth6d37bf55 entered promiscuous mode
[1475688.777838] lxdfan0: port 4(veth6d37bf55) entered blocking state
[1475688.777840] lxdfan0: port 4(veth6d37bf55) entered forwarding state
[1475689.095380] audit: type=1400 audit(1587137223.040:222): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-test5_</var/snap/lxd/common/lxd>" pid=19058 comm="apparmor_parser"
[1475689.175856] eth0: renamed from veth58040a10
[1475689.449643] audit: type=1400 audit(1587137223.396:223): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="lxc-container-default" pid=19248 comm="apparmor_parser"
[1475689.449647] audit: type=1400 audit(1587137223.396:224): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="lxc-container-default-cgns" pid=19248 comm="apparmor_parser"
[1475689.449650] audit: type=1400 audit(1587137223.396:225): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="lxc-container-default-with-mounting" pid=19248 comm="apparmor_parser"
[1475689.449653] audit: type=1400 audit(1587137223.396:226): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="lxc-container-default-with-nesting" pid=19248 comm="apparmor_parser"
[1475689.452694] audit: type=1400 audit(1587137223.400:227): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="/usr/bin/lxc-start" pid=19255 comm="apparmor_parser"
[1475689.452809] audit: type=1400 audit(1587137223.400:228): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="/sbin/dhclient" pid=19249 comm="apparmor_parser"
[1475689.452812] audit: type=1400 audit(1587137223.400:229): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=19249 comm="apparmor_parser"
[1475689.452815] audit: type=1400 audit(1587137223.400:230): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=19249 comm="apparmor_parser"
[1475689.452818] audit: type=1400 audit(1587137223.400:231): apparmor="STATUS" operation="profile_load" label="lxd-test1_</var/snap/lxd/common/lxd>//&:lxd-test1_<var-snap-lxd-common-lxd>:unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=19249 comm="apparmor_parser"

Daemon log (last 50 lines)

t=2020-04-17T23:27:01+0800 lvl=info msg=" - cgroup layout: hybrid"
t=2020-04-17T23:27:01+0800 lvl=warn msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - shiftfs support: disabled"
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing local database"
t=2020-04-17T23:27:01+0800 lvl=info msg="Starting /dev/lxd handler:"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2020-04-17T23:27:01+0800 lvl=info msg="REST API daemon:"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2020-04-17T23:27:01+0800 lvl=info msg=" - binding TCP socket" socket=172.26.140.101:8443
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing global database"
t=2020-04-17T23:27:01+0800 lvl=warn msg="Dqlite: server unavailable err=failed to establish network connection: 503 Service Unavailable address=172.26.140.101:8443 attempt=0"
t=2020-04-17T23:27:01+0800 lvl=info msg="Firewall loaded driver \"xtables\""
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing storage pools"
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing daemon storage mounts"
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing networks"
t=2020-04-17T23:27:01+0800 lvl=info msg="Pruning leftover image files"
t=2020-04-17T23:27:01+0800 lvl=info msg="Done pruning leftover image files"
t=2020-04-17T23:27:01+0800 lvl=info msg="Loading daemon configuration"
t=2020-04-17T23:27:02+0800 lvl=info msg="Pruning expired images"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done pruning expired images"
t=2020-04-17T23:27:02+0800 lvl=info msg="Pruning expired instance backups"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done pruning expired instance backups"
t=2020-04-17T23:27:02+0800 lvl=info msg="Updating instance types"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done updating instance types"
t=2020-04-17T23:27:02+0800 lvl=info msg="Expiring log files"
t=2020-04-17T23:27:02+0800 lvl=info msg="Updating images"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done expiring log files"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done updating images"
t=2020-04-17T23:27:02+0800 lvl=info msg="Starting container" action=start created=2020-04-17T22:31:45+0800 ephemeral=false name=test1 project=default stateful=false used=2020-04-17T22:31:59+0800
t=2020-04-17T23:27:02+0800 lvl=info msg="Started container" action=start created=2020-04-17T22:31:45+0800 ephemeral=false name=test1 project=default stateful=false used=2020-04-17T22:31:59+0800
t=2020-04-17T23:27:02+0800 lvl=info msg="Starting container" action=start created=2020-04-17T22:43:05+0800 ephemeral=false name=test5 project=default stateful=false used=2020-04-17T22:43:19+0800
t=2020-04-17T23:27:03+0800 lvl=info msg="Started container" action=start created=2020-04-17T22:43:05+0800 ephemeral=false name=test5 project=default stateful=false used=2020-04-17T22:43:19+0800
t=2020-04-17T23:27:08+0800 lvl=info msg="Refreshing forkdns peers for lxdfan0"
t=2020-04-17T23:27:30+0800 lvl=warn msg="Dqlite client proxy Unix -> TLS: read unix @->@00175: use of closed network connection"
t=2020-04-17T23:27:30+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:27:30+0800 lvl=warn msg="Dqlite client proxy Unix -> TLS: read unix @->@00174: use of closed network connection"
t=2020-04-17T23:27:31+0800 lvl=warn msg="Failed to get events from node 172.26.140.102:8443: Unable to connect to: 172.26.140.102:8443"
t=2020-04-17T23:27:32+0800 lvl=warn msg="Failed to get events from node 172.26.140.102:8443: Unable to connect to: 172.26.140.102:8443"
t=2020-04-17T23:28:07+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:28:07+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:28:07+0800 lvl=warn msg="Dqlite client proxy Unix -> TLS: read unix @->@00233: use of closed network connection"
t=2020-04-17T23:28:07+0800 lvl=warn msg="Failed to get events from node 172.26.140.103:8443: Unable to connect to: 172.26.140.103:8443"
t=2020-04-17T23:28:08+0800 lvl=warn msg="Failed to get events from node 172.26.140.103:8443: Unable to connect to: 172.26.140.103:8443"
t=2020-04-17T23:28:09+0800 lvl=warn msg="Failed to get events from node 172.26.140.103:8443: Unable to connect to: 172.26.140.103:8443"
t=2020-04-17T23:28:22+0800 lvl=warn msg="Dqlite client proxy Unix -> TLS: read unix @->@00235: use of closed network connection"
t=2020-04-17T23:28:22+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:28:22+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:28:23+0800 lvl=warn msg="Failed to get events from node 172.26.140.105:8443: Unable to connect to: 172.26.140.105:8443"
t=2020-04-17T23:28:24+0800 lvl=warn msg="Failed to get events from node 172.26.140.105:8443: Unable to connect to: 172.26.140.105:8443"
t=2020-04-17T23:28:25+0800 lvl=warn msg="Failed to get events from node 172.26.140.105:8443: Unable to connect to: 172.26.140.105:8443"

Systemd log (last 50 lines)

-- Logs begin at Wed 2019-12-25 16:19:53 CST, end at Fri 2020-04-17 23:39:55 CST. --
Apr 17 23:26:59 machine1 lxd.daemon[18200]: => Cleaning up namespaces
Apr 17 23:26:59 machine1 lxd.daemon[18200]: => All done
Apr 17 23:26:59 machine1 systemd[1]: Stopped Service for snap application lxd.daemon.
Apr 17 23:26:59 machine1 systemd[1]: Started Service for snap application lxd.daemon.
Apr 17 23:26:59 machine1 lxd.daemon[18566]: => Preparing the system (14663)
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Loading snap configuration
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Setting up mntns symlink (mnt:[4026532233])
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Setting up kmod wrapper
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Preparing /boot
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Preparing a clean copy of /run
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Preparing a clean copy of /etc
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Setting up ceph configuration
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Setting up LVM configuration
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Rotating logs
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Setting up ZFS (0.7)
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Escaping the systemd cgroups
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ====> Detected cgroup V1
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Escaping the systemd process resource limits
Apr 17 23:26:59 machine1 lxd.daemon[18566]: ==> Disabling shiftfs on this kernel (auto)
Apr 17 23:26:59 machine1 lxd.daemon[18566]: => Starting LXCFS
Apr 17 23:26:59 machine1 lxd.daemon[18566]: Running constructor lxcfs_init to reload liblxcfs
Apr 17 23:26:59 machine1 lxd.daemon[18566]: mount namespace: 4
Apr 17 23:26:59 machine1 lxd.daemon[18566]: hierarchies:
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   0: fd:   5:
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   1: fd:   6: name=systemd
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   2: fd:   7: devices
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   3: fd:   8: cpuset
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   4: fd:   9: blkio
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   5: fd:  10: cpu,cpuacct
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   6: fd:  11: pids
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   7: fd:  12: rdma
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   8: fd:  13: net_cls,net_prio
Apr 17 23:26:59 machine1 lxd.daemon[18566]:   9: fd:  14: hugetlb
Apr 17 23:26:59 machine1 lxd.daemon[18566]:  10: fd:  15: memory
Apr 17 23:26:59 machine1 lxd.daemon[18566]:  11: fd:  16: freezer
Apr 17 23:26:59 machine1 lxd.daemon[18566]:  12: fd:  17: perf_event
Apr 17 23:26:59 machine1 lxd.daemon[18566]: api_extensions:
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - cgroups
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - sys_cpu_online
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - proc_cpuinfo
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - proc_diskstats
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - proc_loadavg
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - proc_meminfo
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - proc_stat
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - proc_swaps
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - proc_uptime
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - shared_pidns
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - cpuview_daemon
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - loadavg_daemon
Apr 17 23:26:59 machine1 lxd.daemon[18566]: - pidfds

Issue description

A brief description of the problem. Should include what you were
attempting to do, what you did, what happened and what you expected to
see happen.

I created a cluster using 5 machines, then launched instances with the commands below, one after another, each starting immediately after the previous one finished.

lxc launch ubuntu:18.04 test1
lxc launch ubuntu:18.04 test2
lxc launch ubuntu:18.04 test3
lxc launch ubuntu:18.04 test4

After several launches, maybe 3 or 4, I get the error:

Error: Failed instance creation: Locate image 2cfc5a5567b8d74c0986f3d8a77a2a78e58fe22ea9abd2693112031f85afa1a1 in the cluster: image not available on any online node

I’ve tried several times; the issue does not recover on its own.
Then I tried the command on another node, and it’s the same as on node1.

Steps to reproduce

  1. Create a cluster.
  2. Launch instances one after another immediately (see the sketch below).
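A minimal way to script the reproduction, assuming sequential launches against the default ubuntu: remote (the count of 10 is arbitrary):

# launch instances back to back; each command returns before the next starts
for i in $(seq 1 10); do lxc launch ubuntu:18.04 test$i; done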

Information to attach

  • Any relevant kernel output (dmesg)
  • Container log (lxc info NAME --show-log)
  • Container configuration (lxc config show NAME --expanded)
  • Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
holytiny@machine1:~$ sudo cat  /var/snap/lxd/common/lxd/logs/lxd.log
t=2020-04-17T23:27:01+0800 lvl=info msg="LXD 4.0.0 is starting in normal mode" path=/var/snap/lxd/common/lxd
t=2020-04-17T23:27:01+0800 lvl=info msg="Kernel uid/gid map:"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - u 0 0 4294967295"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - g 0 0 4294967295"
t=2020-04-17T23:27:01+0800 lvl=info msg="Configured LXD uid/gid map:"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - u 0 1000000 1000000000"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - g 0 1000000 1000000000"
t=2020-04-17T23:27:01+0800 lvl=info msg="Kernel features:"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - netnsid-based network retrieval: no"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - uevent injection: no"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - seccomp listener: no"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - seccomp listener continue syscalls: no"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - unprivileged file capabilities: yes"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - cgroup layout: hybrid"
t=2020-04-17T23:27:01+0800 lvl=warn msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - shiftfs support: disabled"
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing local database"
t=2020-04-17T23:27:01+0800 lvl=info msg="Starting /dev/lxd handler:"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2020-04-17T23:27:01+0800 lvl=info msg="REST API daemon:"
t=2020-04-17T23:27:01+0800 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2020-04-17T23:27:01+0800 lvl=info msg=" - binding TCP socket" socket=172.26.140.101:8443
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing global database"
t=2020-04-17T23:27:01+0800 lvl=warn msg="Dqlite: server unavailable err=failed to establish network connection: 503 Service Unavailable address=172.26.140.101:8443 attempt=0"
t=2020-04-17T23:27:01+0800 lvl=info msg="Firewall loaded driver \"xtables\""
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing storage pools"
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing daemon storage mounts"
t=2020-04-17T23:27:01+0800 lvl=info msg="Initializing networks"
t=2020-04-17T23:27:01+0800 lvl=info msg="Pruning leftover image files"
t=2020-04-17T23:27:01+0800 lvl=info msg="Done pruning leftover image files"
t=2020-04-17T23:27:01+0800 lvl=info msg="Loading daemon configuration"
t=2020-04-17T23:27:02+0800 lvl=info msg="Pruning expired images"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done pruning expired images"
t=2020-04-17T23:27:02+0800 lvl=info msg="Pruning expired instance backups"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done pruning expired instance backups"
t=2020-04-17T23:27:02+0800 lvl=info msg="Updating instance types"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done updating instance types"
t=2020-04-17T23:27:02+0800 lvl=info msg="Expiring log files"
t=2020-04-17T23:27:02+0800 lvl=info msg="Updating images"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done expiring log files"
t=2020-04-17T23:27:02+0800 lvl=info msg="Done updating images"
t=2020-04-17T23:27:02+0800 lvl=info msg="Starting container" action=start created=2020-04-17T22:31:45+0800 ephemeral=false name=test1 project=default stateful=false used=2020-04-17T22:31:59+0800
t=2020-04-17T23:27:02+0800 lvl=info msg="Started container" action=start created=2020-04-17T22:31:45+0800 ephemeral=false name=test1 project=default stateful=false used=2020-04-17T22:31:59+0800
t=2020-04-17T23:27:02+0800 lvl=info msg="Starting container" action=start created=2020-04-17T22:43:05+0800 ephemeral=false name=test5 project=default stateful=false used=2020-04-17T22:43:19+0800
t=2020-04-17T23:27:03+0800 lvl=info msg="Started container" action=start created=2020-04-17T22:43:05+0800 ephemeral=false name=test5 project=default stateful=false used=2020-04-17T22:43:19+0800
t=2020-04-17T23:27:08+0800 lvl=info msg="Refreshing forkdns peers for lxdfan0"
t=2020-04-17T23:27:30+0800 lvl=warn msg="Dqlite client proxy Unix -> TLS: read unix @->@00175: use of closed network connection"
t=2020-04-17T23:27:30+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:27:30+0800 lvl=warn msg="Dqlite client proxy Unix -> TLS: read unix @->@00174: use of closed network connection"
t=2020-04-17T23:27:31+0800 lvl=warn msg="Failed to get events from node 172.26.140.102:8443: Unable to connect to: 172.26.140.102:8443"
t=2020-04-17T23:27:32+0800 lvl=warn msg="Failed to get events from node 172.26.140.102:8443: Unable to connect to: 172.26.140.102:8443"
t=2020-04-17T23:28:07+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:28:07+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:28:07+0800 lvl=warn msg="Dqlite client proxy Unix -> TLS: read unix @->@00233: use of closed network connection"
t=2020-04-17T23:28:07+0800 lvl=warn msg="Failed to get events from node 172.26.140.103:8443: Unable to connect to: 172.26.140.103:8443"
t=2020-04-17T23:28:08+0800 lvl=warn msg="Failed to get events from node 172.26.140.103:8443: Unable to connect to: 172.26.140.103:8443"
t=2020-04-17T23:28:09+0800 lvl=warn msg="Failed to get events from node 172.26.140.103:8443: Unable to connect to: 172.26.140.103:8443"
t=2020-04-17T23:28:22+0800 lvl=warn msg="Dqlite client proxy Unix -> TLS: read unix @->@00235: use of closed network connection"
t=2020-04-17T23:28:22+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:28:22+0800 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@00173: use of closed network connection"
t=2020-04-17T23:28:23+0800 lvl=warn msg="Failed to get events from node 172.26.140.105:8443: Unable to connect to: 172.26.140.105:8443"
t=2020-04-17T23:28:24+0800 lvl=warn msg="Failed to get events from node 172.26.140.105:8443: Unable to connect to: 172.26.140.105:8443"
t=2020-04-17T23:28:25+0800 lvl=warn msg="Failed to get events from node 172.26.140.105:8443: Unable to connect to: 172.26.140.105:8443"
holytiny@machine1:~$ ping 172.26.140.105
PING 172.26.140.105 (172.26.140.105) 56(84) bytes of data.
64 bytes from 172.26.140.105: icmp_seq=1 ttl=64 time=0.173 ms
64 bytes from 172.26.140.105: icmp_seq=2 ttl=64 time=0.122 ms
^C
--- 172.26.140.105 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1021ms
rtt min/avg/max/mdev = 0.122/0.147/0.173/0.028 ms
  • Output of the client with --debug
  • Output of the daemon with --debug (alternatively output of lxc monitor while reproducing the issue; see the sketch below)
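As a rough sketch of how that output could be captured while reproducing the failure (standard client flags and commands; nothing specific to this cluster is assumed):

# run the failing launch with client-side debug logging
lxc launch ubuntu:18.04 test7 --debug
# or, from a second terminal on the same member while reproducing:
lxc monitor --pretty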

lxc image info ubuntu-18.04

Fingerprint: 2cfc5a5567b8d74c0986f3d8a77a2a78e58fe22ea9abd2693112031f85afa1a1
Size: 179.02MB
Architecture: x86_64
Type: container
Public: no
Timestamps:
    Created: 2020/04/07 00:00 UTC
    Uploaded: 2020/04/17 14:26 UTC
    Expires: 2023/04/26 00:00 UTC
    Last used: 2020/04/17 14:43 UTC
Properties:
    architecture: amd64
    description: ubuntu 18.04 LTS amd64 (release) (20200407)
    label: release
    os: ubuntu
    release: bionic
    serial: 20200407
    type: squashfs
    version: 18.04
Aliases:
    - ubuntu-18.04
Cached: no
Auto update: disabled
Source:
    Server: https://cloud-images.ubuntu.com/releases
    Protocol: simplestreams
    Alias: 18.04
Profiles:
    - default

Sorry for the sudo.
I tested again.

Oh, sorry, I forgot to include this information.
Because the nodes of the cluster remained offline after updating to 4.0, I ran the command below before my test.

sudo snap refresh lxd && lxc config unset cluster.offline_threshold

What does lxc cluster list show you?

If they’re considered offline, then that’s the problem: LXD does internal image transfers, but if it thinks they’re all offline, it can’t do that.
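A quick way to sanity-check that, assuming the standard configuration key, is to compare the member list against the configured offline threshold (to my knowledge the default is 20 seconds when the key is unset):

lxc cluster list
lxc config get cluster.offline_threshold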

It just seems ok.

The network seems OK too.
I ran the following commands: