ykazakov
(Yevgeny Kazakov)
March 21, 2026, 10:14am
1
I am trying to configure an IncusOS cluster with a cluster network (cluster.https_address) that is different from the API network (core.https_address). I am trying to follow this tutorial, but am having problems with x509 certificate validation. All servers are configured with one public IPv4 network 10.0.0.1/24 that is used for Incus clients and another private network fd00:10::1/64 that I want to use for internal cluster communication. (The server names and IP addresses have been modified.)
% incus cluster join my-cluster: server2:
What IP address or DNS name should be used to reach this server? [default=10.0.0.12]: fd00:10::12
What member name should be used to identify this server in the cluster? [default=4c4c4544-0043-5410-8033-c8c04f503034]: server2
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Error connecting to existing cluster member "[fd00:10::11]:8443": Get "https://[fd00:10::11]:8443": Unable to connect to: [fd00:10::11]:8443 ([dial tcp [fd00:10::11]:8443: i/o timeout])
Error: Failed to join cluster: Failed to setup cluster trust: Failed to add server cert to cluster: Post "https://[fd00:10::11]:8443/1.0/certificates": tls: failed to verify certificate: x509: cannot validate certificate for fd00:10::11 because it doesn't contain any IP SANs
How do I resolve this problem? Do I need to provide certificates for the internal addresses with the installation seed?
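For diagnosis, one way to see which IP SANs (if any) a certificate carries is to dump its extensions with openssl. A sketch using a throwaway self-signed certificate as a stand-in for the one Incus serves (the /tmp/demo.* paths are arbitrary, and fd00:10::11 is just my cluster address from above):

```shell
# Create a throwaway cert that DOES carry an IP SAN, then inspect it.
# Running the same x509 inspection on the certificate the server actually
# presents (e.g. saved via `openssl s_client`) shows whether fd00:10::11
# is listed as a SAN.
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:P-256 -nodes \
  -keyout /tmp/demo.key -out /tmp/demo.crt -subj "/CN=demo" \
  -addext "subjectAltName=IP:fd00:10::11" -days 1
openssl x509 -in /tmp/demo.crt -noout -text | grep -A1 "Subject Alternative Name"
```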
Initially, I tried to create a cluster using operations center, but this also fails when I try to use a network with the cluster role that is different from the network with the management role. I am not sure how else to specify which network should be used for cluster.https_address.
Since I cannot change cluster.https_address after the cluster is created, I need to provide the final settings when creating the cluster.
ykazakov
(Yevgeny Kazakov)
March 21, 2026, 10:30am
2
So, on my first node server1 I tried to change core.https_address from the default value :8443 to the node's IP, 10.0.0.11:8443, and now I get a different error when trying to join the cluster:
Error: Failed to join cluster: Failed to setup cluster trust: Failed to add server cert to cluster: Post "https://[fd00:10::11]:8443/1.0/certificates": Unable to connect to: [fd00:10::11]:8443 ([dial tcp [fd00:10::11]:8443: connect: connection refused])
ykazakov
(Yevgeny Kazakov)
March 21, 2026, 10:49am
3
OK, I restarted the incus application on the first node:
incus admin os application restart incus
and I now get the original error message:
tls: failed to verify certificate: x509: cannot validate certificate for fd00:10::11 because it doesn't contain any IP SANs
stgraber
(Stéphane Graber)
March 21, 2026, 5:11pm
4
Can you run incus cluster list my-cluster:?
I’ve seen that join error in the past when the CLI doesn’t have a direct route to the joining server’s address, but that didn’t actually prevent it from joining for me.
ykazakov
(Yevgeny Kazakov)
March 21, 2026, 9:18pm
5
stgraber:
when the CLI doesn’t have a direct route to the joining server’s address
I see. Does it mean that cluster.https_address of server1 must be reachable from the client from which I run the incus commands? In my case, the network fd00:10::1/64 is completely isolated. (It is managed by a switch without external internet connectivity).
stgraber:
Can you run incus cluster list my-cluster:?
I already wiped my cluster. I have now repeated the installation from scratch and I think I managed to get the cluster formed despite the final error message:
% incus config set server1: cluster.https_address=[fd00:10::11]:8443 # internal network
% incus cluster enable server1: server1
Clustering enabled
% incus remote add my-cluster 10.0.0.11:8443 # use address reachable from the client
Certificate fingerprint: 198b620cb1b2f3b6aae5085c9e83bd8204ca110ab55091b9e496d55c32514866
ok (y/n/[fingerprint])? y
% incus remote rm server1
% incus cluster list my-cluster:
+---------+----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| NAME | URL | ROLES | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATUS | MESSAGE |
+---------+----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| server1 | https://[fd00:10::11]:8443 | database-leader | x86_64 | default | | ONLINE | Fully operational |
| | | database | | | | | |
+---------+----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
% incus cluster join my-cluster: server2:
What IP address or DNS name should be used to reach this server? [default=10.0.0.12]: fd00:10::12 # !! This will be set as `cluster.https_address` of `server2` !!
What member name should be used to identify this server in the cluster? [default=4c4c4544-0044-5410-8033-b2c04f503034]: server2
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Error connecting to existing cluster member "[fd00:10::11]:8443": Get "https://[fd00:10::11]:8443": Unable to connect to: [fd00:10::11]:8443 ([dial tcp [fd00:10::11]:8443: i/o timeout])
% incus cluster list my-cluster:
+---------+----------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| NAME | URL | ROLES | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATUS | MESSAGE |
+---------+----------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| server1 | https://[fd00:10::11]:8443 | database-leader | x86_64 | default | | ONLINE | Fully operational |
| | | database | | | | | |
+---------+----------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| server2 | https://[fd00:10::12]:8443 | database-standby | x86_64 | default | | ONLINE | Fully operational |
+---------+----------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
% incus cluster join my-cluster: server3:
# ...
Despite the error, the cluster appears to be operational: I managed to launch containers after adding the local ZFS pool and the bridge network.
Last time I set cluster.https_address of server2 directly before joining the cluster, which probably resulted in problems with certificates.
Would it still be possible to make incus cluster join work without errors when cluster.https_address is not reachable from the incus client? Also, I guess setting up a cluster from the operations center fails because this address is not reachable?
stgraber
(Stéphane Graber)
March 22, 2026, 11:27pm
6
Yeah, that lines up with what I’ve seen before.
Basically the CLI attempts to confirm the cluster is functional at the end, but it can’t connect to the new server anymore because its certificate has changed to the cluster one.
But that’s happening after everything else succeeded, so the cluster is still perfectly fine.
We don’t include any names or IP addresses in our certificates; we perform exact certificate matching instead and ignore all fields. You get that kind of weird error when they’re not a perfect match, which here would likely be because the server you joined is now responding with the cluster-wide certificate.
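A rough illustration of that trust model (a sketch, not Incus code): trust is a byte-for-byte pin on the certificate itself, commonly expressed as the SHA-256 fingerprint of the DER encoding, so SANs never enter into it. Using a throwaway self-signed certificate (the /tmp/pin.* paths are arbitrary):

```shell
# Throwaway self-signed cert standing in for the one a server presents
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:P-256 -nodes \
  -keyout /tmp/pin.key -out /tmp/pin.crt -subj "/CN=pin" -days 1
# SHA-256 over the DER form; to my understanding this is the same kind of
# 64-hex-digit fingerprint that `incus remote add` displays and pins.
openssl x509 -in /tmp/pin.crt -outform der | sha256sum | cut -d' ' -f1
```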
ykazakov
(Yevgeny Kazakov)
March 23, 2026, 7:45am
7
stgraber:
Basically the CLI attempts to confirm the cluster is functional at the end, but it can’t connect to the new server anymore because its certificate has changed to the cluster one.
I think that in my last setup the error is not due to the certificates but occurs because the client tries to connect to server1 using its cluster.https_address, which is unreachable from the client since the internal network is isolated.
It is not clear to me why Incus is trying to contact individual cluster members directly after the cluster is formed. Wouldn’t it make more sense to confirm that the cluster is functional using the cluster remote address?
stgraber
(Stéphane Graber)
March 23, 2026, 10:45pm
8
I think it’s basically a race condition. Part of the cluster joining process is sending a request and then waiting for an operation to complete. If the request makes enough progress before we get to attach to the operation, we get the connection error.
I’ve tried to reproduce it locally with some VMs and haven’t been able to, likely because network latency is low enough to hide the problem.
ykazakov
(Yevgeny Kazakov)
March 24, 2026, 12:29pm
9
I repeated the steps a few times and the error is consistently triggered in my case, including when using VMs. Are you sure that the internal network was not reachable from the client? When it is reachable, there is no error.
I described the detailed steps of my setup here:
Opened 12:19 PM, 24 Mar 2026 (UTC)
### Is there an existing issue for this?
- [x] There is no existing issue for this bug
### Is this happening on an up to date version of Incus?
- [x] This is happening on a supported version of Incus
### Incus system details
```yaml
config:
  cluster.https_address: 20.0.0.11:8443
  core.https_address: :8443
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_dev_incus
- migration_pre_copy
- infiniband
- dev_incus_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- dev_incus_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- images_all_projects
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- zfs_delegate
- storage_api_remote_volume_snapshot_copy
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- image_restriction_privileged
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- certificate_description
- disk_io_bus_virtio_blk
- loki_config_instance
- instance_create_start
- clustering_evacuation_stop_options
- boot_host_shutdown_action
- agent_config_drive
- network_state_ovn_lr
- image_template_permissions
- storage_bucket_backup
- storage_lvm_cluster
- shared_custom_block_volumes
- auth_tls_jwt
- oidc_claim
- device_usb_serial
- numa_cpu_balanced
- image_restriction_nesting
- network_integrations
- instance_memory_swap_bytes
- network_bridge_external_create
- network_zones_all_projects
- storage_zfs_vdev
- container_migration_stateful
- profiles_all_projects
- instances_scriptlet_get_instances
- instances_scriptlet_get_cluster_members
- instances_scriptlet_get_project
- network_acl_stateless
- instance_state_started_at
- networks_all_projects
- network_acls_all_projects
- storage_buckets_all_projects
- resources_load
- instance_access
- project_access
- projects_force_delete
- resources_cpu_flags
- disk_io_bus_cache_filesystem
- instance_oci
- clustering_groups_config
- instances_lxcfs_per_instance
- clustering_groups_vm_cpu_definition
- disk_volume_subpath
- projects_limits_disk_pool
- network_ovn_isolated
- qemu_raw_qmp
- network_load_balancer_health_check
- oidc_scopes
- network_integrations_peer_name
- qemu_scriptlet
- instance_auto_restart
- storage_lvm_metadatasize
- ovn_nic_promiscuous
- ovn_nic_ip_address_none
- instances_state_os_info
- network_load_balancer_state
- instance_nic_macvlan_mode
- storage_lvm_cluster_create
- network_ovn_external_interfaces
- instances_scriptlet_get_instances_count
- cluster_rebalance
- custom_volume_refresh_exclude_older_snapshots
- storage_initial_owner
- storage_live_migration
- instance_console_screenshot
- image_import_alias
- authorization_scriptlet
- console_force
- network_ovn_state_addresses
- network_bridge_acl_devices
- instance_debug_memory
- init_preseed_storage_volumes
- init_preseed_profile_project
- instance_nic_routed_host_address
- instance_smbios11
- api_filtering_extended
- acme_dns01
- security_iommu
- network_ipv4_dhcp_routes
- network_state_ovn_ls
- network_dns_nameservers
- acme_http01_port
- network_ovn_ipv4_dhcp_expiry
- instance_state_cpu_time
- network_io_bus
- disk_io_bus_usb
- storage_driver_linstor
- instance_oci_entrypoint
- network_address_set
- server_logging
- network_forward_snat
- memory_hotplug
- instance_nic_routed_host_tables
- instance_publish_split
- init_preseed_certificates
- custom_volume_sftp
- network_ovn_external_nic_address
- network_physical_gateway_hwaddr
- backup_s3_upload
- snapshot_manual_expiry
- resources_cpu_address_sizes
- disk_attached
- limits_memory_hotplug
- disk_wwn
- server_logging_webhook
- storage_driver_truenas
- container_disk_tmpfs
- instance_limits_oom
- backup_override_config
- network_ovn_tunnels
- init_preseed_cluster_groups
- usb_attached
- backup_iso
- instance_systemd_credentials
- cluster_group_usedby
- bpf_token_delegation
- file_storage_volume
- network_hwaddr_pattern
- storage_volume_full
- storage_bucket_full
- device_pci_firmware
- resources_serial
- ovn_nic_limits
- storage_lvmcluster_qcow2
- oidc_allowed_subnets
- file_delete_force
- nic_sriov_select_ext
- network_zones_dns_contact
- nic_attached_connected
- nic_sriov_security_trusted
- direct_backup
- instance_snapshot_disk_only_restore
- unix_hotplug_pci
- cluster_evacuating_restoring
- projects_restricted_image_servers
- storage_lvmcluster_size
- authorization_scriptlet_cert
- lvmcluster_remove_snapshots
- daemon_storage_logs
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: ccc2f65cdb189a3918f2fb7ad8d993af3ca6d3cdc033c23bccd635f23cbee98a
auth_user_method: tls
environment:
  addresses:
  - 20.0.0.11:8443
  - 10.10.0.11:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIICVzCCAd6gAwIBAgIRAJA88iDDFmvmiua7MZuojuMwCgYIKoZIzj0EAwMwTzEZ
    MBcGA1UEChMQTGludXggQ29udGFpbmVyczEyMDAGA1UEAwwpcm9vdEAyMTg3NDky
    OS01OWRjLTQ2YjktYjYwOC05YmY1MGQyNzJiOTIwHhcNMjYwMzI0MTEzMTQzWhcN
    MzYwMzIxMTEzMTQzWjBPMRkwFwYDVQQKExBMaW51eCBDb250YWluZXJzMTIwMAYD
    VQQDDClyb290QDIxODc0OTI5LTU5ZGMtNDZiOS1iNjA4LTliZjUwZDI3MmI5MjB2
    MBAGByqGSM49AgEGBSuBBAAiA2IABMniC+YJjkqnMTPp8Za1/IrKavazTdCpl0vA
    jFlNMHMjJN9ETYEBX2flt1qJhFpJ+z3GmYy1gPEvA/fBnBR8qnF/NG5mQJYXld70
    AYF5Gxwj2uF4C1M1Vd96vOmE/lUBE6N+MHwwDgYDVR0PAQH/BAQDAgWgMBMGA1Ud
    JQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwRwYDVR0RBEAwPoIkMjE4NzQ5
    MjktNTlkYy00NmI5LWI2MDgtOWJmNTBkMjcyYjkyhwR/AAABhxAAAAAAAAAAAAAA
    AAAAAAABMAoGCCqGSM49BAMDA2cAMGQCMEDhsT6forHWR35EhxXh7jk2Y+ulCtM6
    jnFlcrJsehekihGROt8Sxjf+RpEcXZpqGgIwZ0PoP0qQ1DY9adFcYQtlEAqbF4LA
    nSy5iXMpK9vukpqHZv5NzuvZHh4Fefu2Q/8E
    -----END CERTIFICATE-----
  certificate_fingerprint: ce6a178e0f1447c7272e9b7fb71551fb707abb67a4d85d08da8616f030db91a6
  driver: lxc | qemu
  driver_version: 6.0.6 | 10.2.2
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_binfmt: "true"
    unpriv_fscaps: "true"
  kernel_version: 6.19.9-zabbly+
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: IncusOS
  os_version: "202603240012"
  project: default
  server: incus
  server_clustered: true
  server_event_mode: full-mesh
  server_name: server1
  server_pid: 1103
  server_version: "6.22"
  storage: ""
  storage_version: ""
  storage_supported_drivers:
  - name: lvmcluster
    version: 2.03.31(2) (2025-02-27) / 1.02.205 (2025-02-27) / 4.50.0
    remote: true
  - name: zfs
    version: 2.4.1-1
    remote: false
  - name: btrfs
    version: "6.14"
    remote: false
  - name: dir
    version: "1"
    remote: false
  - name: lvm
    version: 2.03.31(2) (2025-02-27) / 1.02.205 (2025-02-27) / 4.50.0
    remote: false
```
### Instance details
_No response_
### Instance log
_No response_
### Current behavior
I am trying to bootstrap an incus cluster from an external incus client (that is not one of the cluster members) using [this tutorial](https://linuxcontainers.org/incus-os/docs/main/tutorials/incus-cluster/), where `cluster.https_address` is on an isolated network (behind a layer 2 switch) that is not reachable from the incus client.
Specifically, assume that my Incus hosts have two networks:
- the public (slow) network 10.10.0.0/24, through which all servers can be reached from the incus client
- the isolated (fast) network 20.0.0.0/24 (on a layer 2 network switch), which should be used for internal cluster traffic and is not reachable from the client
Assuming that the incus client and servers have the following IP addresses:
```
incus ls -cns4t
+---------+---------+------------------------+-----------------+
| NAME | STATE | IPV4 | TYPE |
+---------+---------+------------------------+-----------------+
| client | RUNNING | 10.10.0.4 (enp5s0) | VIRTUAL-MACHINE |
+---------+---------+------------------------+-----------------+
| server1 | RUNNING | 20.0.0.11 (_vinternal) | VIRTUAL-MACHINE |
| | | 10.10.0.11 (_vuplink) | |
+---------+---------+------------------------+-----------------+
| server2 | RUNNING | 20.0.0.12 (_vinternal) | VIRTUAL-MACHINE |
| | | 10.10.0.12 (_vuplink) | |
+---------+---------+------------------------+-----------------+
| server3 | RUNNING | 20.0.0.13 (_vinternal) | VIRTUAL-MACHINE |
| | | 10.10.0.13 (_vuplink) | |
+---------+---------+------------------------+-----------------+
```
I am trying to form a cluster using the linked tutorial:
```
incus config set server1: cluster.https_address=20.0.0.11:8443
incus cluster enable server1: server1
> Clustering enabled
incus remote add my-cluster 10.10.0.11:8443
> Certificate fingerprint: b6ff5c85aed196eee22ae7e7364dd5dd0d6700a72e09ce8f44b6c1861bf818fe
> ok (y/n/[fingerprint])? y
incus cluster join my-cluster: server2:
> What IP address or DNS name should be used to reach this server? [default=20.0.0.12]:
> What member name should be used to identify this server in the cluster? [default=35ef33da-ee1d-4277-a245-aa2d6baa7be5]: server2
> All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
> Error connecting to existing cluster member "20.0.0.11:8443": Get "https://20.0.0.11:8443": Unable to connect to: 20.0.0.11:8443 ([dial tcp 20.0.0.11:8443: i/o timeout])
incus cluster ls my-cluster:
```
It looks like the incus client is trying to send a request to `server1` over its `cluster.https_address`, which is unreachable from the client.
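The unreachability can be confirmed from the client with a plain TCP probe (a sketch; 20.0.0.11:8443 is the cluster address from the repro above, and bash's `/dev/tcp` is used to avoid extra tooling):

```shell
# Attempt a TCP connect to the cluster address with a short timeout;
# prints "unreachable" when there is no route or no answer within 3s,
# "reachable" when the connect succeeds.
if timeout 3 bash -c 'cat </dev/null >/dev/tcp/20.0.0.11/8443' 2>/dev/null; then
  echo reachable
else
  echo unreachable
fi
```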
The issue was originally reported [here](https://discuss.linuxcontainers.org/t/setting-up-incusos-cluster-with-a-separate-internal-network/26412).
### Expected behavior
I expect the client to communicate with the cluster nodes only over the remote address that is set for this cluster:
```
incus remote add my-cluster 10.10.0.11:8443
```
### Steps to reproduce
The issue can be reproduced using IncusOS VMs on any installation of Incus 6.22 as follows:
1. Create a separate project and two networks: one for public traffic and one for internal traffic:
```
incus project create incus-os
incus project switch incus-os
incus network create uplink ipv4.address=10.10.0.1/24 ipv6.address=none
# Create a network to simulate an isolated layer 2 network switch
incus network create internal ipv4.address=none ipv6.address=none ipv4.nat=false ipv6.nat=false
```
2. Update the default profile to use the uplink network
```
cat <<EOF | incus profile edit default
description: Default Incus profile for project incus-os
devices:
eth0:
name: eth0
network: uplink
type: nic
root:
path: /
pool: local
type: disk
EOF
```
3. Create a new profile for IncusOS VMs that use the dual network setup
```
cat <<EOF | incus profile edit incus-os
description: Default profile for IncusOS VMs
devices:
eth0:
network: uplink
type: nic
eth1:
network: internal
type: nic
root:
path: /
pool: local
size: 50GiB
type: disk
vtpm:
type: tpm
EOF
```
4. Bootstrap a VM for Incus client
```
incus launch images:debian/13 client --vm
incus exec client -- bash
# Inside the client VM
apt update
apt install curl dosfstools gpg -y
# Following the instructions to install the stable version of Incus (6.22)
# https://github.com/zabbly/incus
mkdir -p /etc/apt/keyrings/
curl -fsSL https://pkgs.zabbly.com/key.asc -o /etc/apt/keyrings/zabbly.asc
cat <<EOF > /etc/apt/sources.list.d/zabbly-incus-stable.sources
Enabled: yes
Types: deb
URIs: https://pkgs.zabbly.com/incus/stable
Suites: $(. /etc/os-release && echo ${VERSION_CODENAME})
Components: main
Architectures: $(dpkg --print-architecture)
Signed-By: /etc/apt/keyrings/zabbly.asc
EOF
apt update
apt install incus-client -y
exit
```
5. Create a custom volume for the IncusOS ISO
```
wget https://images.linuxcontainers.org/os/202603240012/x86_64/IncusOS_202603240012.iso.gz
gunzip IncusOS_202603240012.iso.gz
incus storage volume import local IncusOS_202603240012.iso IncusOS_202603240012.iso --type=iso
```
6. Create the installation seed for server1
```
incus storage volume create local seed --type=block size=1MB
incus config device add client seed disk pool=local source=seed
incus exec client -- bash
# on client:
echo 'type=c' | sfdisk /dev/sdb
mkfs.vfat -F 32 -n "SEED_DATA" /dev/sdb1
mkdir -p /mnt/seed
mount /dev/sdb1 /mnt/seed
cat <<EOF > /mnt/seed/applications.yaml
version: "1"
applications:
- name: incus
EOF
cat <<EOF > /mnt/seed/install.yaml
force_install: true
target:
id: scsi-0QEMU_QEMU_HARDDISK_incus_root
EOF
cat <<EOF > /mnt/seed/incus.yaml
version: "1"
apply_defaults: false
preseed:
certificates:
- name: admin
type: client
certificate: |-
$(incus remote get-client-certificate | sed 's/^/ /')
description: Initial admin client
EOF
cat <<EOF > /mnt/seed/network.yaml
interfaces:
- addresses:
- 10.10.0.11/24
hwaddr: enp5s0
name: uplink
routes:
- to: 0.0.0.0/0
via: 10.10.0.1
roles:
- management
- addresses:
- 20.0.0.11/24
hwaddr: enp6s0
name: internal
roles:
- cluster
dns:
nameservers:
- 10.10.0.1
EOF
umount /mnt/seed
exit
incus config device remove client seed
```
7. Bootstrap server1
```
incus init --empty --vm server1 -c security.secureboot=false -c limits.cpu=1 -c limits.memory=4GiB --profile incus-os
incus config device add server1 boot-media disk pool=local source=IncusOS_202603240012.iso boot.priority=10
incus config device add server1 seed disk pool=local source=seed
incus start server1
# Wait until the installation is finished
incus config device remove server1 boot-media
incus config device remove server1 seed
```
8. Bootstrap server2
```
incus config device add client seed disk pool=local source=seed
incus exec client -- mount /dev/sdb1 /mnt/seed/
incus exec client -- vi /mnt/seed/network.yaml
# modify addresses to 10.10.0.12 and 20.0.0.12
incus exec client -- umount /mnt/seed/
incus config device remove client seed
incus init --empty --vm server2 -c security.secureboot=false -c limits.cpu=1 -c limits.memory=4GiB --profile incus-os
incus config device add server2 boot-media disk pool=local source=IncusOS_202603240012.iso boot.priority=10
incus config device add server2 seed disk pool=local source=seed
incus start server2
# Wait until the installation is finished
incus config device remove server2 boot-media
incus config device remove server2 seed
```
9. Bootstrap server3
```
incus config device add client seed disk pool=local source=seed
incus exec client -- mount /dev/sdb1 /mnt/seed/
incus exec client -- vi /mnt/seed/network.yaml
# modify addresses to 10.10.0.13 and 20.0.0.13
incus exec client -- umount /mnt/seed/
incus config device remove client seed
incus init --empty --vm server3 -c security.secureboot=false -c limits.cpu=1 -c limits.memory=4GiB --profile incus-os
incus config device add server3 boot-media disk pool=local source=IncusOS_202603240012.iso boot.priority=10
incus config device add server3 seed disk pool=local source=seed
incus start server3
# Wait until the installation is finished
incus config device remove server3 boot-media
incus config device remove server3 seed
```
10. Verify
```
# Confirm that all servers are running and have correct IP addresses
incus ls -cns4t
+---------+---------+------------------------+-----------------+
| NAME | STATE | IPV4 | TYPE |
+---------+---------+------------------------+-----------------+
| client | RUNNING | 10.10.0.4 (enp5s0) | VIRTUAL-MACHINE |
+---------+---------+------------------------+-----------------+
| server1 | RUNNING | 20.0.0.11 (_vinternal) | VIRTUAL-MACHINE |
| | | 10.10.0.11 (_vuplink) | |
+---------+---------+------------------------+-----------------+
| server2 | RUNNING | 20.0.0.12 (_vinternal) | VIRTUAL-MACHINE |
| | | 10.10.0.12 (_vuplink) | |
+---------+---------+------------------------+-----------------+
| server3 | RUNNING | 20.0.0.13 (_vinternal) | VIRTUAL-MACHINE |
| | | 10.10.0.13 (_vuplink) | |
+---------+---------+------------------------+-----------------+
# Verify that the client CANNOT access the internal network
incus exec client -- ping -c 3 20.0.0.11
> PING 20.0.0.11 (20.0.0.11) 56(84) bytes of data.
> From 20.0.0.11 icmp_seq=1 Destination Host Unreachable
> From 20.0.0.11 icmp_seq=2 Destination Host Unreachable
> From 20.0.0.11 icmp_seq=3 Destination Host Unreachable
```
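As an additional check (assuming `curl` is available in the client image and the IncusOS API is already listening on the default `:8443`), the API can be probed over the uplink network:

```
# The Incus API should answer over the uplink network (expect a JSON document)
incus exec client -- curl -ks https://10.10.0.11:8443/1.0
```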
11. [Optional] Create snapshots of the servers (in case something goes wrong)
```
incus stop server1 server2 server3
incus snapshot create server1 after-install
incus snapshot create server2 after-install
incus snapshot create server3 after-install
incus start server1 server2 server3
```
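For reference, rolling the servers back to these snapshots if a later step fails would look like this (assuming the snapshot names above; instances should be stopped before restoring):

```
# Roll all three servers back to the post-install state
incus stop server1 server2 server3 --force
incus snapshot restore server1 after-install
incus snapshot restore server2 after-install
incus snapshot restore server3 after-install
incus start server1 server2 server3
```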
12. Bootstrap IncusOS cluster using [the tutorial](https://linuxcontainers.org/incus-os/docs/main/tutorials/incus-cluster/)
```
incus exec client -- bash
# on client:
incus remote add server1 10.10.0.11
incus remote add server2 10.10.0.12
incus remote add server3 10.10.0.13
incus config set server1: cluster.https_address=20.0.0.11:8443
incus cluster enable server1: server1
> Clustering enabled
incus remote add my-cluster 10.10.0.11:8443
> Certificate fingerprint: b6ff5c85aed196eee22ae7e7364dd5dd0d6700a72e09ce8f44b6c1861bf818fe
> ok (y/n/[fingerprint])? y
incus cluster join my-cluster: server2:
> What IP address or DNS name should be used to reach this server? [default=20.0.0.12]:
> What member name should be used to identify this server in the cluster? [default=35ef33da-ee1d-4277-a245-aa2d6baa7be5]: server2
> All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
> Error connecting to existing cluster member "20.0.0.11:8443": Get "https://20.0.0.11:8443": Unable to connect to: 20.0.0.11:8443 ([dial tcp 20.0.0.11:8443: i/o timeout])
```
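The timeout is expected at this point: the join is driven from `client`, which has no interface on the internal network yet, so nothing on it can reach `cluster.https_address` directly. A quick check (still inside the client shell, assuming `curl` is present):

```
# Still on client: requests to the internal address fail
curl -ks --max-time 5 https://20.0.0.11:8443/1.0 \
  || echo "cluster.https_address is unreachable from the client"
```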
13. Observe that, despite the connection error, the cluster appears to have formed anyway
```
incus cluster ls my-cluster:
+---------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| NAME | URL | ROLES | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATUS | MESSAGE |
+---------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| server1 | https://20.0.0.11:8443 | database-leader | x86_64 | default | | ONLINE | Fully operational |
| | | database | | | | | |
+---------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| server2 | https://20.0.0.12:8443 | database-standby | x86_64 | default | | ONLINE | Fully operational |
+---------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
exit
```
14. Make the internal network reachable and join server3
```
incus config device add client eth1 nic network=internal
incus exec client -- bash
# on client:
ip addr add 20.0.0.10/24 dev enp6s0
ip link set enp6s0 up
# Verify that the client can now ping the internal network
ping -c 3 20.0.0.11
> PING 20.0.0.11 (20.0.0.11) 56(84) bytes of data.
> 64 bytes from 20.0.0.11: icmp_seq=1 ttl=64 time=1.57 ms
> 64 bytes from 20.0.0.11: icmp_seq=2 ttl=64 time=0.792 ms
> 64 bytes from 20.0.0.11: icmp_seq=3 ttl=64 time=0.676 ms
# Add server3 to cluster
incus cluster join my-cluster: server3:
> What IP address or DNS name should be used to reach this server? [default=20.0.0.13]:
> What member name should be used to identify this server in the cluster? [default=8f1cd2bf-ea20-4c51-a408-d1b6dac91e97]: server3
> All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
# Observe that the join now completes without any error
incus cluster ls my-cluster:
+---------+------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| NAME | URL | ROLES | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATUS | MESSAGE |
+---------+------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| server1 | https://20.0.0.11:8443 | database-leader | x86_64 | default | | ONLINE | Fully operational |
| | | database | | | | | |
+---------+------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| server2 | https://20.0.0.12:8443 | database | x86_64 | default | | ONLINE | Fully operational |
+---------+------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| server3 | https://20.0.0.13:8443 | database | x86_64 | default | | ONLINE | Fully operational |
+---------+------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
```
Also, even if this is just a race condition, I do not think the client should ever send requests to remote hosts using `cluster.https_address`; that address is meant for internal cluster traffic only.
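For completeness, the effective server-level addresses can be read back from the client to confirm which network each key ended up on (using the `my-cluster` remote added above):

```
# Read the server-level address keys back from the cluster
incus config get my-cluster: core.https_address
incus config get my-cluster: cluster.https_address
```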