'Error: value too large for defined data type' problem when exec'ing newly created instances

I’m using LXD 4.23 on RHEL 8 (fully updated and upgraded) and I’m creating CentOS/8-Stream/cloud and AlmaLinux/8/cloud instances via Python/pylxd, with cloud-init user-data etc. configured through a profile.

LXD is configured with lxd init --auto, so there are no custom filesystems, networking, etc.

My scripts monitor the cloud-init install and they all succeeded… But sometimes when I try an lxc exec into them I get the error ‘Error: value too large for defined data type’ and the command bails.

If I do a restart of the instance with ‘lxc restart <instance>’ I see the message ‘Remapping disk’ and afterwards they work.

Does anyone know what this issue is and how to fix it?

That’s a very odd error which doesn’t seem to originate from LXD itself (no hit in the codebase).

When this happens, any chance you can run the command with --debug to try and get more details? Server side log (lxd.log) may also help.

I wonder if a resource limit has been hit… But yes I’ll do a --debug the next time it occurs which will probably be this morning as I’m testing my solution.

Actual error response is at the end… Apologies for the loss of debug formatting… The container doesn’t appear to be in a usable state even though the cloud-init task completed…

^C[atcore@DEV-CACHE-QA1 shard]$ lxc --debug exec avshard-2-22031533-APLPRD bash > t
DBUG[03-16|10:11:15] Connecting to a local LXD over a Unix socket
DBUG[03-16|10:11:15] Sending request to LXD                   method=GET url=http://unix.socket/1.0 etag=
DBUG[03-16|10:11:15] Got response struct from LXD
DBUG[03-16|10:11:15]
	{
		"config": {
			"core.https_address": "[::]:8443",
			"core.trust_password": true
		},
		"api_extensions": [
			"storage_zfs_remove_snapshots",
			"container_host_shutdown_timeout",
			"container_stop_priority",
			"container_syscall_filtering",
			"auth_pki",
			"container_last_used_at",
			"etag",
			"patch",
			"usb_devices",
			"https_allowed_credentials",
			"image_compression_algorithm",
			"directory_manipulation",
			"container_cpu_time",
			"storage_zfs_use_refquota",
			"storage_lvm_mount_options",
			"network",
			"profile_usedby",
			"container_push",
			"container_exec_recording",
			"certificate_update",
			"container_exec_signal_handling",
			"gpu_devices",
			"container_image_properties",
			"migration_progress",
			"id_map",
			"network_firewall_filtering",
			"network_routes",
			"storage",
			"file_delete",
			"file_append",
			"network_dhcp_expiry",
			"storage_lvm_vg_rename",
			"storage_lvm_thinpool_rename",
			"network_vlan",
			"image_create_aliases",
			"container_stateless_copy",
			"container_only_migration",
			"storage_zfs_clone_copy",
			"unix_device_rename",
			"storage_lvm_use_thinpool",
			"storage_rsync_bwlimit",
			"network_vxlan_interface",
			"storage_btrfs_mount_options",
			"entity_description",
			"image_force_refresh",
			"storage_lvm_lv_resizing",
			"id_map_base",
			"file_symlinks",
			"container_push_target",
			"network_vlan_physical",
			"storage_images_delete",
			"container_edit_metadata",
			"container_snapshot_stateful_migration",
			"storage_driver_ceph",
			"storage_ceph_user_name",
			"resource_limits",
			"storage_volatile_initial_source",
			"storage_ceph_force_osd_reuse",
			"storage_block_filesystem_btrfs",
			"resources",
			"kernel_limits",
			"storage_api_volume_rename",
			"macaroon_authentication",
			"network_sriov",
			"console",
			"restrict_devlxd",
			"migration_pre_copy",
			"infiniband",
			"maas_network",
			"devlxd_events",
			"proxy",
			"network_dhcp_gateway",
			"file_get_symlink",
			"network_leases",
			"unix_device_hotplug",
			"storage_api_local_volume_handling",
			"operation_description",
			"clustering",
			"event_lifecycle",
			"storage_api_remote_volume_handling",
			"nvidia_runtime",
			"container_mount_propagation",
			"container_backup",
			"devlxd_images",
			"container_local_cross_pool_handling",
			"proxy_unix",
			"proxy_udp",
			"clustering_join",
			"proxy_tcp_udp_multi_port_handling",
			"network_state",
			"proxy_unix_dac_properties",
			"container_protection_delete",
			"unix_priv_drop",
			"pprof_http",
			"proxy_haproxy_protocol",
			"network_hwaddr",
			"proxy_nat",
			"network_nat_order",
			"container_full",
			"candid_authentication",
			"backup_compression",
			"candid_config",
			"nvidia_runtime_config",
			"storage_api_volume_snapshots",
			"storage_unmapped",
			"projects",
			"candid_config_key",
			"network_vxlan_ttl",
			"container_incremental_copy",
			"usb_optional_vendorid",
			"snapshot_scheduling",
			"snapshot_schedule_aliases",
			"container_copy_project",
			"clustering_server_address",
			"clustering_image_replication",
			"container_protection_shift",
			"snapshot_expiry",
			"container_backup_override_pool",
			"snapshot_expiry_creation",
			"network_leases_location",
			"resources_cpu_socket",
			"resources_gpu",
			"resources_numa",
			"kernel_features",
			"id_map_current",
			"event_location",
			"storage_api_remote_volume_snapshots",
			"network_nat_address",
			"container_nic_routes",
			"rbac",
			"cluster_internal_copy",
			"seccomp_notify",
			"lxc_features",
			"container_nic_ipvlan",
			"network_vlan_sriov",
			"storage_cephfs",
			"container_nic_ipfilter",
			"resources_v2",
			"container_exec_user_group_cwd",
			"container_syscall_intercept",
			"container_disk_shift",
			"storage_shifted",
			"resources_infiniband",
			"daemon_storage",
			"instances",
			"image_types",
			"resources_disk_sata",
			"clustering_roles",
			"images_expiry",
			"resources_network_firmware",
			"backup_compression_algorithm",
			"ceph_data_pool_name",
			"container_syscall_intercept_mount",
			"compression_squashfs",
			"container_raw_mount",
			"container_nic_routed",
			"container_syscall_intercept_mount_fuse",
			"container_disk_ceph",
			"virtual-machines",
			"image_profiles",
			"clustering_architecture",
			"resources_disk_id",
			"storage_lvm_stripes",
			"vm_boot_priority",
			"unix_hotplug_devices",
			"api_filtering",
			"instance_nic_network",
			"clustering_sizing",
			"firewall_driver",
			"projects_limits",
			"container_syscall_intercept_hugetlbfs",
			"limits_hugepages",
			"container_nic_routed_gateway",
			"projects_restrictions",
			"custom_volume_snapshot_expiry",
			"volume_snapshot_scheduling",
			"trust_ca_certificates",
			"snapshot_disk_usage",
			"clustering_edit_roles",
			"container_nic_routed_host_address",
			"container_nic_ipvlan_gateway",
			"resources_usb_pci",
			"resources_cpu_threads_numa",
			"resources_cpu_core_die",
			"api_os",
			"container_nic_routed_host_table",
			"container_nic_ipvlan_host_table",
			"container_nic_ipvlan_mode",
			"resources_system",
			"images_push_relay",
			"network_dns_search",
			"container_nic_routed_limits",
			"instance_nic_bridged_vlan",
			"network_state_bond_bridge",
			"usedby_consistency",
			"custom_block_volumes",
			"clustering_failure_domains",
			"resources_gpu_mdev",
			"console_vga_type",
			"projects_limits_disk",
			"network_type_macvlan",
			"network_type_sriov",
			"container_syscall_intercept_bpf_devices",
			"network_type_ovn",
			"projects_networks",
			"projects_networks_restricted_uplinks",
			"custom_volume_backup",
			"backup_override_name",
			"storage_rsync_compression",
			"network_type_physical",
			"network_ovn_external_subnets",
			"network_ovn_nat",
			"network_ovn_external_routes_remove",
			"tpm_device_type",
			"storage_zfs_clone_copy_rebase",
			"gpu_mdev",
			"resources_pci_iommu",
			"resources_network_usb",
			"resources_disk_address",
			"network_physical_ovn_ingress_mode",
			"network_ovn_dhcp",
			"network_physical_routes_anycast",
			"projects_limits_instances",
			"network_state_vlan",
			"instance_nic_bridged_port_isolation",
			"instance_bulk_state_change",
			"network_gvrp",
			"instance_pool_move",
			"gpu_sriov",
			"pci_device_type",
			"storage_volume_state",
			"network_acl",
			"migration_stateful",
			"disk_state_quota",
			"storage_ceph_features",
			"projects_compression",
			"projects_images_remote_cache_expiry",
			"certificate_project",
			"network_ovn_acl",
			"projects_images_auto_update",
			"projects_restricted_cluster_target",
			"images_default_architecture",
			"network_ovn_acl_defaults",
			"gpu_mig",
			"project_usage",
			"network_bridge_acl",
			"warnings",
			"projects_restricted_backups_and_snapshots",
			"clustering_join_token",
			"clustering_description",
			"server_trusted_proxy",
			"clustering_update_cert",
			"storage_api_project",
			"server_instance_driver_operational",
			"server_supported_storage_drivers",
			"event_lifecycle_requestor_address",
			"resources_gpu_usb",
			"clustering_evacuation",
			"network_ovn_nat_address",
			"network_bgp",
			"network_forward",
			"custom_volume_refresh",
			"network_counters_errors_dropped",
			"metrics",
			"image_source_project",
			"clustering_config",
			"network_peer",
			"linux_sysctl",
			"network_dns",
			"ovn_nic_acceleration",
			"certificate_self_renewal",
			"instance_project_move",
			"storage_volume_project_move",
			"cloud_init",
			"network_dns_nat",
			"database_leader",
			"instance_all_projects",
			"clustering_groups",
			"ceph_rbd_du",
			"instance_get_full",
			"qemu_metrics",
			"gpu_mig_uuid",
			"event_project",
			"clustering_evacuation_live",
			"instance_allow_inconsistent_copy",
			"network_state_ovn",
			"storage_volume_api_filtering",
			"image_restrictions",
			"storage_zfs_export",
			"network_dns_records",
			"storage_zfs_reserve_space",
			"network_acl_log",
			"storage_zfs_blocksize",
			"metrics_cpu_seconds",
			"instance_snapshot_never",
			"certificate_token",
			"instance_nic_routed_neighbor_probe",
			"event_hub",
			"agent_nic_config",
			"projects_restricted_intercept",
			"metrics_authentication"
		],
		"api_status": "stable",
		"api_version": "1.0",
		"auth": "trusted",
		"public": false,
		"auth_methods": [
			"tls"
		],
		"environment": {
			"addresses": [
				"10.21.75.39:8443",
				"10.208.75.1:8443"
			],
			"architectures": [
				"x86_64",
				"i686"
			],
			"certificate": "-----BEGIN CERTIFICATE-----\nMIICNzCCAb2gAwIBAgIRAImPhOFcirpj7vF73LzWs60wCgYIKoZIzj0EAwMwRTEc\nMBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzElMCMGA1UEAwwccm9vdEBERVYt\nQ0FDSEUtUUExLmNibC5sb2NhbDAeFw0yMjAzMDQxNDAzMThaFw0zMjAzMDExNDAz\nMThaMEUxHDAaBgNVBAoTE2xpbnV4Y29udGFpbmVycy5vcmcxJTAjBgNVBAMMHHJv\nb3RAREVWLUNBQ0hFLVFBMS5jYmwubG9jYWwwdjAQBgcqhkjOPQIBBgUrgQQAIgNi\nAAQL9xKYutaCZ+Q8Xbi8113Ke6jVVi5RaKdreR5Tq/JvYu+3MGrABax3n8ePyFuv\nLSmd2Dk8QOKaVdXhUlK3AJi5mxnXvQi/l3yDXxuF3hoocqCJZ5fW/W9d5J1vgvDE\nuYqjcTBvMA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEFBQcDATAMBgNV\nHRMBAf8EAjAAMDoGA1UdEQQzMDGCF0RFVi1DQUNIRS1RQTEuY2JsLmxvY2FshwR/\nAAABhxAAAAAAAAAAAAAAAAAAAAABMAoGCCqGSM49BAMDA2gAMGUCMCizWr0tkPe+\nR9bHjZCEE3Cc/yBcUM4MKGOzRP+ckIeRpq+F3I8SuDzRR+kyiVX2BAIxANaJ8vM5\nKt/G7938GnkVuVRBFzqnXd7dr0SsdKzmNyor2ZnoUfrBDtRFeoP6Wjpbnw==\n-----END CERTIFICATE-----\n",
			"certificate_fingerprint": "a4d368f4d347cfceb35c0af3a46df0a8f0148bdf3b56192a2e42d022dab86761",
			"driver": "qemu | lxc",
			"driver_version": "6.1.1 | 4.0.12",
			"firewall": "xtables",
			"kernel": "Linux",
			"kernel_architecture": "x86_64",
			"kernel_features": {
				"idmapped_mounts": "false",
				"netnsid_getifaddrs": "true",
				"seccomp_listener": "false",
				"seccomp_listener_continue": "false",
				"shiftfs": "false",
				"uevent_injection": "true",
				"unpriv_fscaps": "true"
			},
			"kernel_version": "4.18.0-348.12.2.el8_5.x86_64",
			"lxc_features": {
				"cgroup2": "true",
				"core_scheduling": "true",
				"devpts_fd": "true",
				"idmapped_mounts_v2": "true",
				"mount_injection_file": "true",
				"network_gateway_device_route": "true",
				"network_ipvlan": "true",
				"network_l2proxy": "true",
				"network_phys_macvlan_mtu": "true",
				"network_veth_router": "true",
				"pidfd": "true",
				"seccomp_allow_deny_syntax": "true",
				"seccomp_notify": "true",
				"seccomp_proxy_send_notify_fd": "true"
			},
			"os_name": "Red Hat Enterprise Linux",
			"os_version": "8.5",
			"project": "default",
			"server": "lxd",
			"server_clustered": false,
			"server_event_mode": "full-mesh",
			"server_name": "DEV-CACHE-QA1.cbl.local",
			"server_pid": 4651,
			"server_version": "4.24",
			"storage": "dir",
			"storage_version": "1",
			"storage_supported_drivers": [
				{
					"Name": "cephfs",
					"Version": "15.2.14",
					"Remote": true
				},
				{
					"Name": "dir",
					"Version": "1",
					"Remote": false
				},
				{
					"Name": "lvm",
					"Version": "2.03.07(2) (2019-11-30) / 1.02.167 (2019-11-30) / 4.43.0",
					"Remote": false
				},
				{
					"Name": "ceph",
					"Version": "15.2.14",
					"Remote": true
				},
				{
					"Name": "btrfs",
					"Version": "5.4.1",
					"Remote": false
				}
			]
		}
	}
DBUG[03-16|10:11:15] Connected to the websocket: ws://unix.socket/1.0/events
DBUG[03-16|10:11:15] Sending request to LXD                   method=POST url=http://unix.socket/1.0/instances/avshard-2-22031533-APLPRD/exec etag=
DBUG[03-16|10:11:15]
	{
		"command": [
			"bash"
		],
		"wait-for-websocket": true,
		"interactive": false,
		"environment": {
			"TERM": "screen-256color"
		},
		"width": 0,
		"height": 0,
		"record-output": false,
		"user": 0,
		"group": 0,
		"cwd": ""
	}
DBUG[03-16|10:11:15] Got operation from LXD
DBUG[03-16|10:11:15]
	{
		"id": "c107a10e-4ad2-4973-8ffe-9d3ce891cf56",
		"class": "websocket",
		"description": "Executing command",
		"created_at": "2022-03-16T10:11:15.041627658Z",
		"updated_at": "2022-03-16T10:11:15.041627658Z",
		"status": "Running",
		"status_code": 103,
		"resources": {
			"containers": [
				"/1.0/containers/avshard-2-22031533-APLPRD"
			],
			"instances": [
				"/1.0/instances/avshard-2-22031533-APLPRD"
			]
		},
		"metadata": {
			"command": [
				"bash"
			],
			"environment": {
				"HOME": "/root",
				"LANG": "C.UTF-8",
				"PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
				"TERM": "screen-256color",
				"USER": "root"
			},
			"fds": {
				"0": "e280d554777e2071ec3b103409b05b74d99b228525dda70b1d79f4d162875fa0",
				"1": "372d297dd28936ee74f7085faf178238f1b4acb23790f32f600c8b9e4214bdb3",
				"2": "b3fa4f8059175e94a75d03ed7ac5775f3453edcafe80bb8a6116aced535ce269",
				"control": "c31877b220ffa4c485823a686061b9aa997ab80da4277e8aea48055d3c3d421d"
			},
			"interactive": false
		},
		"may_cancel": false,
		"err": "",
		"location": "none"
	}
DBUG[03-16|10:11:15] Connected to the websocket: ws://unix.socket/1.0/operations/c107a10e-4ad2-4973-8ffe-9d3ce891cf56/websocket?secret=c31877b220ffa4c485823a686061b9aa997ab80da4277e8aea48055d3c3d421d
DBUG[03-16|10:11:15] Connected to the websocket: ws://unix.socket/1.0/operations/c107a10e-4ad2-4973-8ffe-9d3ce891cf56/websocket?secret=e280d554777e2071ec3b103409b05b74d99b228525dda70b1d79f4d162875fa0
DBUG[03-16|10:11:15] Connected to the websocket: ws://unix.socket/1.0/operations/c107a10e-4ad2-4973-8ffe-9d3ce891cf56/websocket?secret=372d297dd28936ee74f7085faf178238f1b4acb23790f32f600c8b9e4214bdb3
DBUG[03-16|10:11:15] Connected to the websocket: ws://unix.socket/1.0/operations/c107a10e-4ad2-4973-8ffe-9d3ce891cf56/websocket?secret=b3fa4f8059175e94a75d03ed7ac5775f3453edcafe80bb8a6116aced535ce269
DBUG[03-16|10:11:15] Sending request to LXD                   method=GET url=http://unix.socket/1.0/operations/c107a10e-4ad2-4973-8ffe-9d3ce891cf56 etag=
DBUG[03-16|10:11:15] Got response struct from LXD
DBUG[03-16|10:11:15]
	{
		"id": "c107a10e-4ad2-4973-8ffe-9d3ce891cf56",
		"class": "websocket",
		"description": "Executing command",
		"created_at": "2022-03-16T10:11:15.041627658Z",
		"updated_at": "2022-03-16T10:11:15.041627658Z",
		"status": "Running",
		"status_code": 103,
		"resources": {
			"containers": [
				"/1.0/containers/avshard-2-22031533-APLPRD"
			],
			"instances": [
				"/1.0/instances/avshard-2-22031533-APLPRD"
			]
		},
		"metadata": {
			"command": [
				"bash"
			],
			"environment": {
				"HOME": "/root",
				"LANG": "C.UTF-8",
				"PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
				"TERM": "screen-256color",
				"USER": "root"
			},
			"fds": {
				"0": "e280d554777e2071ec3b103409b05b74d99b228525dda70b1d79f4d162875fa0",
				"1": "372d297dd28936ee74f7085faf178238f1b4acb23790f32f600c8b9e4214bdb3",
				"2": "b3fa4f8059175e94a75d03ed7ac5775f3453edcafe80bb8a6116aced535ce269",
				"control": "c31877b220ffa4c485823a686061b9aa997ab80da4277e8aea48055d3c3d421d"
			},
			"interactive": false
		},
		"may_cancel": false,
		"err": "",
		"location": "none"
	}

DBUG[03-16|10:16:43] Connected to the websocket: ws://unix.socket/1.0/operations/8311e3ea-a96b-4868-ae9f-5af7505a8505/websocket?secret=4b05789af98d82434531eec4a63b77f9f8e006aacc3ea43d340b0a4236a048da
DBUG[03-16|10:16:43] Connected to the websocket: ws://unix.socket/1.0/operations/8311e3ea-a96b-4868-ae9f-5af7505a8505/websocket?secret=74c65c0db286a753e41d112ad073d6dd0fedef2f4ceb8917fbe281a08f961214
DBUG[03-16|10:16:43] Sending request to LXD                   method=GET url=http://unix.socket/1.0/operations/8311e3ea-a96b-4868-ae9f-5af7505a8505 etag=
DBUG[03-16|10:16:43] WebsocketRecvStream got error getting next reader err="websocket: close 1006 (abnormal closure): unexpected EOF"
DBUG[03-16|10:16:43] Got response struct from LXD
DBUG[03-16|10:16:43]
	{
		"id": "8311e3ea-a96b-4868-ae9f-5af7505a8505",
		"class": "websocket",
		"description": "Executing command",
		"created_at": "2022-03-16T10:16:43.191970711Z",
		"updated_at": "2022-03-16T10:16:43.191970711Z",
		"status": "Failure",
		"status_code": 400,
		"resources": {
			"containers": [
				"/1.0/containers/avshard-2-22031533-APLPRD"
			],
			"instances": [
				"/1.0/instances/avshard-2-22031533-APLPRD"
			]
		},
		"metadata": {
			"command": [
				"ls"
			],
			"environment": {
				"HOME": "/root",
				"LANG": "C.UTF-8",
				"PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
				"TERM": "screen-256color",
				"USER": "root"
			},
			"fds": {
				"0": "74c65c0db286a753e41d112ad073d6dd0fedef2f4ceb8917fbe281a08f961214",
				"control": "4b05789af98d82434531eec4a63b77f9f8e006aacc3ea43d340b0a4236a048da"
			},
			"interactive": true
		},
		"may_cancel": false,
		"err": "value too large for defined data type",
		"location": "none"
	}
Error: value too large for defined data type

Also of note: I can’t stop the instance without using --force… When I used --force, the host’s default route failed, and when I got into the box via another machine on the local subnet I could see the instance had stopped…

Now the issue with the default route appears to be that a default route has been added for the lxdbr0 bridge…

default via 10.208.75.1 dev vethbb45249a proto dhcp metric 102
default via 10.21.75.214 dev bond0 proto static metric 300
10.21.0.0/16 dev bond0 proto kernel scope link src 10.21.75.39 metric 300
10.208.75.0/24 dev vethbb45249a proto kernel scope link src 10.208.75.109 metric 102

With lxd disabled the routing table looks like:

default via 10.21.75.214 dev bond0 proto static metric 300
10.21.0.0/16 dev bond0 proto kernel scope link src 10.21.75.39 metric 300

When I re-enable lxd (after a reboot) the routing is:

default via 10.21.75.214 dev bond0 proto static metric 300
10.21.0.0/16 dev bond0 proto kernel scope link src 10.21.75.39 metric 300
10.208.75.0/24 dev lxdbr0 proto kernel scope link src 10.208.75.1
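To spot when the broken state (two default routes) has appeared on the host, the routing table can be checked programmatically. A minimal sketch, assuming `ip -j route` is available for JSON output (iproute2 supports this on recent releases); the helper names are mine:

```python
import json
import subprocess

def find_default_routes(routes):
    """Return the default-route entries from a parsed `ip -j route` list."""
    return [r for r in routes if r.get("dst") == "default"]

def check_host_routes():
    # `ip -j route` emits the host routing table as JSON.
    out = subprocess.run(["ip", "-j", "route"],
                         capture_output=True, text=True, check=True)
    defaults = find_default_routes(json.loads(out.stdout))
    if len(defaults) > 1:
        # More than one default route: e.g. the static one via bond0 plus
        # a stray DHCP route via a veth device, as in the tables above.
        for r in defaults:
            print(f"default via {r.get('gateway')} dev {r.get('dev')}")
    return defaults

# Example with entries mirroring the broken table above:
sample = [
    {"dst": "default", "gateway": "10.208.75.1", "dev": "vethbb45249a",
     "protocol": "dhcp", "metric": 102},
    {"dst": "default", "gateway": "10.21.75.214", "dev": "bond0",
     "protocol": "static", "metric": 300},
    {"dst": "10.21.0.0/16", "dev": "bond0", "protocol": "kernel", "metric": 300},
]
print(len(find_default_routes(sample)))  # 2 -> the broken state
```

Running `check_host_routes()` on the affected host while an instance is misbehaving would confirm whether the veth default route is present at that moment.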

This looks easily reproducible if you want any other information…

Is the container operational if you don’t use cloud-init config? I’d first confirm that the container functions with/without cloud-init before looking at the exec problem.

I’ve not noticed any difference whether cloud-init runs or not…

My pylxd script adds proxy devices to the container after cloud-init has completed… I’ve commented those out and it doesn’t appear to lock up any more…

I’ll have to run the script a few times to see if it’s a solid lead…
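For reference, attaching a proxy device from pylxd looks roughly like this. A sketch only: the device name and ports are made up, and `client` is assumed to be a `pylxd.Client` connected to the host’s unix socket:

```python
def make_proxy_device(listen, connect):
    """Build an LXD proxy device config dict (host `listen` -> instance `connect`)."""
    return {"type": "proxy", "listen": listen, "connect": connect}

def add_proxy(client, instance_name, device_name, listen, connect):
    """Attach a proxy device to a running instance via pylxd."""
    inst = client.instances.get(instance_name)
    inst.devices[device_name] = make_proxy_device(listen, connect)
    inst.save(wait=True)  # persist the device change to LXD

# Usage (hypothetical names/ports):
# from pylxd import Client
# add_proxy(Client(), "avshard-2-22031533-APLPRD", "web",
#           "tcp:0.0.0.0:8080", "tcp:127.0.0.1:80")

print(make_proxy_device("tcp:0.0.0.0:8080", "tcp:127.0.0.1:80"))
```

If commenting out the proxy devices avoids the lock-up, isolating whether it is the device add itself or its timing (immediately after cloud-init) would narrow things down.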

What I mean is your post here 'Error: value too large for defined data type' problem when exec'ing newly created instances - #5 by Ozymandias suggests there are larger issues with the container than not being able to run specific exec commands.

So those need to be resolved first.

Can we back up a bit and clarify the problem.

Can you launch a fresh container with that configuration and then just run lxc exec <instance> -- bash to get into it?

I created three instances (test, test2, test3) using a command like…

lxc launch images:centos/8-Stream/cloud test2

And there are no problems with lxc exec…

However, none of them has been assigned an IP address. This is because NetworkManager in the cloud image ignores veth devices.

I work around that in my cloud-init by doing the following commands in the bootcmd section…

    bootcmd:
    - [ cloud-init-per, once, nmdis, systemctl, disable, NetworkManager, --now ]
    - [ cloud-init-per, once, eth0up, dhclient, eth0 ]
    - [ cloud-init-per, once, epel, yum, -y, install, epel-release, network-scripts ]
    - [ cloud-init-per, once, nwup, systemctl, enable, network, --now ]

And then you can exec into them manually OK?

No problems so far… Though it’s worth noting that sometimes I can exec into the cloud-init’ed instances as well… It’s not reproducible 100% of the time…

OK so 'Error: value too large for defined data type' problem when exec'ing newly created instances - #5 by Ozymandias wasn’t really about this thread?

So looking at this, the command being run is just bash? Is that correct, and that generates the error?

I’m wondering how it is different to what you’re doing when it works in 'Error: value too large for defined data type' problem when exec'ing newly created instances - #12 by Ozymandias

Step by step reproducer steps would be great, thanks! :slight_smile:

I don’t understand. There appears to be a problem when I create instances where they stop working and I can’t exec into them.

The NetworkManager issue with the RHEL derivatives has a workaround, and hopefully somebody will fix the images at some point in the future. It isn’t relevant to this problem, but since none of the containers get IP addresses it’s a problem in its own right…

I’ve just deleted and recreated the three test instances using cloud-init and cloud-init has succeeded and they are all contactable.

So no difference with or without cloud-init. Which is why I started to look at what else my pylxd script was doing… e.g. proxies…

Yes the whole networking post confused me, as you should be able to exec into a container even if the network isn’t running. I’ll just discount that part in my head.

But I’m still a bit confused as to the steps that cause the issue, as in 'Error: value too large for defined data type' problem when exec'ing newly created instances - #12 by Ozymandias you said you can exec into the container.

Some more information:-

My pylxd script creates the instance, starts the instance and then execs ‘cloud-init status --wait’ so that I can see the result of the install. That appears to work…

However the containers are sometimes broken after that point… Which is odd, as they have IP addresses and the cloud-init ran and produced output.
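The workflow described above, roughly, as a pylxd sketch. The names and image alias are illustrative, and `client` is assumed to be a `pylxd.Client`; in pylxd, `execute()` returns an `(exit_code, stdout, stderr)` tuple:

```python
def make_instance_config(name, alias, profiles=("default",)):
    """Build an LXD instance-create request for an images: remote alias."""
    return {
        "name": name,
        "profiles": list(profiles),
        "source": {
            "type": "image",
            "protocol": "simplestreams",
            "server": "https://images.linuxcontainers.org",
            "alias": alias,
        },
    }

def provision(client, name, alias):
    """Create and start an instance, then block until cloud-init finishes."""
    inst = client.instances.create(make_instance_config(name, alias), wait=True)
    inst.start(wait=True)
    # Wait for cloud-init so the install result is visible to the script.
    exit_code, stdout, stderr = inst.execute(["cloud-init", "status", "--wait"])
    return inst, exit_code

# Usage (hypothetical):
# from pylxd import Client
# inst, rc = provision(Client(), "test2", "centos/8-Stream/cloud")

print(make_instance_config("test2", "centos/8-Stream/cloud")["name"])
```

A zero exit code from `cloud-init status --wait` here matches the observation above that cloud-init reports success even when the container later turns out to be broken.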

When I attempt to restart them with ‘lxc restart <instance>’ it hangs…
When I force delete them with ‘lxc restore --force’, a default route via lxdbr0 is added to the host machine’s routing table, and that breaks networking outside the local subnet.

A thought has occurred to me: I’m running my script inside a container which is connecting to the host’s lxd unix socket… Let me repeat the tests from there…

Can you recreate the issue without pylxd out of interest?

lxc restore doesn’t delete? Do you mean lxc delete -f?

Did you try lxc restart -f too?