Any ideas what’s different about them? Or, if you run the command repeatedly, do they show up? Also, have you tried adding a filter for the name of one of the missing instances (while still keeping the 4 option) to see if it appears?
On an unrelated (I think) issue, I actually significantly reworked lxc list the other day (not in the snap yet).
So there’s a decent chance this is already fixed.
But I would like to see if we can figure out what is wrong to double check.
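For example, something along these lines, with <missing-instance> standing in as a placeholder for the name of one of the instances that isn’t showing up:

lxc ls -c 4n <missing-instance>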
To be honest there is absolutely nothing different about them; they all use the same profile. I tried the filter (even from a remote) but to no avail.
❯ lxc ls -c 4n
+----------------------+--------------+
| IPV4 | NAME |
+----------------------+--------------+
| 240.11.0.20 (eth1) | maas-proxy-1 |
| 10.10.10.104 (eth2) | |
+----------------------+--------------+
| 240.11.0.20 (eth1) | maas-proxy-1 |
| 10.10.10.104 (eth2) | |
+----------------------+--------------+
| 240.11.0.20 (eth1) | maas-proxy-1 |
| 10.10.10.104 (eth2) | |
+----------------------+--------------+
| 240.11.0.20 (eth1) | maas-proxy-1 |
| 10.10.10.104 (eth2) | |
+----------------------+--------------+
| 240.12.0.41 (eth1) | maas-vault-1 |
| 10.10.10.107 (eth2) | |
+----------------------+--------------+
| 240.12.0.68 (eth1) | maas-ha-2 |
| 10.10.99.102 (eth3) | |
| 10.10.10.102 (eth2) | |
+----------------------+--------------+
| 240.12.0.95 (eth0) | juju-client |
| 10.64.182.1 (lxdbr0) | |
| 10.10.99.115 (eth2) | |
| 10.10.10.115 (eth1) | |
+----------------------+--------------+
| 240.12.0.129 (eth1) | maas-ha-db-2 |
| 10.10.10.106 (eth2) | |
+----------------------+--------------+
| 240.13.0.92 (eth1) | maas-ha-3 |
| 10.10.99.103 (eth3) | |
| 10.10.10.103 (eth2) | |
+----------------------+--------------+
❯ lxc ls -c 4n maas-ha-db
+---------------------+--------------+
| IPV4 | NAME |
+---------------------+--------------+
| 240.12.0.129 (eth1) | maas-ha-db-2 |
| 10.10.10.106 (eth2) | |
+---------------------+--------------+
❯ lxc ls -c 4n maas-ha-db-1
+------+------+
| IPV4 | NAME |
+------+------+
Can you show lxc config show <instance> --expanded for one of the appearing instances and one of the equivalent missing ones please?
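For reference, that would be something like the following (maas-ha-2 and maas-ha-db-1 are only examples, taken from the listings above, of an instance that appears and one that does not):

lxc config show maas-ha-2 --expanded
lxc config show maas-ha-db-1 --expanded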
Yes, that was the first thing I did, but maybe we can use your expertise here
WORKING ONE:
architecture: x86_64
config:
boot.autostart: "true"
image.architecture: amd64
image.description: Ubuntu jammy amd64 (20230227_07:42)
image.os: Ubuntu
image.release: jammy
image.serial: "20230227_07:42"
image.type: squashfs
image.variant: cloud
limits.cpu: "2"
limits.memory: 4GB
linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
raw.lxc: |
lxc.apparmor.profile=unconfined
lxc.mount.auto=proc:rw sys:rw
lxc.cap.drop=
security.nesting: "true"
security.privileged: "true"
user.network-config: |-
version: 1
config:
- type: physical
name: eth1
subnets:
- type: static
ipv4: true
address: 10.10.10.102/24
netmask: 255.255.255.0
gateway: 10.10.10.1
control: auto
- type: physical
name: eth2
subnets:
- type: dhcp
- type: physical
name: eth3
subnets:
- type: static
ipv4: true
address: 10.10.99.102/24
netmask: 255.255.255.0
control: auto
- type: nameserver
address: 1.1.1.1
user.user-data: "#cloud-config\npackages:\n - snapd\n - ssh\n - net-tools\n -
jq\n - nano \n - traceroute\n - screen\n - iptables\nusers:\n - default\n
\ - name: passwd: groups: sudo\n sudo: ALL=(ALL)
NOPASSWD:ALL\n ssh_authorized_keys:\n - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDLYl/2jfGlX4oC9caSBTTuPBY0iMI24kK/y6h081CIMmpM8wRmbxPWlhLl+H8o961trhFtoqb9VCxwsztUwIjtGhjMXlLTTdeZ/50O+crOclFnr2kyEyLSKIN1puR1yk/qGOGVpc+F0qh3iUIovZ5V0KP1wDX0Cl/bjAIVsncrukcZgud9FqWPVhebMS2LWehD12jm1m1sdtKBhTS6vANP7mSPXXoPU9xV1DWIgSmbas2qykAQbcQLklxE4bS2AR1uYvtRe2BkSAuzR31fCZiausWNoHWiqhgnd1qJzhc1tI/roDvEqyyJ/F3Tc8NWUg2MOJgdejHgyJ3/9AZCP6A+NN+wIkJeMyhs61C09BOOkhKzLNjB9gbM1Ixpu7YTqMXwUvxg1WRY33M1XoBrIAGpH3ncaw8aTV6vDZUSJOfSWypz4c/MRSfH4qkTwYXavNP2XnMSL9IKI8DB8wMtjGgtEv+hGF/akVW7Z39GhwX0xEMA0f6vrQ1YPjkVz2/1OCfdQMlWULyb9IFmiDeL7nLvL/GV1xze7F3yvwU+MV3x5LLtczo2TKopBE3DlDnd1UYXoHjjLVqxCtuQYvOJhVsNj1ug7m4rkVVzGhn+6nCDnCtTB8zDMvPdm7iWH/J2Szk6xp83FW4zKeyEysnqfP4oeVb3cfT1t87mjCLm/v1shw==
admin@netrix.com.pl\n - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDQKP5UbCSwADFNJsWF9z1zMsEQM6Kh8JOOug4ZYq2JsMYGSqZ0HgeLJzJrfI6OQKUZnyE06wz15UgIS/mlFf8xt2CDIFYc86RTTin8De+2nFEvWgi+sdG4TXqvvMpTfJQTORaGkjuXglG/9gxsKdw1PGi07Bmd9nis0I0Cry+3x7/ULS7UCHi7jnOfegaANQ23EIVmWVpsq2xEabd964IJLZkV7Vnx3HUPsjXyfIn+gZ27QuXfnAfmOYmQ64JPVbFKx573eXAsFe7VQNrPcw73GG4sVH+5Uo2a3Vvw/orBjTkNjQr8fdJ5FSi8o57jeschwOdIVQhiOfv2zU96WdoGtLYDksbd+j3Do69i2ugRr3wjNlAe4n65syUrmwR0QCZgL5SgCt7Fz6uN/wsLsVHexWauPXVEn+nbBjImNKo/LjMC9sb8BxAwpkNfgEz3eXBzUuie0/67WNuxUYbi1w8QPDE0FpsI/610Hpsd4tuV4TWzjouNuXvNSE2CFD+WjP1dRS9wn/tupQUG0/PSat5XEpZtbFumdXDjf+9qATRr7t7aEXgwmZXuJAxtBOXdcZFvA/Iyo34UxzNbnTheqLX3qICqAbusOV0jNMQCMQUzrhyq6XH0RrP6O5a+AmdUYurcqEXg0h/nO90FH7L19S7548vD1O0RkM8chwLE9l+pXw==
mother@infra1\n"
volatile.base_image: dd0d4169886a0f4147142d57f0251191574b15591611126a7645f27b6089d040
volatile.cloud-init.instance-id: daca0c70-976d-4f2b-8778-52443292538a
volatile.eth1.host_name: maca63ac611
volatile.eth1.hwaddr: 00:16:3e:ad:1e:23
volatile.eth1.last_state.created: "false"
volatile.eth2.host_name: veth92dad571
volatile.eth2.hwaddr: 00:16:3e:1e:cd:e5
volatile.eth3.host_name: maccf074dec
volatile.eth3.hwaddr: 00:16:3e:b2:41:48
volatile.eth3.last_state.created: "false"
volatile.idmap.base: "0"
volatile.idmap.current: '[]'
volatile.idmap.next: '[]'
volatile.last_state.idmap: '[]'
volatile.last_state.power: RUNNING
volatile.uuid: a5fa29e9-842a-417f-9ae8-b3bfede3e982
devices:
eth1:
name: eth1
nictype: macvlan
parent: VLAN10
type: nic
eth2:
name: eth2
network: lxdfan0
type: nic
eth3:
name: eth3
nictype: macvlan
parent: VLAN99
type: nic
root:
path: /
pool: local
size: 20GB
type: disk
ephemeral: false
profiles:
- container-vlan10-maas
stateful: false
description: ""
NOT SHOWING:
architecture: x86_64
config:
boot.autostart: "true"
image.architecture: amd64
image.description: Ubuntu jammy amd64 (20230227_07:42)
image.os: Ubuntu
image.release: jammy
image.serial: "20230227_07:42"
image.type: squashfs
image.variant: cloud
limits.cpu: "2"
limits.memory: 4GB
linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
raw.lxc: |
lxc.apparmor.profile=unconfined
lxc.mount.auto=proc:rw sys:rw
lxc.cap.drop=
security.nesting: "true"
security.privileged: "true"
user.network-config: |-
version: 1
config:
- type: physical
name: eth1
subnets:
- type: static
ipv4: true
address: 10.10.10.101/24
netmask: 255.255.255.0
gateway: 10.10.10.1
control: auto
- type: physical
name: eth2
subnets:
- type: dhcp
- type: physical
name: eth3
subnets:
- type: static
ipv4: true
address: 10.10.99.101/24
netmask: 255.255.255.0
control: auto
- type: nameserver
address: 1.1.1.1
user.user-data: "#cloud-config\npackages:\n - snapd\n - ssh\n - net-tools\n -
jq\n - nano \n - traceroute\n - screen\n - iptables\nusers:\n - default\n
\ - name: passwd: groups: sudo\n sudo: ALL=(ALL)
NOPASSWD:ALL\n ssh_authorized_keys:\n - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDLYl/2jfGlX4oC9caSBTTuPBY0iMI24kK/y6h081CIMmpM8wRmbxPWlhLl+H8o961trhFtoqb9VCxwsztUwIjtGhjMXlLTTdeZ/50O+crOclFnr2kyEyLSKIN1puR1yk/qGOGVpc+F0qh3iUIovZ5V0KP1wDX0Cl/bjAIVsncrukcZgud9FqWPVhebMS2LWehD12jm1m1sdtKBhTS6vANP7mSPXXoPU9xV1DWIgSmbas2qykAQbcQLklxE4bS2AR1uYvtRe2BkSAuzR31fCZiausWNoHWiqhgnd1qJzhc1tI/roDvEqyyJ/F3Tc8NWUg2MOJgdejHgyJ3/9AZCP6A+NN+wIkJeMyhs61C09BOOkhKzLNjB9gbM1Ixpu7YTqMXwUvxg1WRY33M1XoBrIAGpH3ncaw8aTV6vDZUSJOfSWypz4c/MRSfH4qkTwYXavNP2XnMSL9IKI8DB8wMtjGgtEv+hGF/akVW7Z39GhwX0xEMA0f6vrQ1YPjkVz2/1OCfdQMlWULyb9IFmiDeL7nLvL/GV1xze7F3yvwU+MV3x5LLtczo2TKopBE3DlDnd1UYXoHjjLVqxCtuQYvOJhVsNj1ug7m4rkVVzGhn+6nCDnCtTB8zDMvPdm7iWH/J2Szk6xp83FW4zKeyEysnqfP4oeVb3cfT1t87mjCLm/v1shw==
admin@netrix.com.pl\n - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDQKP5UbCSwADFNJsWF9z1zMsEQM6Kh8JOOug4ZYq2JsMYGSqZ0HgeLJzJrfI6OQKUZnyE06wz15UgIS/mlFf8xt2CDIFYc86RTTin8De+2nFEvWgi+sdG4TXqvvMpTfJQTORaGkjuXglG/9gxsKdw1PGi07Bmd9nis0I0Cry+3x7/ULS7UCHi7jnOfegaANQ23EIVmWVpsq2xEabd964IJLZkV7Vnx3HUPsjXyfIn+gZ27QuXfnAfmOYmQ64JPVbFKx573eXAsFe7VQNrPcw73GG4sVH+5Uo2a3Vvw/orBjTkNjQr8fdJ5FSi8o57jeschwOdIVQhiOfv2zU96WdoGtLYDksbd+j3Do69i2ugRr3wjNlAe4n65syUrmwR0QCZgL5SgCt7Fz6uN/wsLsVHexWauPXVEn+nbBjImNKo/LjMC9sb8BxAwpkNfgEz3eXBzUuie0/67WNuxUYbi1w8QPDE0FpsI/610Hpsd4tuV4TWzjouNuXvNSE2CFD+WjP1dRS9wn/tupQUG0/PSat5XEpZtbFumdXDjf+9qATRr7t7aEXgwmZXuJAxtBOXdcZFvA/Iyo34UxzNbnTheqLX3qICqAbusOV0jNMQCMQUzrhyq6XH0RrP6O5a+AmdUYurcqEXg0h/nO90FH7L19S7548vD1O0RkM8chwLE9l+pXw==
mother@infra1\n"
volatile.base_image: dd0d4169886a0f4147142d57f0251191574b15591611126a7645f27b6089d040
volatile.cloud-init.instance-id: 7d1f65f3-9b6a-4add-8991-739d0684786d
volatile.eth1.host_name: maca456f93b
volatile.eth1.hwaddr: 00:16:3e:2b:04:50
volatile.eth1.last_state.created: "false"
volatile.eth2.host_name: vethd04e125d
volatile.eth2.hwaddr: 00:16:3e:c8:d9:ec
volatile.eth3.host_name: mac034b1d3b
volatile.eth3.hwaddr: 00:16:3e:9d:1a:27
volatile.eth3.last_state.created: "false"
volatile.idmap.base: "0"
volatile.idmap.current: '[]'
volatile.idmap.next: '[]'
volatile.last_state.idmap: '[]'
volatile.last_state.power: RUNNING
volatile.uuid: 3555f10f-0a5e-44f3-91e9-10adf573d801
devices:
eth1:
name: eth1
nictype: macvlan
parent: VLAN10
type: nic
eth2:
name: eth2
network: lxdfan0
type: nic
eth3:
name: eth3
nictype: macvlan
parent: VLAN99
type: nic
root:
path: /
pool: local
size: 20GB
type: disk
ephemeral: false
profiles:
- container-vlan10-maas
stateful: false
description: ""
I also noticed a very strange thing, namely some new problems with the MAAS installation on these containers (as the names suggest, they are used for a MAAS installation).
The listing problem tends to coincide with this MAAS initialisation error:
ERROR: Failed to retrieve storage information: Failed to find “/dev/disk/by-path/pci-0000:00:1f.2-ata-1”: lstat /dev/sr0: no such file or directory
Could it be somehow related?
Ah yes potentially.
I remember from my reworking of lxc list that in the previous version it skipped entries if there was a problem loading state info.
I’ll look into whether that could be the issue and whether it still exists in its current form.
What does lxc storage show <pool> show?
mother@infra1:~$ lxc storage show remote
config:
ceph.cluster_name: ceph
ceph.osd.pg_num: "32"
ceph.osd.pool_name: lxd
ceph.user.name: admin
volatile.pool.pristine: "true"
description: ""
name: remote
driver: ceph
used_by:
- /1.0/images/a93b30fc996e880ed5f1bbb197ad5efb1d837c7929f2541016688c5a8279e48d
- /1.0/instances/juju-client?project=maas
- /1.0/profiles/juju-client?project=maas
status: Created
locations:
- infra1
- infra2
- infra3
@tomp thank you very much, I just realized I had assigned the wrong pool to my containers, but unfortunately it did not solve anything. All containers use the remote pool now, but lxc ls still shows no valid output.
By the way, I just found that the local zfs pool still has some images left after removing the previous containers; is it possible to somehow prune the zfs pool?
lxc storage show local
config: {}
description: ""
name: local
driver: zfs
used_by:
- /1.0/images/563cc2006e9226472b25dc6b8881886c25c0d9d7ccde11fc788d4fdb6c0278a9?target=infra1
- /1.0/images/563cc2006e9226472b25dc6b8881886c25c0d9d7ccde11fc788d4fdb6c0278a9?target=infra2
- /1.0/images/563cc2006e9226472b25dc6b8881886c25c0d9d7ccde11fc788d4fdb6c0278a9?target=infra3
- /1.0/images/6de3bbd44fa9fdaeef6e5a96b03bb0c09b59b6be7d594bcd3beb92c1956250b6?target=infra1
- /1.0/images/6de3bbd44fa9fdaeef6e5a96b03bb0c09b59b6be7d594bcd3beb92c1956250b6?target=infra2
- /1.0/images/6de3bbd44fa9fdaeef6e5a96b03bb0c09b59b6be7d594bcd3beb92c1956250b6?target=infra3
- /1.0/images/a93b30fc996e880ed5f1bbb197ad5efb1d837c7929f2541016688c5a8279e48d?target=infra1
- /1.0/images/a93b30fc996e880ed5f1bbb197ad5efb1d837c7929f2541016688c5a8279e48d?target=infra2
- /1.0/images/a93b30fc996e880ed5f1bbb197ad5efb1d837c7929f2541016688c5a8279e48d?target=infra3
- /1.0/images/dd0d4169886a0f4147142d57f0251191574b15591611126a7645f27b6089d040?target=infra1
- /1.0/images/dd0d4169886a0f4147142d57f0251191574b15591611126a7645f27b6089d040?target=infra2
- /1.0/images/dd0d4169886a0f4147142d57f0251191574b15591611126a7645f27b6089d040?target=infra3
- /1.0/instances/juju-5b99f0-0?project=maas
- /1.0/instances/juju-843b3a-3?project=openstack
- /1.0/instances/juju-843b3a-4?project=openstack
- /1.0/instances/juju-843b3a-5?project=openstack
- /1.0/instances/u2?project=openstack
- /1.0/profiles/default
- /1.0/profiles/default?project=maas
- /1.0/profiles/default?project=openstack
- /1.0/profiles/juju-controller?project=maas
- /1.0/profiles/juju-openstack?project=openstack
status: Created
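If those are just leftover cached images, one way to prune them would be to delete them from the image store by fingerprint; a rough sketch, using fingerprints from the used_by list above and assuming nothing still needs them:

lxc image list
lxc image delete 563cc2006e9226472b25dc6b8881886c25c0d9d7ccde11fc788d4fdb6c0278a9
lxc image delete 6de3bbd44fa9fdaeef6e5a96b03bb0c09b59b6be7d594bcd3beb92c1956250b6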
Small update
Yesterday I faced a problem with microceph (ceph rbd was not able to unmount the instances located on 2 of the 3 cluster members) and, due to lack of time, I decided to simply restart those 2 cluster members.
It seems the restart has resolved the issue for now, but it is quite worrying to be honest…
@tomp do you have any suspicions about this? Could it be that one member was somehow causing the DB not to refresh?
1.3.2023 UPDATE
Now I am sure that one of the cluster nodes is causing the problem.
+---------------+---------+---------------------+------+-----------+-----------+----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS | LOCATION |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| juju-5b99f0-0 | RUNNING | 240.11.0.131 (eth1) | | CONTAINER | 0 | infra1 |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| maas-ha-1 | RUNNING | 240.11.0.112 (eth2) | | CONTAINER | 0 | infra1 |
| | | 10.10.99.101 (eth3) | | | | |
| | | 10.10.10.101 (eth1) | | | | |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| maas-ha-3 | RUNNING | 240.13.0.86 (eth2) | | CONTAINER | 0 | infra3 |
| | | 10.10.99.103 (eth3) | | | | |
| | | 10.10.10.103 (eth1) | | | | |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| maas-ha-db-1 | RUNNING | 240.11.0.246 (eth2) | | CONTAINER | 0 | infra1 |
| | | 10.10.10.105 (eth1) | | | | |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| maas-proxy-1 | RUNNING | 240.11.0.240 (eth2) | | CONTAINER | 0 | infra1 |
| | | 10.10.10.104 (eth1) | | | | |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| maas-vault-1 | RUNNING | 240.12.0.194 (eth2) | | CONTAINER | 0 | infra2 |
| | | 10.10.10.107 (eth1) | | | | |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| maas-vault-1 | RUNNING | 240.12.0.194 (eth2) | | CONTAINER | 0 | infra2 |
| | | 10.10.10.107 (eth1) | | | | |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| maas-vault-1 | RUNNING | 240.12.0.194 (eth2) | | CONTAINER | 0 | infra2 |
| | | 10.10.10.107 (eth1) | | | | |
+---------------+---------+---------------------+------+-----------+-----------+----------+
| maas-vault-1 | RUNNING | 240.12.0.194 (eth2) | | CONTAINER | 0 | infra2 |
| | | 10.10.10.107 (eth1) | | | | |
+---------------+---------+---------------------+------+-----------+-----------+----------+
The listing works fine until it gets to infra2.
@stgraber, @tomp do you have any idea how to troubleshoot the node?
Hi,
Today I observed another thing: the 3 containers that have MAAS installed have a tendency to freeze completely. By that I mean it is not possible to lxc shell into them or exec anything, and lxc stop hangs. The only thing that works is to force-stop and then start them:
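Roughly like this, with maas-ha-1 standing in for one of the affected containers (just an example name):

lxc stop --force maas-ha-1
lxc start maas-ha-1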
An example config of one of these containers:
architecture: x86_64
config:
boot.autostart: "true"
image.architecture: amd64
image.description: Ubuntu jammy amd64 (20230228_07:42)
image.os: Ubuntu
image.release: jammy
image.serial: "20230228_07:42"
image.type: squashfs
image.variant: cloud
limits.cpu: "2"
limits.memory: 4GB
linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
user.network-config: |-
version: 1
config:
- type: physical
name: eth1
subnets:
- type: static
ipv4: true
address: 10.10.10.101/24
netmask: 255.255.255.0
gateway: 10.10.10.1
control: auto
- type: physical
name: eth2
subnets:
- type: dhcp
- type: physical
name: eth3
subnets:
- type: static
ipv4: true
address: 10.10.99.101/24
netmask: 255.255.255.0
control: auto
- type: nameserver
address: 1.1.1.1
user.user-data: |
#cloud-config
packages:
- snapd
- ssh
- net-tools
- jq
- nano
- traceroute
- screen
- iptables
users:
- default
volatile.base_image: d5843acf2be2772f5fc532fa2537d0e2082ea11c131ad5bb857a389c99be7e10
volatile.cloud-init.instance-id: f274c794-733b-4052-af31-a73bc194963c
volatile.eth1.host_name: mac975e636d
volatile.eth1.hwaddr: 00:16:3e:9b:ca:32
volatile.eth1.last_state.created: "false"
volatile.eth2.host_name: veth95b95baa
volatile.eth2.hwaddr: 00:16:3e:6a:bf:ff
volatile.eth3.host_name: macd9fbbb35
volatile.eth3.hwaddr: 00:16:3e:2b:1c:a3
volatile.eth3.last_state.created: "false"
volatile.idmap.base: "0"
volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.last_state.idmap: '[]'
volatile.last_state.power: RUNNING
volatile.uuid: d7bb74c2-f6b4-413e-83d9-95cb68f59c35
devices:
eth1:
name: eth1
nictype: macvlan
parent: VLAN10
type: nic
eth2:
name: eth2
network: lxdfan0
type: nic
eth3:
name: eth3
nictype: macvlan
parent: VLAN99
type: nic
root:
path: /
pool: remote
size: 25GB
type: disk
ephemeral: false
profiles:
- container-vlan10-maas
stateful: false
CONSOLE
[52192.524693] cloud-init[204]: Cloud-init v. 22.4.2-0ubuntu0~22.04.1 running 'init' at Wed, 01 Mar 2023 11:47:57 +0000. Up 17.73 seconds.
[52192.524889] cloud-init[204]: ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
[52192.525031] cloud-init[204]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[52192.525174] cloud-init[204]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
[52192.525320] cloud-init[204]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[52192.525459] cloud-init[204]: ci-info: | eth1 | True | 10.10.10.102 | 255.255.255.0 | global | 00:16:3e:97:c2:36 |
[52192.525584] cloud-init[204]: ci-info: | eth1 | True | fe80::216:3eff:fe97:c236/64 | . | link | 00:16:3e:97:c2:36 |
[52192.525704] cloud-init[204]: ci-info: | eth2 | True | 240.12.0.183 | 255.0.0.0 | global | 00:16:3e:cb:4a:19 |
[52192.525823] cloud-init[204]: ci-info: | eth2 | True | fe80::216:3eff:fecb:4a19/64 | . | link | 00:16:3e:cb:4a:19 |
[52192.525944] cloud-init[204]: ci-info: | eth3 | True | 10.10.99.102 | 255.255.255.0 | global | 00:16:3e:71:e2:84 |
[52192.526083] cloud-init[204]: ci-info: | eth3 | True | fe80::216:3eff:fe71:e284/64 | . | link | 00:16:3e:71:e2:84 |
[52192.526206] cloud-init[204]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
[52192.526326] cloud-init[204]: ci-info: | lo | True | ::1/128 | . | host | . |
[52192.526445] cloud-init[204]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[52192.526566] cloud-init[204]: ci-info: +++++++++++++++++++++++++++++Route IPv4 info++++++++++++++++++++++++++++++
[52192.526696] cloud-init[204]: ci-info: +-------+-------------+------------+-----------------+-----------+-------+
[52192.526818] cloud-init[204]: ci-info: | Route | Destination | Gateway | Genmask | Interface | Flags |
[52192.526936] cloud-init[204]: ci-info: +-------+-------------+------------+-----------------+-----------+-------+
[52192.527056] cloud-init[204]: ci-info: | 0 | 0.0.0.0 | 10.10.10.1 | 0.0.0.0 | eth1 | UG |
[52192.527177] cloud-init[204]: ci-info: | 1 | 0.0.0.0 | 240.12.0.1 | 0.0.0.0 | eth2 | UG |
[52192.527310] cloud-init[204]: ci-info: | 2 | 10.10.10.0 | 0.0.0.0 | 255.255.255.0 | eth1 | U |
[ OK ] Started Ubuntu Advantage Timer for running repeated jobs.
[52192.527951] cloud-init[204]: ci-info: | 3 | 10.10.99.0 | 0.0.0.0 | 255.255.255.0 | eth3 | U |
[52192.528080] cloud-init[204]: ci-info: | 4 | 240.0.0.0 | 0.0.0.0 | 255.0.0.0 | eth2 | U |
[52192.528202] cloud-init[204]: ci-info: | 5 | 240.12.0.1 | 0.0.0.0 | 255.255.255.255 | eth2 | UH |
[52192.528324] cloud-init[204]: ci-info: +-------+-------------+------------+-----------------+-----------+-------+
[52192.528462] cloud-init[204]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
[52192.528585] cloud-init[204]: ci-info: +-------+-------------+---------+-----------+-------+
[52192.528754] cloud-init[204]: ci-info: | Route | Destination | Gateway | Interface | Flags |
[52192.528877] cloud-init[204]: ci-info: +-------+-------------+---------+-----------+-------+
[52192.528997] cloud-init[204]: ci-info: | 0 | fe80::/64 | :: | eth1 | U |
[52192.529117] cloud-init[204]: ci-info: | 1 | fe80::/64 | :: | eth2 | U |
[52192.529239] cloud-init[204]: ci-info: | 2 | fe80::/64 | :: | eth3 | U |
[52192.529358] cloud-init[204]: ci-info: | 4 | local | :: | eth3 | U |
[52192.529497] cloud-init[204]: ci-info: | 5 | local | :: | eth1 | U |
[52192.529618] cloud-init[204]: ci-info: | 6 | local | :: | eth2 | U |
[52192.529737] cloud-init[204]: ci-info: | 7 | multicast | :: | eth1 | U |
[52192.529871] cloud-init[204]: ci-info: | 8 | multicast | :: | eth2 | U |
[52192.529992] cloud-init[204]: ci-info: | 9 | multicast | :: | eth3 | U |
[52192.530112] cloud-init[204]: ci-info: +-------+-------------+---------+-----------+-------+
[ OK ] Reached target Timer Units.
[ OK ] Listening on cloud-init hotplug hook socket.
[ OK ] Listening on D-Bus System Message Bus Socket.
Starting Socket activation for snappy daemon...
[ OK ] Listening on Socket activation for snappy daemon.
[ OK ] Reached target Socket Units.
[ OK ] Reached target Basic System.
[ OK ] Started Regular background program processing daemon.
[ OK ] Started D-Bus System Message Bus.
[ OK ] Started Save initial kernel messages after boot.
Starting Remove Stale Online ext4 Metadata Check Snapshots...
Starting Dispatcher daemon for systemd-networkd...
Starting System Logging Service...
[ OK ] Started Service for snap application maas.supervisor.
[ OK ] Reached target Preparation for Logins.
Starting Snap Daemon...
Starting OpenBSD Secure Shell server...
Starting User Login Management...
Starting Permit User Sessions...
[ OK ] Finished Remove Stale Online ext4 Metadata Check Snapshots.
[ OK ] Finished Permit User Sessions.
[ OK ] Started Console Getty.
[ OK ] Created slice Slice /system/getty.
[ OK ] Reached target Login Prompts.
[ OK ] Started System Logging Service.
[ OK ] Started OpenBSD Secure Shell server.
[ OK ] Started User Login Management.
[ OK ] Started Unattended Upgrades Shutdown.
Starting Hostname Service...
[ OK ] Started Hostname Service.
Starting Authorization Manager...
[ OK ] Started Dispatcher daemon for systemd-networkd.
[ OK ] Started Authorization Manager.
Ubuntu 22.04.2 LTS maas-ha-2 console
The difference between these 3 containers that are freezing and the others is the lack of:
raw.lxc: |
lxc.apparmor.profile=unconfined
lxc.mount.auto=proc:rw sys:rw
lxc.cap.drop=
security.nesting: "true"
security.privileged: "true"
The reason for not including them was not being able to install MAAS (the stupid device error during MAAS initialisation).
Can that have any implications?
You shouldn’t really ever use security.nesting and security.privileged together, as that introduces a big security issue.
Also, that doesn’t do what I think you think it does, in that it does not remove all AppArmor profiles from being applied.
See Problem with disable apparmor for single container - #4 by stgraber
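A minimal sketch of dropping those keys, assuming they come from the container-vlan10-maas profile rather than being set directly on the instances:

lxc profile unset container-vlan10-maas security.privileged
lxc profile unset container-vlan10-maas security.nesting
lxc profile unset container-vlan10-maas raw.lxc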
Okay, I made myself a little more familiar with the documentation, and now the profile for the MAAS-hosting containers looks like:
config:
boot.autostart: "true"
limits.cpu: "2"
limits.memory: 4GB
linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
user.user-data: |
#cloud-config
packages:
- snapd
- ssh
- net-tools
- jq
- nano
- traceroute
- screen
- iptables
and the profile for the containers hosting mysql, vault and haproxy looks like:
config:
boot.autostart: "true"
limits.cpu: "2"
limits.memory: 4GB
linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
raw.lxc: lxc.mount.auto=proc:rw sys:rw
security.privileged: "true"
I would like to thank you both for the PR link and the brief explanation of the risks these options pose, but unfortunately it really does not resolve the problem.
@tomp you mentioned some fixes to the DB and the fact that they are not yet in the snap.
When do you plan to release them? It seems like the last hope.
Thank you
Are you still seeing errors in the log?
Yes, it is still there
Another theory is that the pool created by microceph somehow causes the issue. I created a very similar set of 7 machines using a zfs pool, and lxc ls works fine there.
Is there any chance it is going to be merged into the edge snap anytime soon?
Yes, it’s been merged, so it should be in the edge snap shortly, and it will be in LXD 5.12.
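If you want to try it before 5.12 reaches stable, switching the snap to the edge channel on each cluster member would look roughly like this (latest/edge is assumed as the channel name):

sudo snap refresh lxd --channel=latest/edge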
It seems to have resolved the issue, thank you for your help.