All files in the container have UID 1000000

Hello,

I migrated to Incus a while back and it worked as intended. I hadn’t had to create a new container until yesterday; when I did, it failed with this issue. As my /etc/subuid and /etc/subgid were messy, I cleared them and added root:1000000:1000000000 to each.
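
For reference, the subordinate-ID format is user:first_id:count, so that single line lets root map host UIDs/GIDs 1000000 through 1000999999:

# grep root /etc/subuid /etc/subgid
/etc/subuid:root:1000000:1000000000
/etc/subgid:root:1000000:1000000000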

So far so good: I could create new containers. I then rebooted the host, and one container would not start, giving this error:

level=error msg="Failed to auto start instance" err="Failed to handle idmapped storage: invalid argument - Failed to change ACLs on /var/lib/incus/storage-pools/default/containers/hass/rootfs/var/log/journal" instance=hass project=default

So I mounted the ZFS dataset and deleted that rootfs/var/log/journal directory. I started the container again and, after the remapping, it came up.
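
Roughly (the dataset path is the one that shows up later in this thread):

# mount -t zfs ssd/lxd/containers/hass /mnt
# rm -rf /mnt/rootfs/var/log/journal
# umount /mnt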

But now every file inside the container shows a numeric owner: for example, instead of root:root, I see 1000000:1000000.

Here is the container’s config before the remapping:

architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 20.04 LTS amd64 (release) (20200720)
  image.label: release
  image.os: ubuntu
  image.release: focal
  image.serial: "20200720"
  image.version: "20.04"
  security.nesting: "true"
  security.syscalls.intercept.mknod: "true"
  security.syscalls.intercept.setxattr: "true"
  user.network-config: "  \n  version: 2\n  ethernets:\n    eth0:\n      dhcp6: no\n
    \     dhcp4: yes"
  user.user-data: |-
    #cloud-config
    runcmd:
      - hostnamectl set-hostname hass.[redacted domain]
      - echo "postfix postfix/mailname string hass.[redacted domain]" | debconf-set-selections
      - echo "postfix postfix/main_mailer_type string 'Internet Site'" | debconf-set-selections
      - apt-get install --assume-yes mailutils
      - postconf -e "inet_interfaces = loopback-only"
      - echo root:maintenance@[redacted domain] >> /etc/aliases
      - newaliases
      - systemctl reload postfix
      - [sh, -c, "cat >> /root/.bashrc <<EOF\nif [ -f /etc/bash_completion ] && ! shopt -oq posix;\nthen\n        . /etc/bash_completion\nfi\n\nEOF" ]
  volatile.base_image: 0a4f3d88ed1c0e0d34c0f1e9be71b5dd73dc3de81a1e139b0ecd4e0faa958a30
  volatile.cloud-init.instance-id: 24b87d5a-d348-4a6d-8e19-8c6a1ad3ac3c
  volatile.eth0.hwaddr: 00:16:3e:21:4f:11
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 4b899469-e03b-4370-890e-a46f5c8147ef
  volatile.uuid.generation: 4b899469-e03b-4370-890e-a46f5c8147ef
devices:
  eth0:
    ipv4.address: 10.39.199.26
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
ephemeral: false
profiles:
- default
- ubuntu
stateful: false
description: ""
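
To decode those volatile.idmap entries: each object maps container IDs starting at Nsid onto host IDs starting at Hostid, for Maprange consecutive IDs. So:

idmap.current / last_state.idmap : container 0-65535 -> host 100000-165535 (the old 65536-wide map the files were written with)
idmap.next                       : container 0-999999999 -> host 1000000-1000999999 (the new range from /etc/subuid)

Since last_state.idmap and idmap.next differ, Incus will rewrite ownership to the new range on the next start.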

Here it is now:

# incus config show --expanded hass
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 20.04 LTS amd64 (release) (20200720)
  image.label: release
  image.os: ubuntu
  image.release: focal
  image.serial: "20200720"
  image.version: "20.04"
  security.nesting: "true"
  security.syscalls.intercept.mknod: "true"
  security.syscalls.intercept.setxattr: "true"
  user.network-config: "  \n  version: 2\n  ethernets:\n    eth0:\n      dhcp6: no\n
    \     dhcp4: yes"
  user.user-data: |-
    #cloud-config
    runcmd:
      - hostnamectl set-hostname hass.[redacted domain]
      - echo "postfix postfix/mailname string hass.[redacted domain]" | debconf-set-selections
      - echo "postfix postfix/main_mailer_type string 'Internet Site'" | debconf-set-selections
      - apt-get install --assume-yes mailutils
      - postconf -e "inet_interfaces = loopback-only"
      - echo root:maintenance@[redacted domain] >> /etc/aliases
      - newaliases
      - systemctl reload postfix
      - [sh, -c, "cat >> /root/.bashrc <<EOF\nif [ -f /etc/bash_completion ] && ! shopt -oq posix;\nthen\n        . /etc/bash_completion\nfi\n\nEOF" ]
  user.vendor-data: |
    #cloud-config
    locale: fr_FR.UTF-8
    timezone: Pacific/Noumea
    ## doing only update until package cloud-init is updated
    ## see: https://github.com/canonical/cloud-init/issues/5143
    package_update: true
    # package_upgrade: true
    ntp:
      enabled: true
      ntp_client: systemd-timesyncd
      servers:
        - 0.oceania.pool.ntp.org
        - 1.oceania.pool.ntp.org
        - 2.oceania.pool.ntp.org
        - 3.oceania.pool.ntp.org
        - pool.ntp.org
        - ntp.ubuntu.com
    apt:
      primary:
        - arches: [default]
          uri: http://nc.archive.ubuntu.com/ubuntu
      conf: | # APT config
        Unattended-Upgrade::Allowed-Origins {
          "${distro_id}:${distro_codename}";
          "${distro_id}:${distro_codename}-security";
          // Extended Security Maintenance; doesn't necessarily exist for
          // every release and this system may not have it installed, but if
          // available, the policy for updates is such that unattended-upgrades
          // should also install from here by default.
          "${distro_id}ESMApps:${distro_codename}-apps-security";
          "${distro_id}ESM:${distro_codename}-infra-security";
          "${distro_id}:${distro_codename}-updates";
          "${distro_id}:${distro_codename}-proposed";
          "${distro_id}:${distro_codename}-backports";
        };
        APT::Periodic::Update-Package-Lists "1";
        APT::Periodic::Download-Upgradeable-Packages "1";
        APT::Periodic::AutocleanInterval "7";
        APT::Periodic::Unattended-Upgrade "1";
    runcmd:
      - [sh, -c, "cat >> /root/.bashrc <<EOF\nif [ -f /etc/bash_completion ] && ! shopt -oq posix;\nthen\n        . /etc/bash_completion\nfi\n\nEOF" ]
  volatile.base_image: 0a4f3d88ed1c0e0d34c0f1e9be71b5dd73dc3de81a1e139b0ecd4e0faa958a30
  volatile.cloud-init.instance-id: 24b87d5a-d348-4a6d-8e19-8c6a1ad3ac3c
  volatile.eth0.host_name: veth1c5f09ca
  volatile.eth0.hwaddr: 00:16:3e:21:4f:11
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.last_state.ready: "false"
  volatile.uuid: 4b899469-e03b-4370-890e-a46f5c8147ef
  volatile.uuid.generation: 4b899469-e03b-4370-890e-a46f5c8147ef
devices:
  eth0:
    ipv4.address: 10.39.199.26
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
- ubuntu
stateful: false
description: ""
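
Note what changed: volatile.last_state.idmap now records the 1000000-wide map, i.e. Incus believes the on-disk files already sit at host IDs 1000000 and up, while (as the listings below show) most of them actually ended up at 2000000 and up.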

How can I get out of this mess, please?

Hmm, so you basically ended up double-shifting: with your new 1000000-based map, root showing up as 1000000 inside the container means the files sit at 2000000 on disk, i.e. they were shifted twice.
Can you stop the container, mount its dataset and show ls -lh /path/to/dataset so we can look at what the ownership looks like from the host’s point of view?

Here you go:

# ls -lah /mnt
total 73K
d--x------+  4 root    root       6 juil. 30  2020 .
drwxr-xr-x  24 root    root    4,0K août  23 06:52 ..
-r--------   1 root    root    9,2K août  24 10:04 backup.yaml
-rw-r--r--   1 root    root    1,1K juil. 21  2020 metadata.yaml
drwxr-xr-x  18 2000000 2000000   24 juil. 21  2020 rootfs
drwxr-xr-x   2 root    root       7 juil. 21  2020 templates

Can you show ls -lh rootfs/? I want to see if you’re partially shifted.

I was about to. Here it is:

# ls -lah /mnt/rootfs/
total 232K
drwxr-xr-x   18 2000000 2000000  24 juil. 21  2020 .
d--x------+   4 root    root      6 juil. 30  2020 ..
lrwxrwxrwx    1 2000000 2000000   7 juil. 21  2020 bin -> usr/bin
drwxr-xr-x    2 2000000 2000000   2 juil. 21  2020 boot
drwxr-xr-x    5 2000000 2000000  17 juil. 21  2020 dev
drwxr-xr-x  107 2000000 2000000 203 août  24 09:18 etc
drwxr-xr-x    4 2000000 2000000   4 juil. 30  2020 home
lrwxrwxrwx    1 2000000 2000000   7 juil. 21  2020 lib -> usr/lib
lrwxrwxrwx    1 2000000 2000000   9 juil. 21  2020 lib32 -> usr/lib32
lrwxrwxrwx    1 2000000 2000000   9 juil. 21  2020 lib64 -> usr/lib64
lrwxrwxrwx    1 2000000 2000000  10 juil. 21  2020 libx32 -> usr/libx32
drwxr-xr-x    3 2000000 2000000   3 mars  22  2021 media
drwxr-xr-x    2 2000000 2000000   2 juil. 21  2020 mnt
drwxr-xr-x    2 2000000 2000000   2 juil. 21  2020 opt
drwxr-xr-x    2 2000000 2000000   2 avril 15  2020 proc
drwx------   12 2000000 2000000  16 sept. 17  2023 root
drwxr-xr-x    2 2000000 2000000   2 juil. 21  2020 run
lrwxrwxrwx    1 2000000 2000000   8 juil. 21  2020 sbin -> usr/sbin
drwxr-xr-x    8 2000000 2000000   9 janv. 28  2024 snap
drwxr-xr-x    6 2000000 2000000   6 juin  14  2023 srv
drwxr-xr-x    2 1000000 1000000   2 avril 15  2020 sys
drwxrwxrwt    3 1000000 1000000   3 août  24 10:11 tmp
drwxr-xr-x   14 1000000 1000000  14 févr.  3  2022 usr
drwxr-xr-x   13 1000000 1000000  15 juil. 21  2020 var

More:

# ls -lah /mnt/rootfs/home/
total 36K
drwxr-xr-x  4 2000000 2000000  4 juil. 30  2020 .
drwxr-xr-x 18 2000000 2000000 24 juil. 21  2020 ..
drwxr-xr-x 10 2000997 2000997 20 juil. 25 20:53 homeassistant
drwxr-xr-x  3 2001000 2001000  6 juil. 30  2020 ubuntu
# cat /mnt/rootfs/etc/group
root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:syslog,ubuntu
tty:x:5:syslog
disk:x:6:
lp:x:7:
mail:x:8:
news:x:9:
uucp:x:10:
man:x:12:
proxy:x:13:
kmem:x:15:
dialout:x:20:ubuntu,homeassistant
fax:x:21:
voice:x:22:
cdrom:x:24:ubuntu
floppy:x:25:ubuntu
tape:x:26:
sudo:x:27:ubuntu
audio:x:29:ubuntu
dip:x:30:ubuntu
www-data:x:33:
backup:x:34:
operator:x:37:
list:x:38:
irc:x:39:
src:x:40:
gnats:x:41:
shadow:x:42:
utmp:x:43:
video:x:44:ubuntu
sasl:x:45:
plugdev:x:46:ubuntu
staff:x:50:
games:x:60:
users:x:100:
nogroup:x:65534:
systemd-journal:x:101:
systemd-network:x:102:
systemd-resolve:x:103:
systemd-timesync:x:104:
crontab:x:105:
messagebus:x:106:
input:x:107:
kvm:x:108:
render:x:109:
syslog:x:110:
tss:x:111:
uuidd:x:112:
tcpdump:x:113:
ssh:x:114:
landscape:x:115:
admin:x:116:
netdev:x:117:ubuntu
lxd:x:118:ubuntu
systemd-coredump:x:999:
ubuntu:x:1000:
ssl-cert:x:119:
postfix:x:120:
postdrop:x:121:
homeassistant:x:997:
midnite-modbusd:x:996:
fwupd-refresh:x:122:

I mounted a snapshot. Here is how it looked before:

# ls -lah /mnt
total 73K
d--x------+  4 100000 root      6 juil. 30  2020 .
drwxr-xr-x  24 root   root   4,0K août  23 06:52 ..
-r--------   1 root   root   8,4K août  22 09:05 backup.yaml
-rw-r--r--   1 root   root   1,1K juil. 21  2020 metadata.yaml
drwxr-xr-x  18 100000 100000   24 juil. 21  2020 rootfs
drwxr-xr-x   2 root   root      7 juil. 21  2020 templates
# ls -lah /mnt/rootfs/
total 232K
drwxr-xr-x   18 100000 100000  24 juil. 21  2020 .
d--x------+   4 100000 root     6 juil. 30  2020 ..
lrwxrwxrwx    1 100000 100000   7 juil. 21  2020 bin -> usr/bin
drwxr-xr-x    2 100000 100000   2 juil. 21  2020 boot
drwxr-xr-x    5 100000 100000  17 juil. 21  2020 dev
drwxr-xr-x  107 100000 100000 203 août  20 18:42 etc
drwxr-xr-x    4 100000 100000   4 juil. 30  2020 home
lrwxrwxrwx    1 100000 100000   7 juil. 21  2020 lib -> usr/lib
lrwxrwxrwx    1 100000 100000   9 juil. 21  2020 lib32 -> usr/lib32
lrwxrwxrwx    1 100000 100000   9 juil. 21  2020 lib64 -> usr/lib64
lrwxrwxrwx    1 100000 100000  10 juil. 21  2020 libx32 -> usr/libx32
drwxr-xr-x    3 100000 100000   3 mars  22  2021 media
drwxr-xr-x    2 100000 100000   2 juil. 21  2020 mnt
drwxr-xr-x    2 100000 100000   2 juil. 21  2020 opt
drwxr-xr-x    2 100000 100000   2 avril 15  2020 proc
drwx------   12 100000 100000  16 sept. 17  2023 root
drwxr-xr-x    2 100000 100000   2 juil. 21  2020 run
lrwxrwxrwx    1 100000 100000   8 juil. 21  2020 sbin -> usr/sbin
drwxr-xr-x    8 100000 100000   9 janv. 28  2024 snap
drwxr-xr-x    6 100000 100000   6 juin  14  2023 srv
drwxr-xr-x    2 100000 100000   2 avril 15  2020 sys
drwxrwxrwt   11 100000 100000  11 août  22 10:55 tmp
drwxr-xr-x   14 100000 100000  14 févr.  3  2022 usr
drwxr-xr-x   13 100000 100000  15 juil. 21  2020 var

Is that a recent enough snapshot that you could revert to it, remove the problematic journal directory and then start things back up? Or do you have data that’s not in that snapshot?

Your current state can be recovered, but it will effectively need to be unshifted back to host values, and Incus then allowed to shift it back to the allocated range.
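
A manual unshift would look roughly like the untested sketch below. Note from the rootfs listing above that only part of the tree got the second shift (sys, tmp, usr and var are still at 1000000 while the rest is at 2000000), so only owners at 2000000 and above should be touched. It also ignores ACLs and security xattrs, which is part of why restoring a snapshot is the safer route:

# Untested sketch: undo the extra +1000000 shift, only on entries that
# actually received it (UID/GID >= 2000000). Run with the container
# stopped and the dataset mounted at /mnt.
find /mnt/rootfs \( -uid +1999999 -o -gid +1999999 \) -print0 |
while IFS= read -r -d '' f; do
    uid=$(stat -c %u "$f"); gid=$(stat -c %g "$f")
    [ "$uid" -ge 2000000 ] && uid=$((uid - 1000000))
    [ "$gid" -ge 2000000 ] && gid=$((gid - 1000000))
    chown -h "$uid:$gid" "$f"   # -h: change the symlink itself, not its target
done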

Yes, I have recent snapshots taken before the problematic remapping.

Okay, then I’d suggest doing an incus snapshot restore to that snapshot; then, before starting the instance, go ahead and mount the dataset and delete that journal directory, and then start the container.
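
Something like this, with the snapshot name being hypothetical:

# incus snapshot restore hass snap0
# mount -t zfs ssd/lxd/containers/hass /mnt
# rm -rf /mnt/rootfs/var/log/journal
# umount /mnt
# incus start hass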

I have two “types” of snapshot. The older ones show:

drwxr-xr-x  18 100000 100000   24 juil. 21  2020 rootfs

The more recent ones show:

drwxr-xr-x  18 1000000 1000000   24 juil. 21  2020 rootfs

Which one should I use, please?

By the way, I don’t have Incus snapshots, only ZFS snapshots…

# incus snapshot list hass
+------+----------+------------+----------+
| NAME | TAKEN AT | EXPIRES AT | STATEFUL |
+------+----------+------------+----------+

Those are from 2020; that doesn’t sound recent at all!

That is when the container was created, but these are the ZFS snapshots (times in GMT):

ssd/lxd/containers/hass@autosnap_2024-08-23_08:00:15_hourly                                          0B      -     17,3G  -
ssd/lxd/containers/hass@autosnap_2024-08-23_09:00:15_hourly                                          0B      -     17,3G  -
# mount -t zfs ssd/lxd/containers/hass@autosnap_2024-08-23_08:00:15_hourly /mnt && ls -lah /mnt && umount /mnt
total 73K
d--x------+  4 root    root       6 juil. 30  2020 .
drwxr-xr-x  24 root    root    4,0K août  23 06:52 ..
-r--------   1 root    root    9,1K août  23 13:25 backup.yaml
-rw-r--r--   1 root    root    1,1K juil. 21  2020 metadata.yaml
drwxr-xr-x  18 1000000 1000000   24 juil. 21  2020 rootfs
drwxr-xr-x   2 root    root       7 juil. 21  2020 templates
# mount -t zfs ssd/lxd/containers/hass@autosnap_2024-08-23_09:00:15_hourly /mnt && ls -lah /mnt && umount /mnt
total 73K
d--x------+  4 root    root       6 juil. 30  2020 .
drwxr-xr-x  24 root    root    4,0K août  23 06:52 ..
-r--------   1 root    root    9,1K août  23 13:25 backup.yaml
-rw-r--r--   1 root    root    1,1K juil. 21  2020 metadata.yaml
drwxr-xr-x  18 1000000 1000000   24 juil. 21  2020 rootfs
drwxr-xr-x   2 root    root       7 juil. 21  2020 templates

Ah right, that makes more sense. :)
I’d probably try the newest, since plain zfs rollback only goes back to the most recent snapshot anyway (rolling back further means destroying the newer snapshots with -r).

So far, so good. It just worked straight away, without Incus needing to remap anything. Thank you for being so helpful.

# zfs rollback -r ssd/lxd/containers/hass@autosnap_2024-08-23_09:00:15_hourly
# incus start hass
# incus shell hass
root@hass:~# ls -lah
total 104K
drwx------ 12 root root   16 sept. 17  2023 .
drwxr-xr-x 18 root root   24 juil. 21  2020 ..
-rw-------  1 root root  31K juil. 26 00:57 .bash_history
-rw-r--r--  1 root root 3,2K juil. 30  2020 .bashrc

Rebooted, and it is working as intended.
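
For anyone landing here later: a quick sanity check after this kind of recovery is to look at the rootfs ownership from the host (path taken from the original error message); it should show the allocated base, here 1000000:1000000.

# ls -ld /var/lib/incus/storage-pools/default/containers/hass/rootfs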