Unable to delete a container created without a default storage pool but with a storage definition in its profile

Hi,

lxc init images:debian/9 lxc4107 -p lxc4107

This will successfully create the container:

t=2019-09-09T22:46:01+0200 lvl=info msg="Creating container" ephemeral=false name=lxc4107 project=default
t=2019-09-09T22:46:01+0200 lvl=info msg="Created container" ephemeral=false name=lxc4107 project=default

lxc config show lxc4107

architecture: x86_64
config:
  image.architecture: amd64
  image.description: Debian stretch amd64 (20190909_05:24)
  image.os: Debian
  image.release: stretch
  image.serial: "20190909_05:24"
  image.type: squashfs
  volatile.apply_template: create
  volatile.base_image: 2afa412603c0deeebc2210e9b6b75b1861a664939d1cb2ce6d85cb89ea1343df
  volatile.eth0.hwaddr: 00:16:3e:2e:60:4d
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[]'
devices:
  root:
    path: /
    pool: local-filestorage
    type: disk
ephemeral: false
profiles:
- lxc4107
stateful: false
description: ""

lxc profile show lxc4107

config: {}
description: ""
devices:
  eth0:
    host_name: lxc4107
    name: eth0
    nictype: bridged
    parent: ovsbr
    type: nic
  root:
    path: /
    pool: local-filestorage
    type: disk
name: lxc4107
used_by:
- /1.0/containers/lxc4107

Trying to delete it:

#lxc delete lxc4107 -f
Error: error removing /var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107:

Log:

t=2019-09-09T23:19:30+0200 lvl=info msg="Deleting container" created=2019-09-09T22:46:01+0200 ephemeral=false name=lxc4107 project=default used=1970-01-01T01:00:00+0100
t=2019-09-09T23:19:30+0200 lvl=eror msg="Failed deleting container storage" err="error removing /var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107: " name=lxc4107

#lxc --version
3.17

lxc storage list

+-------------------+-------------+--------+---------------------------------+---------+
|       NAME        | DESCRIPTION | DRIVER |             SOURCE              | USED BY |
+-------------------+-------------+--------+---------------------------------+---------+
| local-filestorage |             | dir    | /opt/storages/local-filestorage | 160     |
+-------------------+-------------+--------+---------------------------------+---------+

ls -la /var/snap/lxd/common/lxd/storage-pools/local-filestorage

total 0
drwx--x--x 2 root root  6 Sep  9 23:10 .
drwx--x--x 3 root root 31 Aug  9 13:17 ..

ls -la /opt/storages/local-filestorage

total 494466744
drwxr-xr-x 3 root root 4096 Sep 9 22:07 .
drwxr-xr-x 3 root root 31 Aug 9 13:16 ..
drwxr-xr-x 83 root root 4096 Sep 9 21:49 containers

So, how do I get rid of this container? :slight_smile:

Hi,

I tried re-creating the setup you had:

lxc storage create test dir source=/opt/lxd/test
lxc launch ubuntu:18.04 c1 -s test
ls -la /opt/lxd/test/containers
lxc delete -f c1

And this worked fine. I also tried creating a new profile that used the "test" pool, and that worked fine too.
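
For reference, the profile variant was along these lines (the profile and container names here are just illustrative):

lxc profile copy default testprofile
lxc profile device set testprofile root pool test
lxc launch ubuntu:18.04 c2 -p testprofile
lxc delete -f c2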

I'm going to try it with the snap package and see if this is a snap-specific issue.

Can you try stopping the container first please?

Hi,

Thank you for your feedback!

Of course the container is and was stopped.

Your setup is slightly different.

When running the init, I refer to an existing profile that points to an existing storage pool.

Your command creates a container without a specific profile, but with a specific storage pool.

So again, the scenario is:

1. There is no default storage pool on the host.
2. No storage pool is given in the init call.
3. The init call refers to a profile that provides the storage for the / filesystem (see the sketch below).
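
Something like this should reproduce the scenario (pool and profile names as in my setup above):

lxc storage create local-filestorage dir source=/opt/storages/local-filestorage
lxc profile create lxc4107
lxc profile device add lxc4107 root disk path=/ pool=local-filestorage
lxc profile device add lxc4107 eth0 nic nictype=bridged parent=ovsbr
lxc init images:debian/9 lxc4107 -p lxc4107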

It's definitely not impossible that something is wrong with the host anyway.

When I run this command (lxc init images:debian/9 $container -p $container) from the console, logged in via ssh, it works fine.

If I run the identical command as an ssh command (ssh $hostnode "lxc init images:debian/9 $container -p $container"), nothing at all happens.

The command is executed on the host node, but it doesn't do anything while it's running:

root 4978 0.0 0.0 159076 9368 ? Ss 21:23 0:00 _ sshd: root@notty
root 6757 0.0 0.0 231920 17228 ? Ssl 21:23 0:00 _ lxc init images:debian/9 lxc4107 -s local-filestorage -p lxc4107

strace shows nothing either:

strace -p 6757

strace: Process 6757 attached
read(0,

I have never experienced anything like this and wanted to ask whether I might have hit an unknown bug when I first created the container without a default storage pool, in combination with the snap package.

If nothing else helps, I will have to reboot the host to see if that solves things.

Thanks for the extra detail. I had tried with a profile pointing to the storage pool rather than the -s flag, but I will also try without a default pool available and see if I can recreate it.

The strange thing is that the error message being generated comes from here:

And that is just an "rm -rf" command, which shouldn't fail even if the directory doesn't exist.

You'll note as well that there is nothing after the colon in the error message, but there really should be some output from the failed command there. So it looks like "rm -rf" is returning a non-zero error code with no output to stderr.
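
For context, that code path does roughly the following (a paraphrased sketch, not the verbatim LXD source):

```go
package storage // paraphrased sketch, not the verbatim LXD source

import (
	"fmt"
	"os"

	"github.com/lxc/lxd/shared"
)

// deleteContainerDir approximates the failing delete path in the dir
// backend: if the mountpoint appears to exist, try os.RemoveAll, then
// fall back to an "rm -Rf".
func deleteContainerDir(containerMntPoint string) error {
	if shared.PathExists(containerMntPoint) {
		err := os.RemoveAll(containerMntPoint)
		if err != nil {
			// os.RemoveAll can fail on very long paths, hence the fallback.
			output, err := shared.RunCommand("rm", "-Rf", containerMntPoint)
			if err != nil {
				// Only stdout ("output") makes it into the message, which
				// is why the reported error ends right after the colon.
				return fmt.Errorf("error removing %s: %s", containerMntPoint, output)
			}
		}
	}
	return nil
}
```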

Hi,

Yes, quite strange.

I even manually created the directory:

/var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107

to satisfy it. But that didn't matter; it keeps on being unhappy. ^^;

What happens if you run as root:

rm -Rf /var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107

Hi,

rm -Rf /var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107

It's not there, so nothing happens.

And if I create it:

mkdir -p /var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107

rm -Rf /var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107

ls -la /var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107

ls: cannot access '/var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/lxc4107': No such file or directory

ls -la /var/snap/lxd/common/lxd/storage-pools/local-filestorage/containers/

total 0
drwxr-xr-x 2 root root 6 Sep 10 11:26 .
drwxr-xr-x 3 root root 24 Sep 10 11:26 ..

So the system just behaves normally at this point.

Thanks, I tried with the following setup on snap:

sudo lxc storage list
+------+-------------+--------+---------------+---------+
| NAME | DESCRIPTION | DRIVER |    SOURCE     | USED BY |
+------+-------------+--------+---------------+---------+
| test |             | dir    | /opt/lxd/test | 2       |
+------+-------------+--------+---------------+---------+
sudo lxc profile show test
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr1
    type: nic
  root:
    path: /
    pool: test
    type: disk
name: test
used_by:
- /1.0/containers/c1
sudo lxc config show c1
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20190813.1)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20190813.1"
  image.type: squashfs
  image.version: "18.04"
  volatile.base_image: 2dd611e2689a8efc45807bd2a86933cf2da0ffc768f57814724a73b5db499eac
  volatile.eth0.host_name: vethbb728ff9
  volatile.eth0.hwaddr: 00:16:3e:74:5e:23
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
devices: {}
ephemeral: false
profiles:
- test
stateful: false
description: ""

It deleted fine with lxc delete -f c1. What distro are you using? Are there any errors in the non-LXD logs?

The reason I ask is that there is a known issue with the shared.PathExists function: if it gets any error from the OS other than "not exists", it returns true, indicating the folder does exist, even if it actually doesn't and the OS has simply denied access to it for some reason. This would then trigger the rm -Rf command, which may then fail for the same reason.
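
For reference, shared.PathExists is essentially the following; only a clean "does not exist" error returns false, so e.g. a permission error is reported as the path existing:

```go
package shared

import "os"

// Essentially shared.PathExists: only a clean "does not exist" error
// returns false; any other error (e.g. EACCES) is treated as "exists".
func PathExists(name string) bool {
	_, err := os.Lstat(name)
	if err != nil && os.IsNotExist(err) {
		return false
	}
	return true
}
```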

So that error message needs to be changed: with the recent fixes to RunCommand, stderr is in err, not in output. I'll send something that at least makes the error handling more reasonable so we can tell what's going on.
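
Applied to the delete path sketched above, the change would be something along these lines (illustrative only, not the actual patch):

```go
// Illustrative only: include err in the message as well, since with the
// reworked RunCommand the stderr text travels in the returned error
// rather than in the command's stdout ("output").
output, err := shared.RunCommand("rm", "-Rf", containerMntPoint)
if err != nil {
	return fmt.Errorf("error removing %s: %s: %v", containerMntPoint, output, err)
}
```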