Trouble deleting containers on BTRFS

I was tracking an issue that I can’t find right now… it was semi-related to https://github.com/lxc/lxd/issues/3775, but was more about checking whether the btrfs subvolume exists before deleting it…

I ran across a similar issue in 2.18 and have been waiting for the fix to come through in 2.19… well, with the repo freeze my metal never saw 2.19, but I finally got 2.20 the other day and the issue still exists, so that didn’t fix it…

The issue that I’m facing only appears to happen when I delete a container through the REST API… I’ve never seen it happen when I delete via the CLI.

It’s sporadic, but frequent… sometimes I can delete without error, but usually the REST API returns a 400. Even so, the container is actually deleted afterwards, and the subvolume is actually gone.

So again, my metal is 2.20 with btrfs storage… reproducing now…

lxc info:

root@wyzsrv:~# lxc info
config:
  core.https_address: '[::]:8443'
  core.trust_password: true
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses:
  - 192.168.1.75:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIIFQTCCAymgAwIBAgIRALo7K41xPnu2nKOEJiWH3jowDQYJKoZIhvcNAQELBQAw
    NDEcMBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzEUMBIGA1UEAwwLcm9vdEB3
    eXpzcnYwHhcNMTcxMDE4MTgxNDI4WhcNMjcxMDE2MTgxNDI4WjA0MRwwGgYDVQQK
    ExNsaW51eGNvbnRhaW5lcnMub3JnMRQwEgYDVQQDDAtyb290QHd5enNydjCCAiIw
    DQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAMBYv9/zg+yPeBR/l+wco1hkr6Zl
    TVaBamgcCbySoYYqoBCQUiRig/37gzESMJScFLtYPAelPZKbUlo+1gldr3eKXgG8
    1rK2CWRuTj0oCAgknZC4fQWObDEHJKqvr7BjZRNJachs7fz+pMtoH6zg3NSmsFnU
    +LuRuk+tSozpTpqBBjsBwXlAL+nWAr6UfjreiZamqC1kVb/gMYKvqYaTUQ7+uv7B
    DR4ABNsOarkGiAXmgfUevIgXnDkekOIC8JybrxNvRJn2ifyPI4AMwH2R9l2KUX7j
    K8w7pKhzsfyVxeLKFAdD8m5QyGiaAWmIumNLX+t6nKATK2RToU2NtVlWZNePVp6D
    3PER8dCXJ29ek3e5vYN2wAX9ixGScTJVk/aG7Jrij5mY538IsRb+BMm+U1S5NQVw
    iWMl3HYyo24Fc3E3+SNDDL/D+hewU0Zp7KMCu0ACte+ltXjDIr84FnV+hWAmru4d
    PWELhhqoRBZQG+3WGKDyuz0gXZmjlLa+PQKCFQRmy35DAqa0+NTzwzjhfWCpQHc4
    84DiWZ7FSX/yObKk0ts0h9sBU/k0/NRnvlbBtts8dujbVcBBghyvbXlIaV0GpHCJ
    Z24iPjVrmLHTdd+0mKaFHnJ8P+Fyesnghl4DkznurcRPt3f15wGfgkQEiGeTo4IE
    Sn0rDo7T7muwIg1XAgMBAAGjTjBMMA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAK
    BggrBgEFBQcDATAMBgNVHRMBAf8EAjAAMBcGA1UdEQQQMA6CBnd5enNydocEwKgB
    SzANBgkqhkiG9w0BAQsFAAOCAgEAai3yF7lYVUwH08NWXFWwYyxuApLxWtDjYEC4
    5Titd5w1SERkznAWEe6G88q7InDqIF5w6z7In46mMjfd8BPSdV0+0Z0m2IYBul7L
    rvYEuT3vywJWNZuidzFLacLQVoMoQiAK6eFUWZ13s9NDxDXOw34fLIEoDmylFf/W
    ilZ/4He8HFRCikFG67F1qrVTJw0ODDdDjuuWqsl9pzjsHDub5qpMFQvyMMRE76lJ
    fnjGylEtrUpUxIfwtd/bU9iEFU6NNvaM/eZCkL5C3hK6Qihj8MIG/e79b4FXUt8U
    bHTFZyjIy7QuQN6tJYp+PqjwNCtnoHXP36mCTp+0eTR5g9o0XkFNzR+FaBYtHNBA
    4O44EQDOxfrOwaGcC5HvSQWLjVSZwbvcZXTZ65ZFJgkQtwaCwUmgTlAY0QcqA+Mz
    l3bC5SDzLa6tf/xYFFs6aJzsOJemgUkeZDWCgYdDX/53NZDd6ju19/V3H0k4JhKW
    zyy3ZdnWMYkV1s70WB/gScoOzLPNJgD3Ai4YL036xRxXE1xt6zqa7z10F54xQUVL
    T5r3HB3x/FFjW+ipUAFVGMV7fP4HCnpcVks2qX1I9PE6bVACaLf2pFSUKhPn5dSq
    ZgAQeCLDCwDNEIg1NsLxAEWnAHusytGbCnXfOgxi8rY8zgbcoRkTfR69OTpZbEOB
    z8Ls2V8=
    -----END CERTIFICATE-----
  certificate_fingerprint: 332a47ba4f50d5365fb13fb2fc16281f14a577bc3b9e22a42f970908707b8e08
  driver: lxc
  driver_version: 2.0.8
  kernel: Linux
  kernel_architecture: x86_64
  kernel_version: 4.4.0-101-generic
  server: lxd
  server_pid: 1785
  server_version: "2.20"
  storage: btrfs
  storage_version: "4.4"

The next command goes through Ruby test-kitchen and through the driver I’m writing, which uses the REST API to do a force stop followed by a delete… there’s no real diagnostic info available without rewriting something, though the CI on my submodules may be more helpful - let me know if needed. (That’s how I know it’s a 400, at least.)

C:\Users\Sean\Documents\projects\lxd_nexus>kitchen destroy 16
-----> Starting Kitchen (v1.17.0)
WARN: Unresolved specs during Gem::Specification.reset:
      minitest (~> 5.1)
      rake (>= 0)
WARN: Clearing out unresolved specs.
Please report a bug if this causes problems.
-----> Destroying <lxd-ubuntu-1604>...
       Utilizing REST interface at https://wyzsrv:8443
>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: 1 actions failed.
>>>>>>     Failed to complete #destroy action: [An existing connection was forcibly closed by the remote host.] on lxd-ubuntu-1604
>>>>>> ----------------------
>>>>>> Please see .kitchen/logs/kitchen.log for more details
>>>>>> Also try running `kitchen diagnose --all` for configuration

and on the host we get:

root@wyzsrv:~# systemctl status lxd
● lxd.service - LXD - main daemon
   Loaded: loaded (/lib/systemd/system/lxd.service; indirect; vendor preset: enabled)
   Active: active (running) since Fri 2017-11-24 12:59:55 MST; 7h ago
     Docs: man:lxd(1)
  Process: 1789 ExecStartPost=/usr/bin/lxd waitready --timeout=600 (code=exited, status=0/SUCCESS)
  Process: 1758 ExecStartPre=/usr/lib/x86_64-linux-gnu/lxc/lxc-apparmor-load (code=exited, status=0/SUCCESS)
 Main PID: 1785 (lxd)
    Tasks: 20
   Memory: 123.4M
      CPU: 53.441s
   CGroup: /system.slice/lxd.service
           ├─1785 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
           ├─1956 [lxc monitor] /var/lib/lxd/containers test
           ├─2105 [lxc monitor] /var/lib/lxd/containers test2
           └─4938 [lxc monitor] /var/lib/lxd/containers lxd-ubuntu-1404-25cb89ff3f43e9e4

Nov 24 12:59:52 wyzsrv systemd[1]: Starting LXD - main daemon...
Nov 24 12:59:53 wyzsrv lxd[1785]: lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." t=2017-11-24T12
Nov 24 12:59:55 wyzsrv systemd[1]: Started LXD - main daemon.
Nov 24 19:58:13 wyzsrv lxd[1785]: err="Failed to run: btrfs subvolume delete /var/lib/lxd/storage-pools/default/containers/lxd-ubuntu-16

and again the container is deleted and the subvolume is properly deleted

what more log output might you need for this one?

the last couple of minutes of dmesg:

[23132.688550] audit: type=1400 audit(1511576701.616:88): apparmor="STATUS" operation="profile_replace" label="lxd-lxd-ubuntu-1604-e47973768491c51c_</var/lib/lxd>//&:lxd-lxd-ubuntu-1604-e47973768491c51c_<var-lib-lxd>://unconfined" name="lxc-container-default" pid=9006 comm="apparmor_parser"
[23132.688598] audit: type=1400 audit(1511576701.616:89): apparmor="STATUS" operation="profile_replace" label="lxd-lxd-ubuntu-1604-e47973768491c51c_</var/lib/lxd>//&:lxd-lxd-ubuntu-1604-e47973768491c51c_<var-lib-lxd>://unconfined" name="lxc-container-default-cgns" pid=9006 comm="apparmor_parser"
[23132.688629] audit: type=1400 audit(1511576701.616:90): apparmor="STATUS" operation="profile_replace" label="lxd-lxd-ubuntu-1604-e47973768491c51c_</var/lib/lxd>//&:lxd-lxd-ubuntu-1604-e47973768491c51c_<var-lib-lxd>://unconfined" name="lxc-container-default-with-mounting" pid=9006 comm="apparmor_parser"
[23132.688660] audit: type=1400 audit(1511576701.616:91): apparmor="STATUS" operation="profile_replace" label="lxd-lxd-ubuntu-1604-e47973768491c51c_</var/lib/lxd>//&:lxd-lxd-ubuntu-1604-e47973768491c51c_<var-lib-lxd>://unconfined" name="lxc-container-default-with-nesting" pid=9006 comm="apparmor_parser"
[23161.739500] cgroup: new mount options do not match the existing superblock, will be ignored
[23169.560301] audit: type=1400 audit(1511576738.488:92): apparmor="STATUS" operation="profile_load" label="lxd-lxd-ubuntu-1404-25cb89ff3f43e9e4_</var/lib/lxd>//&:lxd-lxd-ubuntu-1404-25cb89ff3f43e9e4_<var-lib-lxd>://unconfined" name="/usr/lib/lxd/lxd-bridge-proxy" pid=9912 comm="apparmor_parser"
[23171.120233] audit: type=1400 audit(1511576740.048:93): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-lxd-ubuntu-1404-25cb89ff3f43e9e4_<var-lib-lxd>" profile="/usr/lib/lxd/lxd-bridge-proxy" name="/dev/pts/1" pid=10003 comm="lxd-bridge-prox" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
[23171.120279] audit: type=1400 audit(1511576740.048:94): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-lxd-ubuntu-1404-25cb89ff3f43e9e4_<var-lib-lxd>" profile="/usr/lib/lxd/lxd-bridge-proxy" name="/dev/pts/1" pid=10003 comm="lxd-bridge-prox" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
[23197.471224] audit: type=1400 audit(1511576766.396:95): apparmor="STATUS" operation="profile_load" label="lxd-lxd-ubuntu-1404-25cb89ff3f43e9e4_</var/lib/lxd>//&:lxd-lxd-ubuntu-1404-25cb89ff3f43e9e4_<var-lib-lxd>://unconfined" name="/usr/bin/lxc-start" pid=10244 comm="apparmor_parser"
[23205.167502] audit: type=1400 audit(1511576774.092:96): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-lxd-ubuntu-1404-25cb89ff3f43e9e4_<var-lib-lxd>" profile="/usr/lib/lxd/lxd-bridge-proxy" name="/dev/pts/1" pid=10341 comm="lxd-bridge-prox" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
[23205.167555] audit: type=1400 audit(1511576774.092:97): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-lxd-ubuntu-1404-25cb89ff3f43e9e4_<var-lib-lxd>" profile="/usr/lib/lxd/lxd-bridge-proxy" name="/dev/pts/1" pid=10341 comm="lxd-bridge-prox" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
[25120.469568] br0: port 4(vethX9Q14M) entered disabled state
[25120.469970] device vethX9Q14M left promiscuous mode
[25120.469979] br0: port 4(vethX9Q14M) entered disabled state
[25121.507884] audit: type=1400 audit(1511578690.433:98): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="lxd-lxd-ubuntu-1604-e47973768491c51c_</var/lib/lxd>" pid=10442 comm="apparmor_parser"
[25124.005352] BTRFS error (device dm-4): could not find root 8
[25124.011810] BTRFS error (device dm-4): could not find root 8

That has me curious… I’m not dealing with nested containers at this level, so the one recommended btrfs mount option (user_subvol_rm_allowed) shouldn’t matter here, but it is set:

root@wyzsrv:~# lxc storage show default
config:
  btrfs.mount_options: user_subvol_rm_allowed
  source: e3818fb3-db3d-4b7f-afbb-4ba84d0eb87e
  volatile.initial_source: /dev/storage/lxd-storage
description: ""
name: default
driver: btrfs
used_by:
- /1.0/containers/lxd-ubuntu-1404-25cb89ff3f43e9e4
- /1.0/containers/test
- /1.0/containers/test2
- /1.0/images/347f49fcb4ceada500d1bc53e0146b48b4e39074ef5895b184902d333120d5ed
- /1.0/images/5f364e2e3f460773a79e9bec2edb5e993d236f035f70267923d43ab22ae3bb62
- /1.0/images/d342a270e53280a608e747fb108b66a4687b9f0d4bc9e5532bd11eaf9427381e
- /1.0/profiles/default
root@wyzsrv:~#

note: the delete command above was for the Ubuntu 16.04 container… the 14.04 container still showing up here is expected

and yes my btrfs volume is held under lvm:

root@wyzsrv:~# pvs
  PV         VG      Fmt  Attr PSize   PFree
  /dev/md2   storage lvm2 a--   55.83g    0
  /dev/md3   storage lvm2 a--  590.21g    0
root@wyzsrv:~# vgs
  VG      #PV #LV #SN Attr   VSize   VFree
  storage   2   3   0 wz--n- 646.04g    0
root@wyzsrv:~# lvs
  LV           VG      Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lxd-storage  storage Vwi-aotz--   1.00t thinpool        0.41
  test-storage storage Vwi-a-tz--   1.00t thinpool        0.00
  thinpool     storage twi-aotz-- 645.88g                 0.65   0.73
root@wyzsrv:~#

For a while during these last couple of months I had a theory… the error seemed to never happen when I deleted the 14.04 container, and mostly happened when deleting the 16.04. More recently, though, I have actually seen it happen while deleting the 14.04, although it’s still rarer.

The difference behind that theory is that the nested 14.04 LXD host inside the container does not recognize the btrfs subsystem and uses the dir driver, whereas the nested 16.04 does recognize btrfs. But again, in this context nesting isn’t coming into play. There are no subcontainers.

of course - this one might help…

root@wyzsrv:/var/log/lxd# tail lxd.log.1
created=2017-11-25T02:21:14+0000 ephemeral=false lvl=info msg="Deleting container" name=lxd-ubuntu-1604-e47973768491c51c t=2017-11-24T19:58:10-0700 used=2017-11-25T02:21:19+0000
created=2017-11-25T02:21:14+0000 ephemeral=false lvl=info msg="Deleting container" name=lxd-ubuntu-1604-e47973768491c51c t=2017-11-24T19:58:10-0700 used=2017-11-25T02:21:19+0000
err="Failed to run: btrfs subvolume delete /var/lib/lxd/storage-pools/default/containers/lxd-ubuntu-1604-e47973768491c51c: ERROR: cannot delete '/var/lib/lxd/storage-pools/default/containers/lxd-ubuntu-1604-e47973768491c51c': No such file or directory\nDelete subvolume (no-commit): '/var/lib/lxd/storage-pools/default/containers/lxd-ubuntu-1604-e47973768491c51c'" lvl=eror msg="Failed deleting container storage" name=lxd-ubuntu-1604-e47973768491c51c t=2017-11-24T19:58:13-0700
created=2017-11-25T02:21:14+0000 ephemeral=false lvl=info msg="Deleted container" name=lxd-ubuntu-1604-e47973768491c51c t=2017-11-24T19:58:13-0700 used=2017-11-25T02:21:19+0000
lvl=info msg="Updating images" t=2017-11-25T01:01:07-0700
alias=ubuntu/xenial lvl=info msg="Downloading image" server=https://images.linuxcontainers.org t=2017-11-25T01:01:46-0700
alias=ubuntu/xenial lvl=info msg="Image downloaded" server=https://images.linuxcontainers.org t=2017-11-25T01:02:27-0700
lvl=info msg="Done updating images" t=2017-11-25T01:02:29-0700
lvl=info msg="Updating images" t=2017-11-25T07:02:29-0700
lvl=info msg="Done updating images" t=2017-11-25T07:03:04-0700
root@wyzsrv:/var/log/lxd#
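
The "No such file or directory" from `btrfs subvolume delete` in that log is the situation an existence check before deleting would guard against. Purely as an illustration, here is a minimal Python sketch of that idea. It is not how LXD implements it (LXD is written in Go), and the `delete_subvolume` name plus the injectable `run` and `exists` parameters are assumptions made so the logic can be tried without a real btrfs filesystem.

```python
import os
import subprocess


def delete_subvolume(path, run=subprocess.run, exists=os.path.isdir):
    """Delete a btrfs subvolume, treating "already gone" as success.

    `run` and `exists` are injectable (an assumption of this sketch) so the
    logic can be exercised without btrfs. Returns True if this call removed
    the subvolume, False if it was already absent.
    """
    if not exists(path):
        # Something else (for example a concurrent delete) already removed it.
        return False

    result = run(
        ["btrfs", "subvolume", "delete", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # The subvolume can vanish between the check and the delete;
        # only treat it as an error if the path is still there.
        if not exists(path):
            return False
        raise RuntimeError(
            f"btrfs subvolume delete failed: {result.stderr.strip()}"
        )
    return True
```

The check alone cannot close the window between testing and deleting, which is why the sketch also re-checks after a failed delete.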

and the first bit of my kitchen.log showing the actual exception stemming from the socket closure…

I, [2017-11-24T19:58:07.758544 #10296]  INFO -- lxd-ubuntu-1604: -----> Destroying <lxd-ubuntu-1604>...
I, [2017-11-24T19:58:07.763546 #10296]  INFO -- lxd-ubuntu-1604: Utilizing REST interface at https://wyzsrv:8443
E, [2017-11-24T19:58:10.144867 #10296] ERROR -- lxd-ubuntu-1604: Destroy failed on instance <lxd-ubuntu-1604>.
E, [2017-11-24T19:58:10.145367 #10296] ERROR -- lxd-ubuntu-1604: ------Exception-------
E, [2017-11-24T19:58:10.145868 #10296] ERROR -- lxd-ubuntu-1604: Class: Faraday::ConnectionFailed
E, [2017-11-24T19:58:10.145868 #10296] ERROR -- lxd-ubuntu-1604: Message: An existing connection was forcibly closed by the remote host.
E, [2017-11-24T19:58:10.145868 #10296] ERROR -- lxd-ubuntu-1604: ----------------------
E, [2017-11-24T19:58:10.146868 #10296] ERROR -- lxd-ubuntu-1604: ------Backtrace-------
E, [2017-11-24T19:58:10.148869 #10296] ERROR -- lxd-ubuntu-1604: C:/opscode/chefdk/embedded/lib/ruby/2.4.0/openssl/buffering.rb:182:in `sysread_nonblock'
E, [2017-11-24T19:58:10.149370 #10296] ERROR -- lxd-ubuntu-1604: C:/opscode/chefdk/embedded/lib/ruby/2.4.0/openssl/buffering.rb:182:in `read_nonblock'
E, [2017-11-24T19:58:10.149370 #10296] ERROR -- lxd-ubuntu-1604: C:/opscode/chefdk/embedded/lib/ruby/2.4.0/net/protocol.rb:172:in `rbuf_fill'
E, [2017-11-24T19:58:10.149870 #10296] ERROR -- lxd-ubuntu-1604: C:/opscode/chefdk/embedded/lib/ruby/2.4.0/net/protocol.rb:154:in `readuntil'
E, [2017-11-24T19:58:10.149870 #10296] ERROR -- lxd-ubuntu-1604: C:/opscode/chefdk/embedded/lib/ruby/2.4.0/net/protocol.rb:164:in `readline'
E, [2017-11-24T19:58:10.149870 #10296] ERROR -- lxd-ubuntu-1604: C:/opscode/chefdk/embedded/lib/ruby/2.4.0/net/http/response.rb:40:in `read_status_line'
... <truncated> ...

That is the real issue at play here - the premature socket closure. It’s performing the operation correctly, but not gracefully.

And that closure is ‘probably’ happening on the operation wait endpoint

Oh no, it’s not!! I dug through the stack trace and it actually happens on the initial delete call
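
Since the container does end up deleted even when the connection drops on the initial delete call, one possible client-side workaround is to confirm the result with a follow-up GET and treat a 404 as success. The Python sketch below only illustrates that idea and is not the actual driver code. The `request(method, path, body)` transport is hypothetical: it is assumed to return `(http_status, parsed_json)` and to raise `ConnectionError` when the peer closes the socket. The `retries`, `delay` and `sleep` parameters are likewise assumptions.

```python
import time


def destroy_container(name, request, retries=5, delay=1.0, sleep=time.sleep):
    """Force-stop then delete `name`, tolerating a dropped connection on DELETE.

    `request` is a hypothetical transport callable (see the text above);
    nothing here talks to a real LXD server by itself.
    """
    base = f"/1.0/containers/{name}"

    def wait(response):
        # Async LXD calls answer with an operation URL; block until it is done.
        operation = response.get("operation")
        if operation:
            request("GET", f"{operation}/wait", None)

    _status, body = request("PUT", f"{base}/state", {"action": "stop", "force": True})
    wait(body)

    try:
        _status, body = request("DELETE", base, None)
        wait(body)
    except ConnectionError:
        # The socket was closed on us; poll to see whether the delete
        # went through anyway before giving up.
        for _attempt in range(retries):
            status, _body = request("GET", base, None)
            if status == 404:
                return True
            sleep(delay)
        raise
    return True
```

This hides the symptom on the client side only; the premature socket closure on the server would still need its own fix.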

root@wyzsrv:/var/log/lxd/lxd-ubuntu-1604-e47973768491c51c# cat lxc.log
            lxc 20171125025809.361 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
            lxc 20171125025809.361 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
            lxc 20171125025809.460 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
            lxc 20171125025809.460 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
            lxc 20171125025809.508 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
            lxc 20171125025810.109 WARN     lxc_commands - commands.c:lxc_cmd_rsp_recv:177 - Command get_cgroup failed to receive response: Connection reset by peer.
root@wyzsrv:/var/log/lxd/lxd-ubuntu-1604-e47973768491c51c# cat lxc.conf
lxc.cap.drop = sys_time sys_module sys_rawio
lxc.mount.auto = proc:mixed sys:mixed
lxc.autodev = 1
lxc.pts = 1024
lxc.mount.entry = mqueue dev/mqueue mqueue rw,relatime,create=dir,optional
lxc.mount.entry = /dev/fuse dev/fuse none bind,create=file,optional
lxc.mount.entry = /dev/net/tun dev/net/tun none bind,create=file,optional
lxc.mount.entry = /proc/sys/fs/binfmt_misc proc/sys/fs/binfmt_misc none rbind,create=dir,optional
lxc.mount.entry = /sys/fs/fuse/connections sys/fs/fuse/connections none rbind,create=dir,optional
lxc.mount.entry = /sys/fs/pstore sys/fs/pstore none rbind,create=dir,optional
lxc.mount.entry = /sys/kernel/debug sys/kernel/debug none rbind,create=dir,optional
lxc.mount.entry = /sys/kernel/security sys/kernel/security none rbind,create=dir,optional
lxc.include = /usr/share/lxc/config/common.conf.d/
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = b *:* m
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 1:7 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 10:229 rwm
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.mount.entry = proc dev/.lxc/proc proc create=dir,optional
lxc.mount.entry = sys dev/.lxc/sys sysfs create=dir,optional
lxc.logfile = /var/log/lxd/lxd-ubuntu-1604-e47973768491c51c/lxc.log
lxc.loglevel = warn
lxc.arch = linux64
lxc.hook.pre-start = /usr/bin/lxd callhook /var/lib/lxd 184 start
lxc.hook.post-stop = /usr/bin/lxd callhook /var/lib/lxd 184 stop
lxc.tty = 0
lxc.utsname = lxd-ubuntu-1604-e47973768491c51c
lxc.mount.entry = /var/lib/lxd/devlxd dev/lxd none bind,create=dir 0 0
lxc.aa_profile = lxd-lxd-ubuntu-1604-e47973768491c51c_</var/lib/lxd>//&:lxd-lxd-ubuntu-1604-e47973768491c51c_<var-lib-lxd>:
lxc.seccomp = /var/lib/lxd/security/seccomp/lxd-ubuntu-1604-e47973768491c51c
lxc.rootfs.backend = dir
lxc.rootfs = /var/lib/lxd/containers/lxd-ubuntu-1604-e47973768491c51c/rootfs
lxc.network.0.type = veth
lxc.network.0.flags = up
lxc.network.0.link = br0
lxc.network.0.hwaddr = 00:16:3e:ae:aa:8e
lxc.network.0.name = eth0
lxc.mount.entry = /var/lib/lxd/shmounts/lxd-ubuntu-1604-e47973768491c51c dev/.lxd-mounts none bind,create=dir 0 0
root@wyzsrv:/var/log/lxd/lxd-ubuntu-1604-e47973768491c51c# ll
total 16
drwxr-xr-x 2 root root 4096 Nov 24 19:23 ./
drwxr-xr-x 6 root root 4096 Nov 25 12:59 ../
-rw-r--r-- 1 root root    0 Nov 24 19:23 forkexec.log
-rw-r--r-- 1 root root    0 Nov 24 19:21 forkstart.log
-rw-r--r-- 1 root root 2394 Nov 24 19:21 lxc.conf
-rw-r--r-- 1 root root  966 Nov 24 19:58 lxc.log
-rw-r--r-- 1 root root    0 Nov 24 19:21 lxc.log.old
root@wyzsrv:/var/log/lxd/lxd-ubuntu-1604-e47973768491c51c#

grrr… I hope you have a clue from the above logs.

no package updates today on my metal

Last night, as I typed these posts, I was able to make it fail on demand, and it had been failing regularly. Today I’ve gone through about half a dozen test cycles (converge-destroy-converge-destroy) and can’t make it fail. This is, of course, ideal, lol, but if I publish this lxd driver for test-kitchen, I’d like it to be reliable.
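
Because the failure comes and goes, it can help to put a number on it by repeating the cycle and recording which iterations fail. A minimal Python sketch of such a loop follows; `run_cycles` and the `cycle` callable are hypothetical names, with `cycle` standing in for one converge-destroy round.

```python
def run_cycles(cycle, count):
    """Run `cycle()` `count` times and collect the errors it raises.

    Returns a list of (iteration_index, exception) pairs; an empty list
    means every cycle completed cleanly.
    """
    failures = []
    for index in range(count):
        try:
            cycle()
        except Exception as exc:
            # Record the failure and keep going, so one bad cycle
            # does not hide how often the problem occurs.
            failures.append((index, exc))
    return failures
```

Dividing `len(failures)` by `count` gives a rough failure rate that can be compared between sequential and parallel runs.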

I know it’s the weekend and a holiday, but take a peek when you get a chance. I’d appreciate your insights

P.S.
ok I can still make it fail - I just changed my usage pattern

So today I was doing a kitchen destroy -p, which deletes the containers in parallel instead of in succession. Previously I’d been omitting the ‘-p’ because destruction is quick enough. This makes me suspect some funny race condition on the host side.

I just now ran kitchen destroy 16 (success) and then kitchen destroy 14 (fail),
and as usual the container is properly deleted; it’s just a bad response from the REST API

P.P.S. And now even the parallel destroy is failing

I’ll get my thoughts in order and bump this into a github issue tomorrow…