/proc files show host values via API/CLI, and /proc/diskstats is blank on ZFS

Two questions on /proc files, to see if this is expected behavior or a bug (hopefully not my fault again like last time *knock on wood* :nerd_face:)

1) /proc files show different values when read via API / lxc client vs inside container:
When viewing /proc/* files from inside the container (i.e. lxc exec c2 -- bash), the values of /proc/meminfo and /proc/stat show the correct limits that were set when the container was created.

But when reading the /proc/* files via lxc file pull c2/proc/stat . or the API (/1.0/instances/c2/files?path=/proc/stat), the contents of the files show the host’s info.

My assumption is that the lxcfs /proc overlay isn’t applied when reading via the API / lxc client? Is this expected behavior, or should this report the container’s /proc values?
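
For reference, the overlay itself is visible from inside the container’s mount table; checking it looks roughly like this (output trimmed to the lxcfs entries, mount options elided):

# Inside the container: lxcfs FUSE mounts sit on top of the individual /proc files
root@c2:~# grep lxcfs /proc/self/mounts
lxcfs /proc/cpuinfo fuse.lxcfs rw,... 0 0
lxcfs /proc/meminfo fuse.lxcfs rw,... 0 0
lxcfs /proc/stat fuse.lxcfs rw,... 0 0
...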

My goal is to read the /proc files as the container sees them, from the host, without having to run a program inside the container. Is that possible?
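
For context, the API read I’m doing is roughly the following (a sketch; the unix socket path is the usual one for a snap install):

# From the host, via the LXD unix socket (snap socket path assumed)
curl -s --unix-socket /var/snap/lxd/common/lxd/unix.socket \
  "lxd/1.0/instances/c2/files?path=/proc/meminfo"
# Returns the same host values as lxc file pull below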

Output for /proc/meminfo and /proc/stat via client and inside container:

# Shows host total (4 GB)
lxc file pull c2/proc/meminfo . (or API: /1.0/instances/c2/files?path=/proc/meminfo)

MemTotal:        4038872 kB
MemFree:         1816836 kB
MemAvailable:    3051636 kB
...
# Shows container total (512 MB)
root@c2:~# cat /proc/meminfo
MemTotal:         500000 kB
MemFree:          137676 kB
MemAvailable:     446764 kB
...
# Host has 4 CPU cores
lxc file pull c2/proc/stat .
cpu  56365 0 2148 3535408 0 0 0 0 0 0
cpu0 20749 0 463 876148 0 0 0 0 0 0
cpu1 22001 0 378 876856 0 0 0 0 0 0
cpu2 6624 0 454 892603 0 0 0 0 0 0
cpu3 6991 0 853 889801 0 0 0 0 0 0
intr 3881954 42 9 0 0 859...
# 2 CPUs allocated to the container
root@c2:~# cat /proc/stat
cpu  416 0 0 1808832 0 0 0 0 0 0
cpu0 395 0 0 903281 0 0 0 0 0 0
cpu1 21 0 0 905551 0 0 0 0 0 0
intr 3903315 42 9 0 0 859...

2) Empty /proc/diskstats on ZFS containers
Containers using ZFS storage have a blank /proc/diskstats, while containers on directory storage do show stats in /proc/diskstats. Is this a ZFS limitation? If so, is there any way to get disk stats (ops and bytes read/written) for containers on ZFS?

An OpenZFS contributor pointed me to this PR that adds per-dataset kstats (I/O ops and bytes read/written) for ZFS: https://github.com/openzfs/zfs/pull/7705. Could this be usable for implementing /proc/diskstats for ZFS-backed containers?
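
If I understand the PR correctly, those kstats show up on the host under /proc/spl/kstat/zfs/<pool>/ on a new-enough ZFS (I believe 0.8+; the host below is still on 0.7, so they aren’t available here). A rough sketch of what that would look like for this pool (the objset ID is hypothetical):

# Hypothetical, assuming the PR's per-dataset kstats are present (ZFS 0.8+)
cat /proc/spl/kstat/zfs/pool1/objset-0x36
# ... dataset_name, writes, nwritten, reads, nread, ...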

Container profile & info:
Host/Container OS: Ubuntu 18.04
Snap: lxd 4.0.0 14503 latest/stable

Disclaimer: This is a dev environment, a VirtualBox Ubuntu host running on macOS. Adding this just in case it could be causing any weirdness.

architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu bionic amd64 (20200402_07:42)
  image.os: Ubuntu
  image.release: bionic
  image.serial: "20200402_07:42"
  image.type: squashfs
  limits.cpu: "2"
  limits.memory: 512MB
  security.devlxd: "false"
  security.idmap.isolated: "true"
  security.nesting: "false"
  security.privileged: "false"
  user.network-config: |
    #cloud-config
    version: 1
    config:
    - type: physical
      name: eth0
      subnets:
        - type: static
          address: 192.168.1.218/32
          gateway: 192.168.1.254
          dns_nameservers:
            - 1.1.1.1
            - 8.8.8.8
  user.user-data: |
    #cloud-config
    preserve_hostname: false
    hostname: c2
  volatile.base_image: 5f6884b0ebbbf559d03390354cb262e3908e2af2e27362e9ddb9805925f017d3
  volatile.eth0.host_name: p4
  volatile.idmap.base: "1458752"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1458752,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":1458752,"Nsid":0,"Maprange":65536}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1458752,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":1458752,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1458752,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":1458752,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    host_name: cport1
    hwaddr: 00:16:3e:97:bc:42
    name: eth0
    nictype: bridged
    parent: br0
    security.mac_filtering: "true"
    type: nic
  root:
    path: /
    pool: pool1
    size: 3GB
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Possible relevant log info from LXD:

Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: ==> Setting up ZFS (0.7)
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: ==> Escaping the systemd cgroups
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: ====> Detected cgroup V1
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: ==> Escaping the systemd process resource limits
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: ==> Increasing the number of inotify user instances
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: ==> Disabling shiftfs on this kernel (auto)
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: => Starting LXCFS
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: Running constructor lxcfs_init to reload liblxcfs
Apr 11 14:44:26 ubuntu-bionic kernel: [   41.634985] new mount options do not match the existing superblock, will be ignored
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: mount namespace: 4
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: hierarchies:
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   0: fd:   5:
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   1: fd:   6: name=systemd
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   2: fd:   7: freezer
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   3: fd:   8: cpuset
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   4: fd:   9: memory
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   5: fd:  10: rdma
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   6: fd:  11: perf_event
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   7: fd:  12: pids
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   8: fd:  13: cpu,cpuacct
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:   9: fd:  14: devices
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:  10: fd:  15: net_cls,net_prio
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:  11: fd:  16: blkio
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]:  12: fd:  17: hugetlb
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: api_extensions:
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - cgroups
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - sys_cpu_online
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - proc_cpuinfo
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - proc_diskstats
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - proc_loadavg
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - proc_meminfo
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - proc_stat
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - proc_swaps
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - proc_uptime
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - shared_pidns
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - cpuview_daemon
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - loadavg_daemon
Apr 11 14:44:26 ubuntu-bionic lxd.daemon[1541]: - pidfds

lxcfs shows the values relevant to the process accessing the file. Normally you access the files from a process within the container, so you see the container values.

When running lxc file pull, it’s instead LXD on the host (outside the container) reading the file, so you see the host values. That’s normal.

We have some of the memory/CPU information in lxc info, which may be sufficient to avoid attaching to the container.

Otherwise, just use lxc exec. lxc file pull does 90% of what lxc exec does, so you’re not saving much by using it, and if you’re reading multiple files it may actually be worse.
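
For example, the instance state API gives memory and CPU usage from the host without entering the container; a rough sketch (fields trimmed):

# From the host: usage numbers without attaching to the container
lxc query /1.0/instances/c2/state
# {
#   "memory": { "usage": ..., "usage_peak": ..., ... },
#   "cpu": { "usage": ... },   # cumulative CPU time in nanoseconds
#   ...
# }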

@stgraber that makes sense, thank you for the response!

Regarding the blank /proc/diskstats on ZFS-backed containers (viewing that file inside the container via lxc exec smtp-inbound01 -- bash):

Is that due to how ZFS handles I/O in a way that isn’t readable by lxcfs? If so, could the OpenZFS PR referenced in the top post (https://github.com/openzfs/zfs/pull/7705) be a possible way to read per-dataset stats, or is that too far outside the scope of what lxcfs wants to touch?

# Container using directory storage
lxc exec c2 -- bash
root@c2:~# cat /proc/diskstats 
8       0 sda 3315 881 119112 56918 66 1 680 10042 0 17845 0
# Container using ZFS pool
lxc exec smtp-inbound01 -- bash
root@smtp-inbound01:~# cat /proc/diskstats 
root@smtp-inbound01:~# 

Can you compare the content of /sys/fs/cgroup/blkio in both containers?

I suspect we have device stats recorded in the dir one but not in the zfs one.
That’s where lxcfs gets its data.

Ah yeah, it appears they are empty on the ZFS container.

# ZFS container
root@smtp-inbound01:/sys/fs/cgroup/blkio# cat blkio.io_service_bytes
Total 0
root@smtp-inbound01:/sys/fs/cgroup/blkio# cat blkio.io_serviced
Total 0
root@smtp-inbound01:/sys/fs/cgroup/blkio# cat blkio.io_service_time
Total 0
root@smtp-inbound01:/sys/fs/cgroup/blkio#
# Directory container
root@c2:/sys/fs/cgroup/blkio# cat blkio.io_service_bytes
8:0 Read 21024768
8:0 Write 0
8:0 Sync 21024768
8:0 Async 0
8:0 Total 21024768
Total 21024768
root@c2:/sys/fs/cgroup/blkio# cat blkio.io_serviced
8:0 Read 1265
8:0 Write 0
8:0 Sync 1265
8:0 Async 0
8:0 Total 1265
Total 1265
root@c2:/sys/fs/cgroup/blkio# cat blkio.io_service_time
8:0 Read 5864980740
8:0 Write 0
8:0 Sync 5864980740
8:0 Async 0
8:0 Total 5864980740
Total 5864980740
root@c2:/sys/fs/cgroup/blkio#