Lxc list Disk IO

Hello,

The SSD that was holding my containers has racked up a lot of disk usage and is now showing “Percent_Lifetime_Remain: FAILING_NOW”. Is it possible to identify which container is causing the high disk usage?

# zpool status ssdpool1
  pool: ssdpool1
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:12:55 with 0 errors on Fri Dec  4 19:42:56 2020
config:

        NAME                                               STATE     READ WRITE CKSUM
        ssdpool1                                           ONLINE       0     0     0
          mirror-0                                         ONLINE       0     0     0
            ata-Crucial_CT525MX300SSD1_1643146640C9-part1  ONLINE       0     0     0
            ata-CT500MX500SSD1_1904E1E531BD-part1          ONLINE       0     0     0

errors: No known data errors

First disk

smartctl -a /dev/disk/by-id/ata-CT500MX500SSD1_1904E1E531BD
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0
  5 Reallocate_NAND_Blk_Cnt 0x0032   100   100   010    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       9420
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       207
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 Ave_Block-Erase_Count   0x0032   001   001   000    Old_age   Always       -       1485
174 Unexpect_Power_Loss_Ct  0x0032   100   100   000    Old_age   Always       -       179
180 Unused_Reserve_NAND_Blk 0x0033   000   000   000    Pre-fail  Always       -       43
183 SATA_Interfac_Downshift 0x0032   100   100   000    Old_age   Always       -       0
184 Error_Correction_Count  0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   060   042   000    Old_age   Always       -       40 (Min/Max 0/58)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Bogus_Current_Pend_Sect 0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
202 Percent_Lifetime_Remain 0x0030   001   001   001    Old_age   Offline  FAILING_NOW 99
206 Write_Error_Rate        0x000e   100   100   000    Old_age   Always       -       0
210 Success_RAIN_Recov_Cnt  0x0032   100   100   000    Old_age   Always       -       0
246 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       85294967330
247 Host_Program_Page_Count 0x0032   100   100   000    Old_age   Always       -       1535743237
248 FTL_Program_Page_Count  0x0032   100   100   000    Old_age   Always       -       25793616794

Second disk

smartctl -a /dev/disk/by-id/ata-Crucial_CT525MX300SSD1_1643146640C9
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0
  5 Reallocate_NAND_Blk_Cnt 0x0032   100   100   010    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       29250
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       629
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 Ave_Block-Erase_Count   0x0032   071   071   000    Old_age   Always       -       444
174 Unexpect_Power_Loss_Ct  0x0032   100   100   000    Old_age   Always       -       263
183 SATA_Interfac_Downshift 0x0032   100   100   000    Old_age   Always       -       0
184 Error_Correction_Count  0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   066   053   000    Old_age   Always       -       34 (Min/Max 21/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
202 Percent_Lifetime_Remain 0x0030   071   071   001    Old_age   Offline      -       29
206 Write_Error_Rate        0x000e   100   100   000    Old_age   Always       -       0
246 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       103362358015
247 Host_Program_Page_Count 0x0032   100   100   000    Old_age   Always       -       3241509061
248 FTL_Program_Page_Count  0x0032   100   100   000    Old_age   Always       -       8624516660
180 Unused_Reserve_NAND_Blk 0x0033   000   000   000    Pre-fail  Always       -       1933
210 Success_RAIN_Recov_Cnt  0x0032   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

Thanks
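
For reference, the raw SMART counters above can be turned into rough totals. The sketch below assumes 512-byte LBAs for Total_LBAs_Written and uses the commonly cited Micron/Crucial estimate of write amplification, (Host_Program_Page_Count + FTL_Program_Page_Count) / Host_Program_Page_Count. Both are assumptions rather than something the smartctl output states, and the function names are made up for illustration.

```shell
# Convert a Total_LBAs_Written raw value to whole decimal gigabytes.
# ASSUMPTION: one LBA is 512 bytes on these drives.
lbas_to_gb() {
    echo $(( $1 * 512 / 1000000000 ))
}

# Estimate write amplification from attributes 247 and 248.
# ASSUMPTION: WAF = (host pages + FTL pages) / host pages.
write_amp() {
    awk -v h="$1" -v f="$2" 'BEGIN {
        if (h == 0) { print "n/a"; exit }
        printf "%.1f\n", (h + f) / h
    }'
}
```

With the values above, lbas_to_gb 85294967330 gives 43671 and write_amp 1535743237 25793616794 gives 17.8 for the MX500, against 52921 and 3.7 for the MX300. If the assumptions hold, the failing disk has received fewer host writes but performed far more internal writes per host write.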

You could run something like iotop and look at the individual I/O heavy processes.
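
To get a rough per-container total rather than per-process numbers, the write_bytes counters in /proc/<pid>/io can be summed over every PID in a container's cgroup. This is a minimal sketch, assuming a cgroup v1 layout such as /sys/fs/cgroup/pids/lxc.payload.<name> (an assumption about the host; the function name is hypothetical). It only counts processes that are still running, and it needs root to read other users' /proc/<pid>/io.

```shell
# Sum write_bytes from <proc_root>/<pid>/io for every PID listed in any
# cgroup.procs file at or below the given cgroup directory.
# The second argument defaults to /proc and exists mainly for testing.
cgroup_write_bytes() {
    cgroup_dir=$1
    proc_root=${2:-/proc}
    total=0
    for pid in $(find "$cgroup_dir" -name cgroup.procs -exec cat {} + 2>/dev/null); do
        # A process may exit between listing and reading; treat that as 0.
        bytes=$(awk '/^write_bytes:/ {print $2}' "$proc_root/$pid/io" 2>/dev/null)
        total=$(( total + ${bytes:-0} ))
    done
    echo "$total"
}
```

Looping over the lxc.payload.* directories and calling cgroup_write_bytes on each one gives a single number per container to compare.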

Thanks, I ran iotop for some time. It seems the systemd-journald processes are causing the writes, as well as MySQL. Is there any way to identify which containers these systemd-journald processes belong to and disable the logging?

# iotop -o -P -a
   PID  PRIO USER       DISK READ  DISK WRITE SWAPIN   IO>    COMMAND
  29398 be/4 1000000       0.00 B      7.28 M  0.00 %  0.00 % systemd-journald
  21530 be/4 usbmux        4.75 M      7.12 M  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --~mysql --pid-file=/run/mysqld/mysqld.pid
  21167 be/4 1000000       0.00 B      6.83 M  0.00 %  0.00 % systemd-journald
3912744 be/4 root        748.00 K      5.07 M  0.00 %  0.03 % [kworker/u98:14-kcryptd/254:3]
3679976 be/4 root        624.00 K      3.79 M  0.00 %  0.04 % [kworker/u98:2-kcryptd/254:10]
1743571 be/4 1000000       0.00 B      3.18 M  0.00 %  0.00 % systemd-journald
3912743 be/4 root        248.00 K      2.05 M  0.00 %  0.01 % [kworker/u98:10-kcryptd/254:9]
3991211 be/4 root        272.00 K      2.01 M  0.00 %  0.01 % [kworker/u98:11-kcryptd/254:8]
3991209 be/4 root        360.00 K   1936.00 K  0.00 %  0.04 % [kworker/u98:4-kcryptd/254:10]
   7878 be/4 root          0.00 B   1744.00 K  0.00 %  0.00 % systemd-journald
  34640 be/4 1000000       0.00 B   1728.00 K  0.00 %  0.00 % systemd-journald
3799867 be/4 root        232.00 K   1416.00 K  0.00 %  0.01 % [kworker/u97:6-kcryptd/254:8]
3679978 be/4 root        272.00 K   1308.00 K  0.00 %  0.01 % [kworker/u98:7-kcryptd/254:9]
3991210 be/4 root        116.00 K   1148.00 K  0.00 %  0.00 % [kworker/u98:6-kcryptd/254:5]

If you run ps fauxwww you’ll see a process tree which leads back to the container name.

I used

# grep -rn 29398 /sys/fs/cgroup/pids/*
/sys/fs/cgroup/pids/lxc.payload.proxy/system.slice/systemd-journald.service/cgroup.procs:1:29398
/sys/fs/cgroup/pids/lxc.payload.proxy/system.slice/systemd-journald.service/tasks:1:29398
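
The container name can also be read straight from /proc/<pid>/cgroup, which avoids grepping the whole cgroup tree. This is a minimal sketch, assuming the lxc.payload.<name> cgroup naming seen in the grep output above (other LXC/LXD versions may name the cgroup differently); the function names are hypothetical.

```shell
# Extract the container name from one or more cgroup paths, e.g.
#   12:pids:/lxc.payload.proxy/system.slice/systemd-journald.service
# Prints nothing if no lxc.payload.<name> component is present.
container_from_cgroup() {
    printf '%s\n' "$1" |
        sed -n 's|.*lxc\.payload\.\([^/]*\).*|\1|p' |
        head -n 1
}

# Look up the container that owns a host PID.
pid_to_container() {
    container_from_cgroup "$(cat "/proc/$1/cgroup" 2>/dev/null)"
}
```

On the host, pid_to_container 29398 should then print proxy for the journald process above.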

I looked into the proxy container (running nginx) and saw this:

# journalctl -r
Dec 10 23:44:27 proxy agetty[783422]: /dev/tty1: cannot open as standard input: No such file or directory
Dec 10 23:44:27 proxy agetty[783418]: /dev/lxc/tty3: cannot open as standard input: No such file or directory
Dec 10 23:44:27 proxy agetty[783419]: /dev/lxc/tty4: cannot open as standard input: No such file or directory
Dec 10 23:44:27 proxy agetty[783421]: /dev/lxc/tty6: cannot open as standard input: No such file or directory
Dec 10 23:44:27 proxy agetty[783416]: /dev/lxc/tty1: cannot open as standard input: No such file or directory
Dec 10 23:44:27 proxy agetty[783420]: /dev/lxc/tty5: cannot open as standard input: No such file or directory
Dec 10 23:44:27 proxy agetty[783417]: /dev/lxc/tty2: cannot open as standard input: No such file or directory
Dec 10 23:44:27 proxy systemd[1]: Started Getty on tty1.
Dec 10 23:44:27 proxy systemd[1]: Stopped Getty on tty1.
Dec 10 23:44:27 proxy systemd[1]: Started Getty on lxc/tty6.
Dec 10 23:44:27 proxy systemd[1]: Stopped Getty on lxc/tty6.
Dec 10 23:44:27 proxy systemd[1]: Started Getty on lxc/tty5.
Dec 10 23:44:27 proxy systemd[1]: Stopped Getty on lxc/tty5.
Dec 10 23:44:27 proxy systemd[1]: Started Getty on lxc/tty4.
Dec 10 23:44:27 proxy systemd[1]: Stopped Getty on lxc/tty4.
Dec 10 23:44:27 proxy systemd[1]: Started Getty on lxc/tty3.
Dec 10 23:44:27 proxy systemd[1]: Stopped Getty on lxc/tty3.
Dec 10 23:44:27 proxy systemd[1]: Started Getty on lxc/tty2.
Dec 10 23:44:27 proxy systemd[1]: Stopped Getty on lxc/tty2.
Dec 10 23:44:27 proxy systemd[1]: Started Getty on lxc/tty1.
Dec 10 23:44:27 proxy systemd[1]: Stopped Getty on lxc/tty1.
Dec 10 23:44:27 proxy systemd[1]: getty@tty1.service: Scheduled restart job, restart counter is at 111288.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty6.service: Scheduled restart job, restart counter is at 111299.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty5.service: Scheduled restart job, restart counter is at 111288.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty4.service: Scheduled restart job, restart counter is at 111292.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty3.service: Scheduled restart job, restart counter is at 111289.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty1.service: Scheduled restart job, restart counter is at 111287.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty2.service: Scheduled restart job, restart counter is at 111288.
Dec 10 23:44:27 proxy systemd[1]: getty@tty1.service: Succeeded.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty6.service: Succeeded.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty5.service: Succeeded.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty4.service: Succeeded.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty3.service: Succeeded.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty1.service: Succeeded.
Dec 10 23:44:27 proxy systemd[1]: getty@lxc-tty2.service: Succeeded.
Dec 10 23:44:17 proxy agetty[783414]: /dev/tty1: cannot open as standard input: No such file or directory
Dec 10 23:44:17 proxy systemd[1]: Started Getty on tty1.
Dec 10 23:44:17 proxy systemd[1]: Stopped Getty on tty1.
Dec 10 23:44:17 proxy systemd[1]: getty@tty1.service: Scheduled restart job, restart counter is at 111287.
Dec 10 23:44:17 proxy systemd[1]: getty@tty1.service: Succeeded.
Dec 10 23:44:17 proxy agetty[783408]: /dev/lxc/tty1: cannot open as standard input: No such file or directory
Dec 10 23:44:17 proxy agetty[783411]: /dev/lxc/tty4: cannot open as standard input: No such file or directory
Dec 10 23:44:17 proxy agetty[783410]: /dev/lxc/tty3: cannot open as standard input: No such file or directory
Dec 10 23:44:17 proxy agetty[783409]: /dev/lxc/tty2: cannot open as standard input: No such file or directory
Dec 10 23:44:17 proxy agetty[783412]: /dev/lxc/tty5: cannot open as standard input: No such file or directory
Dec 10 23:44:17 proxy agetty[783413]: /dev/lxc/tty6: cannot open as standard input: No such file or directory

Why is systemd going crazy? It is an Arch Linux container.
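
One way to see which units account for most of the journal traffic is to count the "Scheduled restart job" lines per unit. This is a minimal sketch, assuming journalctl's default short output format as shown above, where the unit name is the sixth whitespace-separated field; restart_counts is a hypothetical helper name.

```shell
# Read journal lines on stdin and print "<count> <unit>" for every unit
# that logged a "Scheduled restart job" message, busiest first.
restart_counts() {
    awk '/Scheduled restart job/ {
             unit = $6              # e.g. "getty@tty1.service:"
             sub(/:$/, "", unit)    # drop the trailing colon
             n[unit]++
         }
         END { for (u in n) print n[u], u }' | sort -rn
}
```

Inside the container, journalctl -b --no-pager | restart_counts prints one line per restarting unit, with the noisiest unit at the top.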

Looks like an LXC container that got converted to LXD, maybe?

LXD doesn’t have /dev/ttyX, so that may be the issue. Look for those getty@lxc-* jobs and disable them; that should help.

I followed this: https://github.com/lxc/lxc/issues/2638#issuecomment-501230249

systemctl disable --now getty@lxc-tty1
systemctl disable --now getty@lxc-tty2
systemctl disable --now getty@lxc-tty3
systemctl disable --now getty@lxc-tty4
systemctl disable --now getty@lxc-tty5
systemctl disable --now getty@lxc-tty6

This got rid of all the tty messages except one:

Dec 10 23:53:57 proxy agetty[783857]: /dev/tty1: cannot open as standard input: No such file or directory
Dec 10 23:53:57 proxy systemd[1]: Started Getty on tty1.
Dec 10 23:53:57 proxy systemd[1]: Stopped Getty on tty1.
Dec 10 23:53:57 proxy systemd[1]: getty@tty1.service: Scheduled restart job, restart counter is at 111344.
Dec 10 23:53:57 proxy systemd[1]: getty@tty1.service: Succeeded.
Dec 10 23:53:47 proxy agetty[783856]: /dev/tty1: cannot open as standard input: No such file or directory
Dec 10 23:53:47 proxy systemd[1]: Started Getty on tty1.
Dec 10 23:53:47 proxy systemd[1]: Stopped Getty on tty1.
Dec 10 23:53:47 proxy systemd[1]: getty@tty1.service: Scheduled restart job, restart counter is at 111343.
Dec 10 23:53:47 proxy systemd[1]: getty@tty1.service: Succeeded.

How do I reduce the spam?

I never used LXC here; it was an LXD container to begin with. /dev/tty1 is still spamming, as the log above shows.

Finally, after this, the spamming stopped:

systemctl disable getty@tty1.service
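
For anyone hitting the same thing, both steps can be combined. This small sketch only prints the unit names that were disabled in this thread (getty@tty1 plus getty@lxc-tty1 to getty@lxc-tty6), so the list can be reviewed before it is handed to systemctl. getty_units is a hypothetical helper, and the default count of 6 is taken from the log above.

```shell
# Print getty@tty1.service followed by getty@lxc-tty1.service up to
# getty@lxc-tty<N>.service (N defaults to 6, as seen in the journal).
getty_units() {
    count=${1:-6}
    echo "getty@tty1.service"
    n=1
    while [ "$n" -le "$count" ]; do
        echo "getty@lxc-tty$n.service"
        n=$(( n + 1 ))
    done
}
```

Inside the container, as root, the output can be passed on with systemctl disable --now $(getty_units).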