asad
(Asad )
July 15, 2020, 1:34pm
1
I am having a problem: when I launch more than 40 containers on a single server, it gives a disk I/O error:
Error: failed to add “Stopping container” Operation a92c002f-9117-4c0c-9775-5e8e074bbcbf to database: disk I/O error
The server on which I am launching the containers has 8 CPU cores, 32GB of RAM and a 500GB hard disk.
When I restart the snap service, it starts creating containers again, but the launch time decreases.
Can you please help me find out what the real problem is?
stgraber
(Stéphane Graber)
July 15, 2020, 1:51pm
2
I’d look at dmesg for disk errors as a starting point.
You should also look at the production setup page in the documentation; there are some system limits that can be raised a bit and may help on busy systems.
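As a starting point for the dmesg check suggested above, here is a minimal Python sketch that filters kernel log text for lines that often accompany disk trouble. The pattern list and the name `find_disk_errors` are assumptions made for illustration, and the patterns are not exhaustive; an empty result does not prove the disk is healthy.

```python
import re

# Assumed, non-exhaustive patterns for kernel messages that often
# accompany disk or filesystem trouble. Adjust for your own setup.
DISK_ERROR_PATTERNS = [
    r"I/O error",
    r"EXT4-fs error",
    r"Medium Error",
]
_DISK_ERROR_RE = re.compile("|".join(DISK_ERROR_PATTERNS), re.IGNORECASE)


def find_disk_errors(dmesg_text):
    """Return the lines of dmesg output matching any assumed pattern."""
    return [
        line
        for line in dmesg_text.splitlines()
        if _DISK_ERROR_RE.search(line)
    ]
```

To use it, capture the output of `dmesg` on the host (this may require root) and pass the text to `find_disk_errors`.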
asad
(Asad )
July 15, 2020, 2:50pm
3
I didn’t find anything on the production setup page regarding the disk I/O error. Furthermore, I am also getting the following error. Can you please elaborate on which system limits you mean should be raised?
[10157.668170] Task in /lxc.payload.pxgmxzqc killed as a result of limit of /lxc.payload.pxgmxzqc
[10157.681441] memory: usage 499984kB, limit 500000kB, failcnt 42564
[10157.687644] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[10157.695746] kmem: usage 8284kB, limit 9007199254740988kB, failcnt 0
[10157.702121] Memory cgroup stats for /lxc.payload.pxgmxzqc: cache:19364KB rss:472336KB rss_huge:6144KB mapped_file:14352KB dirty:0KB writeback:0KB inactive_anon:19232KB active_anon:472416KB inactive_file:0KB active_file:0KB unevictable:0KB
[10157.734647] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[10157.743847] [22199] 1000000 22199 34710 348 34 3 0 0 systemd
[10157.753736] [22279] 1000000 22279 11523 1069 25 3 0 0 systemd-journal
[10157.764870] [22313] 1000105 22313 11278 114 26 3 0 0 dbus-daemon
[10157.775673] [22319] 1000000 22319 9495 115 24 3 0 0 systemd-logind
[10157.786732] [22320] 1000000 22320 7416 72 19 3 0 0 cron
[10157.796939] [22386] 1000000 22386 5089 261 12 3 0 0 dhclient
[10157.807454] [22401] 1000107 22401 74160 1609 100 4 0 0 freshclam
[10157.818079] [22409] 1000000 22409 3575 33 12 3 0 0 agetty
[10157.828444] [22698] 1000106 22698 14070 191 28 3 0 0 exim4
[10157.838720] [27691] 1000000 27691 92464 4033 143 3 0 0 php-fpm7.0
[10157.849432] [27692] 1000033 27692 92560 2022 131 3 0 0 php-fpm7.0
[10157.860148] [27693] 1000033 27693 92581 2598 132 3 0 0 php-fpm7.0
[10157.870853] [27714] 1000000 27714 81710 1281 76 4 0 0 nginx
[10157.881124] [27715] 1000033 27715 81710 1385 78 4 0 0 nginx
[10157.891398] [27716] 1000033 27716 81710 1333 76 4 0 0 nginx
[10157.901996] [20277] 1000107 20277 192193 111620 299 4 0 0 freshclam
[10157.912318] Memory cgroup out of memory: Kill process 20277 (freshclam) score 895 or sacrifice child
[10157.922919] Killed process 20277 (freshclam) total-vm:768772kB, anon-rss:446480kB, file-rss:0kB, shmem-rss:0kB
[10157.956485] oom_reaper: reaped process 20277 (freshclam), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[10281.995622] php-fpm7.0 invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), nodemask=0, order=0, oom_score_adj=0
[10282.006715] php-fpm7.0 cpuset=lxc.payload.wiqhbwvm mems_allowed=0
[10282.013147] CPU: 3 PID: 3813 Comm: php-fpm7.0 Tainted: P O 4.9.0-12-amd64 #1 Debian 4.9.210-1+deb9u1
[10282.023528] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[10282.032845] 0000000000000000 ffffffff84536f6e ffffc0061841bdd8 ffff9d8a21afa380
[10282.040797] ffffffff84409beb 0000000000000000 0000000000000000 ffff9d896d04c000
[10282.050133] 0000000000000000 ffffffff850f57b0 ffffffff8440248c ffff9d8e7fffbb80
[10282.058079] Call Trace:
[10282.060631] [] ? dump_stack+0x66/0x88
[10282.067421] [] ? dump_header+0x78/0x1fd
[10282.073004] [] ? mem_cgroup_scan_tasks+0xcc/0x100
[10282.080842] [] ? oom_kill_process+0x22a/0x3f0
[10282.086948] [] ? out_of_memory+0x111/0x470
[10282.094176] [] ? mem_cgroup_out_of_memory+0x49/0x80
[10282.100803] [] ? mem_cgroup_oom_synchronize+0x325/0x340
[10282.109164] [] ? mem_cgroup_css_reset+0xd0/0xd0
[10282.115444] [] ? pagefault_out_of_memory+0x2f/0x80
[10282.123369] [] ? __do_page_fault+0x4bd/0x4f0
[10282.129387] [] ? page_fault+0x28/0x30
stgraber
(Stéphane Graber)
July 15, 2020, 2:57pm
4
Ah, well, in your case, you’re running out of memory by the looks of it.
The limit I had in mind is fs.aio-max-nr, which can cause errors like the one you had, though you should also look at why your system is running out of memory.
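To see how close a system is to this limit, the kernel exposes the current count and the maximum as `aio-nr` and `aio-max-nr` under `/proc/sys/fs`. The following is a minimal Python sketch that reads both values; the function names and the `proc_fs` parameter are hypothetical helpers introduced here, not part of LXD.

```python
from pathlib import Path


def read_aio_limits(proc_fs="/proc/sys/fs"):
    """Return (aio_nr, aio_max_nr) read from the given directory.

    The directory is a parameter only so the function can be pointed
    at a copy of the files; on a real system the default is used.
    """
    base = Path(proc_fs)
    aio_nr = int((base / "aio-nr").read_text().strip())
    aio_max_nr = int((base / "aio-max-nr").read_text().strip())
    return aio_nr, aio_max_nr


def aio_usage_fraction(aio_nr, aio_max_nr):
    """Return the share of the limit currently in use (0.0 if the limit is 0)."""
    return aio_nr / aio_max_nr if aio_max_nr else 0.0
```

Calling `aio_usage_fraction(*read_aio_limits())` on the host while containers are launching gives a rough idea of whether the limit is being approached. If it is, the limit can be raised with `sysctl`; the production setup page in the documentation is the place to look for a suggested value.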
asad
(Asad )
July 15, 2020, 3:28pm
5
Do we have to set this limit, “fs.aio-max-nr”, on the host or in the container?
asad
(Asad )
July 15, 2020, 4:00pm
7
My host has 32GB of RAM (memory); do I still need to set this limit on the host?
asad
(Asad )
July 15, 2020, 4:55pm
8
@stgraber Thanks a lot for the prompt response.
I have set the limit you asked me to set on the host, and it seems to be working fine for now, but I will test it further and update you soon.
One last thing: you also said that my system is running out of memory. Was it the memory of my host that was running out, or the memory of my container?
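The dmesg excerpt posted earlier in the thread offers a hint here: it names the cgroup `/lxc.payload.pxgmxzqc` and reports usage of 499984kB against a limit of 500000kB. That suggests a per-container memory limit was hit rather than the 32GB of host RAM. Below is a minimal Python sketch that pulls those figures out of such a log; the function name and the regular expressions are assumptions based only on the log format shown in this thread, and other kernel versions may print these lines differently.

```python
import re

# Patterns modelled on the log lines quoted in this thread (kernel 4.9).
_CGROUP_RE = re.compile(r"killed as a result of limit of (\S+)")
_MEMORY_RE = re.compile(
    r"\] memory: usage (\d+)kB, limit (\d+)kB, failcnt (\d+)"
)


def parse_memcg_oom(dmesg_text):
    """Return cgroup name, usage, limit and failcnt from the first
    memory-cgroup OOM report found, or None if there is none."""
    cgroup = _CGROUP_RE.search(dmesg_text)
    memory = _MEMORY_RE.search(dmesg_text)
    if not cgroup or not memory:
        return None
    usage_kb, limit_kb, failcnt = (int(v) for v in memory.groups())
    return {
        "cgroup": cgroup.group(1),
        "usage_kb": usage_kb,
        "limit_kb": limit_kb,
        "failcnt": failcnt,
    }
```

A cgroup path starting with `/lxc.payload.` together with usage sitting right at the limit indicates that the container's own memory limit triggered the OOM killer.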