Lxc ulimit stack size


(Vincent) #1

Good Morning,
I am trying to install Oracle DB 12c in a container.
Host: Ubuntu 18.04.1
Container: Guest Centos7
LXC Client version: 3.0.3
LXC Server version: 3.0.3

Oracle requires the following limits to be set:
$ lxc config set DB12C01Centos7 limits.kernel.nofile 1024:65536
$ lxc config set DB12C01Centos7 limits.kernel.nproc 16384:16384
$ lxc config set DB12C01Centos7 limits.kernel.stack 10240:32768
$ lxc config set DB12C01Centos7 limits.kernel.memlock 31876710:31876710

I have also set up /etc/security/limits.conf on the host:

* soft nofile 1024
* hard nofile 65536
* soft nproc 16384
* hard nproc 16384
* soft stack 10240
* hard stack 32768
* soft memlock 31876710
* hard memlock 31876710

Setting limits.kernel.stack to 10240:32768 prevents the container from starting:

$ lxc start DB12C01Centos7
Error: Failed to run: /usr/lib/lxd/lxd forkstart DB12C01Centos7 /var/lib/lxd/containers /var/log/lxd/DB12C01Centos7/lxc.conf:
Try lxc info --show-log DB12C01Centos7 for more info

$ lxc info --show-log DB12C01Centos7
Name: DB12C01Centos7
Remote: unix://
Architecture: x86_64
Created: 2018/12/20 10:29 UTC
Status: Stopped
Type: persistent
Profiles: default

Log:

lxc DB12C01Centos7 20190116091919.862 ERROR conf - conf.c:run_buffer:338 - Script terminated by signal 11
lxc DB12C01Centos7 20190116091919.862 ERROR conf - conf.c:lxc_setup:3589 - Failed to run mount hooks
lxc DB12C01Centos7 20190116091919.862 ERROR start - start.c:do_start:1263 - Failed to setup container "DB12C01Centos7"
lxc DB12C01Centos7 20190116091919.862 ERROR sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 5)
lxc DB12C01Centos7 20190116091919.862 WARN network - network.c:lxc_delete_network_priv:2589 - Operation not permitted - Failed to remove interface "eth0" with index 11
lxc DB12C01Centos7 20190116091919.862 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:842 - Received container state "ABORTING" instead of "RUNNING"
lxc DB12C01Centos7 20190116091919.863 ERROR start - start.c:__lxc_start:1939 - Failed to spawn container "DB12C01Centos7"
lxc 20190116091919.865 WARN commands - commands.c:lxc_cmd_rsp_recv:132 - Connection reset by peer - Failed to receive response for command "get_state"

Here is the container configuration:
$ lxc config show DB12C01Centos7
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Centos 7 amd64 (20181216_02:16)
  image.os: Centos
  image.release: "7"
  image.serial: "20181216_02:16"
  limits.kernel.memlock: 31876710:31876710
  limits.kernel.nofile: 1024:65536
  limits.kernel.nproc: 16384:16384
  limits.kernel.stack: 10240:32768
  security.privileged: "true"
  volatile.base_image: dbe252826d389ca9db5be119c5ff8d52fe4e242e8e541d72abb19f35b27a05fe
  volatile.eth0.hwaddr: 00:16:3e:a0:14:b2
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
devices:
  eth0:
    ipv4.address: 10.10.10.100
    nictype: bridged
    parent: lxdbr0
    type: nic
  forward10022:
    connect: tcp:10.10.10.100:22
    listen: tcp:0.0.0.0:10022
    type: proxy
ephemeral: false
profiles:
- default
stateful: false
description: ""

Thanks for your answer
Vincent


(Stéphane Graber) #2

Did you track down exactly which one of the limits is the problem?

I suspect the LXD daemon itself may have lower ulimits applied to it, which would prevent the bump, but it would be useful to know which limit is the problem and at what value it starts.
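
One way to check this is to read the limits the kernel has recorded for the running daemon from /proc/<pid>/limits. The following is a minimal sketch, assuming a Linux host and assuming the daemon runs under the process name lxd; the helper name proc_limit is made up for illustration.

```shell
#!/bin/sh
# proc_limit PID NAME: print the soft and hard value of one resource limit
# for a process, as recorded in /proc/PID/limits (Linux only).
# NAME is the label used in that file, e.g. "Max stack size".
proc_limit() {
    pid=$1
    name=$2
    # Lines look like: "Max stack size   8388608   unlimited   bytes"
    sed -n "s/^$name  *\([^ ][^ ]*\)  *\([^ ][^ ]*\).*/soft=\1 hard=\2/p" "/proc/$pid/limits"
}

# Limits of the current process, as a baseline.
proc_limit self "Max stack size"

# Limits of the LXD daemon (assumption: its process name is "lxd").
lxd_pid=$(pgrep -x lxd | head -n 1)
if [ -n "$lxd_pid" ]; then
    proc_limit "$lxd_pid" "Max stack size"
fi
```

The same helper works for the other labels in that file, such as "Max open files" or "Max locked memory". The stack size in /proc/<pid>/limits is listed in bytes.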


(Vincent) #3

Hello Stéphane,
Thanks for this very quick answer.
I tried the following setting:
$ lxc config set DB12C01Centos7 limits.kernel.stack 5000:5000
$ lxc config show DB12C01Centos7
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Centos 7 amd64 (20181216_02:16)
  image.os: Centos
  image.release: "7"
  image.serial: "20181216_02:16"
  limits.kernel.memlock: 31876710:31876710
  limits.kernel.nofile: 1024:65536
  limits.kernel.nproc: 16384:16384
  limits.kernel.stack: 5000:5000
  security.privileged: "true"
  volatile.base_image: dbe252826d389ca9db5be119c5ff8d52fe4e242e8e541d72abb19f35b27a05fe
  volatile.eth0.hwaddr: 00:16:3e:a0:14:b2
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
devices:
  eth0:
    ipv4.address: 10.10.10.100
    nictype: bridged
    parent: lxdbr0
    type: nic
  forward10022:
    connect: tcp:10.10.10.100:22
    listen: tcp:0.0.0.0:10022
    type: proxy
ephemeral: false
profiles:
- default
stateful: false
description: ""

I get the same error
$ lxc start DB12C01Centos7
Error: Failed to run: /usr/lib/lxd/lxd forkstart DB12C01Centos7 /var/lib/lxd/containers /var/log/lxd/DB12C01Centos7/lxc.conf:
Try lxc info --show-log DB12C01Centos7 for more info

$ lxc info --show-log DB12C01Centos7
Name: DB12C01Centos7
Remote: unix://
Architecture: x86_64
Created: 2018/12/20 10:29 UTC
Status: Stopped
Type: persistent
Profiles: default

Log:

lxc DB12C01Centos7 20190116095432.476 ERROR conf - conf.c:run_buffer:338 - Script terminated by signal 11
lxc DB12C01Centos7 20190116095432.477 ERROR conf - conf.c:lxc_setup:3589 - Failed to run mount hooks
lxc DB12C01Centos7 20190116095432.477 ERROR start - start.c:do_start:1263 - Failed to setup container "DB12C01Centos7"
lxc DB12C01Centos7 20190116095432.477 ERROR sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 5)
lxc DB12C01Centos7 20190116095432.478 WARN network - network.c:lxc_delete_network_priv:2589 - Operation not permitted - Failed to remove interface "eth0" with index 13
lxc DB12C01Centos7 20190116095432.479 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:842 - Received container state "ABORTING" instead of "RUNNING"
lxc DB12C01Centos7 20190116095432.485 ERROR start - start.c:__lxc_start:1939 - Failed to spawn container "DB12C01Centos7"
lxc 20190116095432.512 WARN commands - commands.c:lxc_cmd_rsp_recv:132 - Connection reset by peer - Failed to receive response for command "get_state"

The parameter that causes the problem is limits.kernel.stack. Lowering its value does not solve the issue.

All other parameters are accepted and appear properly in the container.

Thanks
Vincent


(Stéphane Graber) #4

@brauner ideas?


(Vincent) #5

Good evening,
Any news?
If setting these parameters in the container is too complex, it would be great if LXC could inherit them from the host.
Thanks


(Stéphane Graber) #6

We were both traveling for work in South Africa, so we were a bit busy that week. Christian is back home now and may have a bit more time to look into this.

@brauner


(Christian Brauner) #7

Where is stack located in /proc/sys?


(Vincent) #8

Hello Christian,

I do not know where this parameter is stored in the kernel.
What I can say is that the limits are configured in
/etc/security/limits.conf or /etc/security/limits.d/limits.conf

in the format:
user soft stack 10240
user hard stack 32768

To activate those limits, /etc/pam.d/login needs to contain the following line:
session required pam_limits.so

Afterwards, to check whether the limits have been applied:
$ ulimit -Ss shows the soft stack limit: 10240
$ ulimit -Hs shows the hard stack limit: 32768
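
A note on units: the shell's ulimit -s reports the stack limit in units of 1024 bytes, the same unit limits.conf uses for stack, whereas the kernel's own view in /proc/<pid>/limits is in bytes. The sketch below prints both side by side. It assumes a Linux system and a POSIX shell; the helper name kib_to_bytes is illustrative.

```shell
#!/bin/sh
# kib_to_bytes VALUE: convert a ulimit-style value in KiB to bytes,
# passing the literal "unlimited" through unchanged.
kib_to_bytes() {
    if [ "$1" = "unlimited" ]; then
        echo unlimited
    else
        echo $(( $1 * 1024 ))
    fi
}

soft_kib=$(ulimit -Ss)   # soft stack limit as the shell reports it (KiB)
hard_kib=$(ulimit -Hs)   # hard stack limit as the shell reports it (KiB)

echo "soft: $soft_kib KiB = $(kib_to_bytes "$soft_kib") bytes"
echo "hard: $hard_kib KiB = $(kib_to_bytes "$hard_kib") bytes"

# The kernel's view of the same limit, in bytes, for comparison.
if [ -r /proc/self/limits ]; then
    grep '^Max stack size' /proc/self/limits
fi
```

With the values above, 10240 and 32768 from ulimit correspond to 10485760 and 33554432 bytes.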

I hope this helps
Vincent


(Christian Brauner) #9

Ah, I get it now. This is neither LXC’s nor LXD’s fault. You set limits.kernel.stack under the assumption that it is specified in KiB when it is actually specified in bytes. So you end up with a way too small stack, and LXC and any program started in there get killed with SIGSEGV by the kernel because they’re crossing stack boundaries. What you want is to convert KiB to bytes:

limits.kernel.stack: 10485760:33554432
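
The conversion can be scripted so the arithmetic is not done by hand. This is a small sketch that takes the KiB figures from the first post (10240 soft, 32768 hard) and prints the matching lxc config set command instead of running it; the helper name stack_limit_bytes is hypothetical.

```shell
#!/bin/sh
# stack_limit_bytes SOFT_KIB HARD_KIB: print "soft:hard" in bytes,
# the unit limits.kernel.stack is specified in.
stack_limit_bytes() {
    echo "$(( $1 * 1024 )):$(( $2 * 1024 ))"
}

# Stack requirement from the first post, given in KiB.
# 10240 KiB is 10485760 bytes; 1048576 bytes would be only 1024 KiB.
value=$(stack_limit_bytes 10240 32768)

# Print the command rather than running it, so it can be reviewed first.
echo "lxc config set DB12C01Centos7 limits.kernel.stack $value"
```

This prints lxc config set DB12C01Centos7 limits.kernel.stack 10485760:33554432. Read as bytes, the earlier settings 10240:32768 and 5000:5000 allowed a soft stack of only about 10 KiB and 5 KB respectively, which is consistent with the mount hook being terminated by signal 11.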

(Vincent) #10

Thanks Christian

I will try