Container deleted or stopped when lxc-ls executed concurrently

Hi,

I’m creating three LXC containers (v3.0.1) from a process and installing some RPM-based packages in them.
So basically three steps:

  1. Create container (shared host network namespace, lxc.net.0.type = none) and start it
  2. Copy packages to container
  3. Install packages

I’ve observed that if “lxc-ls” is executed while this process is doing its job, one of the containers either stops or disappears after having been created successfully.

There are no messages in dmesg either to suggest what might be going on.

Any idea where the problem could be?

Regards,
Dinesh

When exactly are you observing this during the creation phase?

Ok, I think I see what’s happening.

It’s happening when the containers are being created and the command (lxc-ls or lxc-info) is executed simultaneously. It’s pretty much 100% reproducible.

Yes, I know the cause and I have sent a PR, https://github.com/lxc/lxc/pull/2526, that fixes this. The problem is that a while back we switched to OFD (open file description) locks for thread-safety reasons. The kernel does a few things differently with those locks. One thing is that it doesn’t want the l_pid field of the lock struct to be set to anything other than 0, and also that it initializes the l_pid field to -1. I didn’t account for that before. With that fix your error is not reproducible.
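
For anyone who wants to see the l_pid behaviour described above outside of LXC, here is a minimal sketch in C. It is not the code from the PR. It only exercises the kernel interface as documented in fcntl(2), and it assumes Linux 3.15 or later with glibc. The function names (ofd_try_lock, ofd_test_lock, ofd_demo) and the temporary file path are made up for illustration.

```c
#define _GNU_SOURCE /* exposes F_OFD_SETLK / F_OFD_GETLK with glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to take a non-blocking exclusive OFD lock on the
 * whole file. l_pid_value is passed straight into the struct so the effect
 * of a non-zero l_pid can be observed. Returns 0 or the errno value. */
static int ofd_try_lock(int fd, pid_t l_pid_value)
{
    struct flock lk;

    memset(&lk, 0, sizeof(lk));
    lk.l_type = F_WRLCK;
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0; /* 0 means "to end of file" */
    lk.l_pid = l_pid_value;

    if (fcntl(fd, F_OFD_SETLK, &lk) == -1)
        return errno;
    return 0;
}

/* Hypothetical helper: ask whether an exclusive lock could be placed.
 * Returns 1 if a conflicting lock exists (its l_pid is stored in *holder),
 * 0 if the file is free, -1 on error. */
static int ofd_test_lock(int fd, pid_t *holder)
{
    struct flock lk;

    memset(&lk, 0, sizeof(lk)); /* l_pid must be 0 here as well */
    lk.l_type = F_WRLCK;
    lk.l_whence = SEEK_SET;

    if (fcntl(fd, F_OFD_GETLK, &lk) == -1)
        return -1;
    if (lk.l_type == F_UNLCK)
        return 0;
    *holder = lk.l_pid;
    return 1;
}

/* Runs the scenario on a temporary file. Returns 0 if the kernel behaved
 * as described in fcntl(2), 1 if not, -1 if the setup failed. */
int ofd_demo(void)
{
    char path[] = "/tmp/ofd-demo-XXXXXX"; /* illustrative location */
    pid_t holder = 0;
    int fd1, fd2, bad, good, busy;

    fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;
    /* A second open() yields a second open file description. */
    fd2 = open(path, O_RDWR);
    if (fd2 < 0) {
        close(fd1);
        unlink(path);
        return -1;
    }

    bad = ofd_try_lock(fd1, getpid());    /* non-zero l_pid */
    good = ofd_try_lock(fd1, 0);          /* l_pid == 0 */
    busy = ofd_test_lock(fd2, &holder);   /* conflicts with fd1's lock */

    printf("non-zero l_pid: %s\n", bad ? strerror(bad) : "locked");
    printf("zero l_pid:     %s\n", good ? strerror(good) : "locked");
    printf("conflict: %d, reported l_pid: %ld\n", busy, (long)holder);

    close(fd2);
    close(fd1);
    unlink(path);
    return (bad == EINVAL && good == 0 && busy == 1 && holder == -1) ? 0 : 1;
}

#ifdef OFD_DEMO_MAIN
int main(void)
{
    return ofd_demo();
}
#endif
```

Built with something like cc -DOFD_DEMO_MAIN on a suitable kernel, the first attempt should be rejected with EINVAL, the second should succeed, and the query through the second descriptor should report a conflicting lock whose l_pid is -1 rather than a real PID. Code that still treats l_pid as a process ID, the way it would with classic POSIX record locks, can be tripped up by either of these.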

Great! … Thanks a lot, Christian. Appreciate the quick RCA and fix.

Regards,

Dinesh