Docker Swarm in LXD container


(Huepf) #1

Hi,
I am trying to run Docker Swarm in an LXD container. "Normal" Docker is working.
After some configuration issues with the /.dockerenv file and some missing kernel modules, I got things running.
I can start an NGINX service with:
docker service create --detach=false -p 80:80 nginx
But there seems to be an issue with the ingress network: the Docker container can't be accessed from outside.

A bit more specific: I started the NGINX container. From within the nginx container I can connect to the outside world. No problem.
But when I'm on the LXD container and try to access it via curl localhost or curl hostname I get a timeout.

The iptables rules are being updated with DOCKER-INGRESS chains, but something seems to be wrong as there is no connectivity.
Some more research showed that packets go to the nginx container and from the nginx container to the ingress-sbox, but then nothing leaves the ingress-sbox. It seems to be related to the ip_vs load balancer in the ingress-sbox.
A Docker expert is probably needed here, but there is a connection to LXD, as the same setup works just fine when it doesn't run in an LXD container.

The logs show nothing suspicious.
[Edit] Maybe they do. If I start another Docker service I get warnings that kernel modules are missing from /lib/modules. However, I have added them in the docker profile and I created a file /.dockerenv, which should tell Docker not to look in /lib/modules for modules. Also, lsmod shows that all modules are available.
They are only warnings, so maybe they can be ignored?
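
For anyone who wants to double-check the module side, here is a rough sketch of the lsmod check as a script. It reads /proc/modules, which is the file lsmod formats. The function names and the MODULES_FILE override are made up for illustration, and ip_vs is used as the example only because it comes up above. Modules built into the kernel are not listed there, so "not listed" does not necessarily mean "missing".

```shell
#!/bin/sh
# File that lsmod reads; overridable (hypothetical variable) so the
# functions can be tried against a saved copy.
MODULES_FILE="${MODULES_FILE:-/proc/modules}"

# Succeeds if the given module name is the first field of some line.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "$MODULES_FILE"
}

# Print one status line per module name given as an argument.
report_modules() {
    for m in "$@"; do
        if module_loaded "$m"; then
            echo "$m: loaded"
        else
            echo "$m: not listed"
        fi
    done
}
```

After sourcing the file, something like "report_modules ip_vs" inside the LXD container should show whether the module is visible there.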

Any idea what the issue could be?


(Huepf) #2

We found the reason why the networking is not working.
IP forwarding needs to be enabled manually in the docker ingress-sbox namespace:
nsenter --net=/run/docker/netns/ingress_sbox sysctl -w net.ipv4.ip_forward=1
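
A slightly more careful variant of the command above, as a sketch: it first checks that the sandbox namespace exists, prints the current value, and only writes when forwarding is off. The function names and the INGRESS_NETNS variable are hypothetical; the path is the one from the command above. It needs root.

```shell
#!/bin/sh
# Path taken from the nsenter command above; override for other setups.
INGRESS_NETNS="${INGRESS_NETNS:-/run/docker/netns/ingress_sbox}"

# Translate the raw sysctl value into a readable state.
forwarding_state() {
    case "$1" in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

# Read net.ipv4.ip_forward inside the ingress sandbox (needs root).
read_ingress_forwarding() {
    nsenter --net="$INGRESS_NETNS" sysctl -n net.ipv4.ip_forward
}

# Enable forwarding only if it is currently off.
ensure_ingress_forwarding() {
    if [ ! -e "$INGRESS_NETNS" ]; then
        echo "no ingress sandbox at $INGRESS_NETNS" >&2
        return 1
    fi
    current="$(read_ingress_forwarding)" || return 1
    echo "ip_forward is $(forwarding_state "$current")"
    if [ "$current" != "1" ]; then
        nsenter --net="$INGRESS_NETNS" sysctl -w net.ipv4.ip_forward=1
    fi
}
```

After sourcing the file as root inside the LXD container, calling ensure_ingress_forwarding does the check and the write in one go.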

Why is that required in an LXD container and not in a "normal" environment? Any help is very much appreciated.

I also posted this issue in the Docker forum as it might require some expertise in both areas.
https://forums.docker.com/t/swarm-in-lxd-issue-with-overlay-network/43021


(Stéphane Graber) #3

/proc/sys/net/ipv4/ip_forward defaults to 0 on Linux. It could be that some other service or script in your distro usually flips it to 1 somehow, but it really shouldn't be assumed to be 1...
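
A quick way to see what the value actually is in whichever namespace you are in (host, LXD container, or a Docker sandbox entered via nsenter) is to read that file directly. A minimal sketch; the function name and the IP_FORWARD_FILE override are hypothetical and only there for illustration:

```shell
#!/bin/sh
# Default is the file named above; overridable so the function can be
# tried against a copy.
IP_FORWARD_FILE="${IP_FORWARD_FILE:-/proc/sys/net/ipv4/ip_forward}"

# Print the current forwarding value, or fail if the file is unreadable.
show_ip_forward() {
    if [ -r "$IP_FORWARD_FILE" ]; then
        printf 'net.ipv4.ip_forward = %s\n' "$(cat "$IP_FORWARD_FILE")"
    else
        echo "cannot read $IP_FORWARD_FILE" >&2
        return 1
    fi
}
```

Comparing the output on a setup that works with the output on one that doesn't would show whether something outside Docker is flipping the value.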


(Huepf) #4

Well, ingress-sbox is a Docker container that manages the network traffic in the Docker Swarm network, so Docker enables ip_forward for that container. It just doesn't do it if the Docker host is running in LXD.


(Martin John) #5

I can’t even start a swarm service with ports mapped under LXD (this is LXD 3.0.0 and Docker 18.03.1-ce under Ubuntu 18.04).

Normal Docker containers are fine. On the LXD container,
docker run -p 80:80 nginx
creates a Docker container that I can successfully connect to from the LXD container with "curl localhost".

Starting a service without any ports mapped,
docker service create --detach=false nginx
runs fine: I can (from inside the Docker container) successfully run "curl localhost" or "curl " from the LXD container.

But if I try to map a port, such as
docker service create --detach=false -p 80:80 nginx
it doesn’t start and just sits there with the following message:
1/1: container ingress-sbox is already present in sandbox ingress_sbox

The suggested
nsenter --net=/run/docker/netns/ingress_sbox sysctl -w net.ipv4.ip_forward=1
doesn’t seem to make any difference.

Any thoughts?


(Huepf) #6

We run privileged LXD containers. Maybe that is the difference.


(Martin John) #7

Hmm, I tried that, but it didn’t seem to make any difference. It still complains in the same way (before and after the nsenter command).

What OS and version of LXD and Docker are you using? I’ll see if I can work out where it stops working (or what I’m doing wrong)


(Huepf) #8

The nsenter command only helps with accessing a service in a stack from outside. If you have nginx running in a stack and publish the port, you need the nsenter command to be able to access the website from outside the stack.

I’m on Ubuntu 16.04, LXD 2.21 and Docker 18.04

I’ll test it on Ubuntu 18.04 and LXD 3 later. Need to do that anyway for when we upgrade our servers.


(Martin John) #9

Thanks for that.

Ok, it must be something I’m doing wrong as I get the same under those versions.

I created a clean Ubuntu 16.04 install on VMware vSphere and upgraded it to LXD 2.21 with:

apt update
apt dist-upgrade -y
reboot
apt install zfsutils-linux
apt install -t xenial-backports lxd

Ran through lxd init:

root@lxd-docker:~# lxd init
Do you want to configure a new storage pool (yes/no) [default=yes]?
Name of the new storage pool [default=default]:
Name of the storage backend to use (dir, btrfs, lvm, zfs) [default=zfs]:
Create a new ZFS pool (yes/no) [default=yes]?
Would you like to use an existing block device (yes/no) [default=no]?
Size in GB of the new loop device (1GB minimum) [default=15GB]: 4
Would you like LXD to be available over the network (yes/no) [default=no]?
Would you like stale cached images to be updated automatically (yes/no) [default=yes]?
Would you like to create a new network bridge (yes/no) [default=yes]?
What should the new bridge be called [default=lxdbr0]?
What IPv4 address should be used (CIDR subnet notation, "auto" or "none") [default=auto]?
What IPv6 address should be used (CIDR subnet notation, "auto" or "none") [default=auto]? none
LXD has been successfully configured.

Launched a privileged container:

lxc launch ubuntu:16.04 docker -c security.nesting=true -c security.privileged=true

In that container I installed Docker 18.04, started a swarm and then tried to create the service:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) edge"
apt update
apt install -y docker-ce
docker swarm init
docker service create --detach=false -p 80:80 nginx

And once again I got the “container ingress-sbox is already present in sandbox ingress_sbox” error.


(Huepf) #10

I was finally able to find the reason why Docker didn’t run smoothly in our LXD containers.
The host system needs libvirt-bin and qemu-kvm installed. This resolves every issue we had.

I discovered it because Docker was running smoothly on a freshly set up new server but still misbehaving on our old servers. So I compared the configs of the two, which should have been the same. It turned out those two packages were missing on the old servers. I assume that they are an install dependency of the latest LXD version but were not of LXD 2.x? Not sure. But I'm happy that it’s finally running!


(Jon Clayton) #11

When I installed Rancher in LXD I could only do so on BTRFS or DIR; ZFS is really slow and unusable. It has something to do with the FS that Docker uses, but I can’t remember the details. I also noticed that later versions of Docker would work, especially in 18.04. The earlier versions didn’t seem to.