Running integration tests

(Igor Galić) #1

Hi folks

we’re trying to work on a feature in LXD and as a first step have attempted to replicate a continuous test environment.

However, we had to run the unit tests in single-process mode, and the integration tests are failing in two different ways as well:

when running them by hand, remote_usage fails:

EROR[03-13|20:46:14] Failed to retrieve PID of executing child process: EOF 

when running them in gitlab-ci, basic_usage fails:

Starting c1
action=start created=2018-03-13T20:33:45+0000 ephemeral=false lvl=eror msg="Failed starting container" name=c1 stateful=false t=2018-03-13T20:33:45+0000 used=1970-01-01T00:00:00+0000
error: Failed to run: /root/go/bin/lxd forkstart c1 /root/go/src/ /root/go/src/ 
Try `lxc info --show-log lxd2:c1` for more info
==> Cleaning up

any ideas how to get this sorted out?

(Stéphane Graber) #2

The start error is usually because liblxc can’t traverse to the container’s path.
You could try this to fix it:

chmod +x /root /root/go

Assuming you have the source code under /root/go/src and don’t have a weird umask, this should fix the issue.
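
To see which directory is blocking traversal before reaching for chmod, a small check along these lines may help. It is only a sketch: check_traversable is a made-up helper name, it assumes GNU stat (as shipped on Ubuntu), and it assumes the execute bit for "other" is the one that matters here.

```shell
# check_traversable DIR (hypothetical helper, not part of LXD or liblxc)
# Walks from DIR up to / and prints every directory that lacks the
# execute (search) bit for "other". Prints nothing if all are searchable.
check_traversable() {
    # Resolve to an absolute, symlink-free path so the walk always ends at /.
    dir=$(cd "$1" && pwd -P) || return 2
    while :; do
        # GNU stat prints the mode as e.g. "drwx------"; the last
        # character is the execute bit for "other" (x or t when set).
        mode=$(stat -c %A "$dir") || return 2
        case $mode in
            *[xt]) ;;                              # searchable by others
            *) printf '%s %s\n' "$mode" "$dir" ;;  # blocks traversal
        esac
        [ "$dir" = / ] && break
        dir=$(dirname "$dir")
    done
}
```

Running check_traversable /root/go/src prints nothing when every component is searchable. Otherwise each offending directory is listed with its mode, and those are the ones to chmod +x.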

(Igor Galić) #3

what’s really interesting here is that the GOPATH seems to be /root/go, despite our setting of /srv/go in /etc/environment

perhaps we should set it explicitly in our .gitlab-ci.yml

(Igor Galić) #4

after fixing the GOPATH to /srv/go in .gitlab-ci, there’s still something struggling against it in the tests…
so we’re now passing it to make GOPATH=$GOPATH check as well…

if this fails, we should probably revert to simply doing chmod +x /root /root/go (in the morning)
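
One possible explanation for /root/go, not confirmed in this thread: when GOPATH is unset or empty, Go (1.8 and later) falls back to $HOME/go, and /etc/environment is normally applied by PAM at login, which a CI job's shell may never go through. The sketch below mirrors that fallback so the job can print what it would actually use. effective_gopath is a hypothetical helper; go env GOPATH remains the authoritative answer.

```shell
# effective_gopath (hypothetical helper)
# Prints $GOPATH when it is set and non-empty, otherwise the Go default
# of $HOME/go. This only mirrors the documented fallback; it does not
# ask the go tool itself.
effective_gopath() {
    printf '%s\n' "${GOPATH:-$HOME/go}"
}

# Possible use at the top of a CI job script:
#   echo "effective GOPATH: $(effective_gopath)"
#   export GOPATH=/srv/go            # pin it explicitly, as in the post
#   make GOPATH="$GOPATH" check
```

If the first echo in the job prints /root/go, the value from /etc/environment never reached the job, and exporting GOPATH in the job itself is the more reliable place to set it.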

(Igor Galić) #5

okay, this is still happening, despite the GOPATH being in /root/go and the permissions being correctly set:

root@gitlab-ci-shell-runner-root:~# namei -om /root/go/src/
f: /root/go/src/
 drwxr-xr-x root root /
 drwxr-xr-x root root root
 drwxr-xr-x root root go
 drwxr-xr-x root root src
 drwxr-xr-x root root
 drwxr-xr-x root root lxc
 drwxr-xr-x root root lxd
 drwxr-xr-x root root test

we’re still getting,

Creating last-used-at-test

167: eth0@if168: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue qlen 1000
lvl=eror msg="Failed to retrieve PID of executing child process: EOF" t=2018-03-14T09:57:21+0000
Creating deleterunning

Starting deleterunning
{"error":"container is running","error_code":400,"type":"error"}
Creating lxd-apparmor-test

Starting lxd-apparmor-test
==> Cleaning up
==> Killing LXD at /root/go/src/
==> Deleting all containers
==> Deleting all images
==> Deleting all networks
==> Deleting all profiles
Profile default deleted
==> Deleting all storage pools
Storage pool lxdtest-3tj deleted
==> Checking for locked DB tables
==> Checking for leftover files
==> Checking for leftover cluster DB entries
==> Tearing down directory backend in /root/go/src/

==> TEST DONE: basic usage
==> Test result: failure

(Stéphane Graber) #6

That looks different though: this is now failing at lxd-apparmor-test, which suggests some apparmor confinement test is failing on your system.

What kernel are you using and what distro is it?

(Igor Galić) #7

indeed! my reading was wrong here.

I’ve now modified the cleanup() function to simply exit 1 so i can post full logs

we’re running Ubuntu 16.04.4 LTS with kernel 4.4.0-116-generic.
here’s a gist of the lxd.log

(Stéphane Graber) #8

You could run the testsuite with LXD_VERBOSE=1 set in the environment; it’d show you exactly what part of the apparmor test is failing.

Are you running the testsuite inside a container? If so, that’d explain the failure and also why you’re likely to run into a bunch more problems. The LXD testsuite is meant to run as root on a host, not inside a container.
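
Both conditions (root, not inside a container) can be checked before starting a long run. The following is a rough sketch, not something the testsuite ships: preflight_report is a hypothetical helper, and the usage line assumes a systemd host where systemd-detect-virt --container prints "none" outside a container.

```shell
# preflight_report UID VIRT (hypothetical helper)
# UID  is the numeric user id, e.g. from `id -u`
# VIRT is the container type, "none" when not in a container
# Prints "ok" and returns 0 when both conditions hold; otherwise prints
# one line per problem and returns 1.
preflight_report() {
    uid=$1
    virt=$2
    ok=1
    if [ "$uid" != 0 ]; then
        echo "not running as root (uid=$uid)"
        ok=0
    fi
    if [ "$virt" != none ]; then
        echo "running inside a container ($virt)"
        ok=0
    fi
    if [ "$ok" = 1 ]; then echo ok; else return 1; fi
}

# Possible use, from the LXD checkout:
#   preflight_report "$(id -u)" "$(systemd-detect-virt --container)" \
#       && LXD_VERBOSE=1 make check
```

Passing the two values in as arguments keeps the check itself independent of the host, so it can be tried out without root.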

(Igor Galić) #9


problem solved

i’ve now compiled our lxc with libapparmor-dev libseccomp-dev libcap-dev installed, and now the integration tests are actually passing.
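
For anyone hitting the same thing, a quick way to spot the missing development packages before building lxc might look like this. It is a sketch under assumptions: missing_packages is a hypothetical helper, and it relies on dpkg-query, so it only applies to Debian/Ubuntu systems.

```shell
# missing_packages PKG... (hypothetical helper)
# Prints the name of every package that dpkg does not report as
# installed. A package dpkg has never heard of also counts as missing.
missing_packages() {
    for pkg in "$@"; do
        pkg_state=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null)
        case $pkg_state in
            *"install ok installed"*) ;;   # present, nothing to report
            *) echo "$pkg" ;;
        esac
    done
}

# Possible use before building lxc:
#   missing_packages libapparmor-dev libseccomp-dev libcap-dev
```

If anything is listed, installing it and rebuilding lxc is what made the integration tests pass here.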

(Stéphane Graber) #10

Haha, yeah, that’d explain it I guess :slight_smile:

(Igor Galić) #11

no containers are involved (yet … and then it’ll be lxd running them, not lxd running inside of them)

and running make check with LXD_VERBOSE=1 might be… a good tip… for later!

(Stéphane Graber) #12

@igalic I don’t suppose that helped with that clustering/raft issue? :slight_smile:

I’m still confused by that one as I can’t seem to reproduce it here.

(Igor Galić) #13

you mean TestHeartbeat failing unless run with GOMAXPROCS=1?
i can test that again for you and will report back (on the github issue, if it’s related to this!)
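
Since the failure seems to depend on scheduling, one run proves little either way. A rough way to compare the two settings is to repeat the test and count failures. In this sketch count_failures is a hypothetical helper, and ./lxd/cluster/ in the usage lines is an assumed package path for TestHeartbeat.

```shell
# count_failures N CMD... (hypothetical helper)
# Runs CMD N times, discarding its output, and prints how many runs
# exited with a non-zero status.
count_failures() {
    n=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Possible use, from the LXD checkout (package path is an assumption):
#   count_failures 10 go test -count=1 -run TestHeartbeat ./lxd/cluster/
#   count_failures 10 env GOMAXPROCS=1 go test -count=1 -run TestHeartbeat ./lxd/cluster/
```

Going through env sets GOMAXPROCS only for the go test process, and -count=1 keeps newer Go versions from reusing a cached result.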

(Igor Galić) #14

@stgraber your hunch was right! I’m gonna close the bug… after submitting a PR to the docs :wink: