Building custom LXD Snap

I’ve been running into some issues with an LXD that I built from scratch. Most of them I’ve managed to solve, but I’m now running into issues with Docker that I can’t seem to fix.

The main reason I’m not running LXD from the official Snap is that my Ceph version is based on 15 (Octopus), not 14 (Nautilus). That is why I built from the source tarball when I started this cluster.

I figured I could simply add the official Octopus Ceph repo to the Snap and that would be it, but sadly it does not appear to be working.

I started with a minimal change on the 4.0-candidate branch.

parts:
  update-ceph:
    plugin: nil
    override-pull: |
      wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
      echo "deb https://download.ceph.com/debian-octopus/ bionic main" | sudo tee /etc/apt/sources.list.d/ceph.list
      sudo apt update

The first issue I ran into was:

make[2]: g++: Command not found
GNUmakefile:67: recipe for target ‘AParser.o’ failed
make[2]: *** [AParser.o] Error 127
make[2]: Leaving directory ‘/root/parts/edk2/build/BaseTools/Source/C/VfrCompile’
GNUmakefile:74: recipe for target ‘VfrCompile’ failed
make[1]: *** [VfrCompile] Error 2
make[1]: Leaving directory ‘/root/parts/edk2/build/BaseTools/Source/C’
GNUmakefile:19: recipe for target ‘Source/C’ failed
make: *** [Source/C] Error 2
make: Leaving directory ‘/root/parts/edk2/build/BaseTools’
Failed to run ‘override-build’: Exit code was 2.

To fix this I added build-essential to the build-packages for edk2.

The next error was:

Priming update-ceph
Priming ceph
[Errno 21] Is a directory: ‘/root/stage/lib/python2.7’
We would appreciate it if you anonymously reported this issue.
No other data than the traceback and the version of snapcraft in use will be sent.
Would you like to send this error data? (Yes/No/Always/View) [no]:

I managed to get past this (I won’t call it solved) by changing the prime entry to /lib/python2.7*.

At this point I got a few warnings, and I’m unsure whether they are a problem.

Priming ceph
The ‘ceph’ part is missing libraries that are not included in the snap or base. They can be satisfied by adding the following entries to the existing stage-packages for this part:
  • librados2
  • librbd1
  • librdmacm1
Priming criu
Priming libco
Priming raft
The ‘raft’ part is missing libraries that are not included in the snap or base. They can be satisfied by adding the following entries to the existing stage-packages for this part:
  • libuv1
Priming sqlite
Priming dqlite
Priming libseccomp
Priming qemu
The ‘qemu’ part is missing libraries that are not included in the snap or base. They can be satisfied by adding the following entries to the existing stage-packages for this part:
  • libaio1
  • librados2
  • librbd1
  • librdmacm1

It did build at this point; however, running it gives me some really weird errors.

root@dream:~# lxd.lxc --version
cat: /proc/self/attr/current: Permission denied
/snap/lxd/x1/commands/lxc: 6: exec: aa-exec: Permission denied

Has anybody managed to get LXD working with a custom Ceph version, and do you have any tips on how to approach this?

When installing a locally built snap, you don’t get any of the store auto-connect assertions. This means that, at a minimum, you need to run:

  • snap connect lxd:lxd-support
  • snap alias lxd.lxc lxc

To get things going.
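Put together, a first install of a locally built snap might look like the sketch below. The snap file name is hypothetical, and the `run` helper and `DRY_RUN` variable exist only so the commands can be previewed without touching the system. `--dangerous` is the snapd flag for installing an unsigned local snap.

```shell
#!/bin/sh
# Sketch: install a locally built LXD snap and set up what the store
# would normally auto-connect. The snap file name is hypothetical.

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_local_lxd() {
    snap_file="$1"
    # A local build carries no store assertions, hence --dangerous.
    run snap install --dangerous "$snap_file"
    # The interface connection and alias listed above.
    run snap connect lxd:lxd-support
    run snap alias lxd.lxc lxc
}
```

With `DRY_RUN=1` set, `setup_local_lxd ./lxd-custom.snap` only prints the three commands. Afterwards, `snap connections lxd` should show whether lxd-support is actually connected.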

As for the build, it’s odd that you had to install anything extra. The snapcraft.yaml available from upstream is confirmed to work just fine when using snapcraft --use-lxd to build the snap.

I used the --use-lxd flag now and the build-essential requirement went away. The other changes to snapcraft.yaml, mainly the /lib/python2.7 to /lib/python2.7* fix, were still required.

I got a little further along now, but the warnings/errors from the build process about missing libraries still seem to apply.

root@ceph-fixes:# lxc storage create ceph ceph
Error: Failed to run: rbd --version: rbd: error while loading shared libraries: librbd.so.1: cannot open shared object file: No such file or directory

Any clue why it’s not including the libs properly?

Maybe it’s in a different location somehow?

Current logic would only work if they are in:

  • /lib/*/librbd.so.*
  • /usr/lib/*/librbd.so.*

If they are directly in /lib or /usr/lib or are in something like libexec or some other directory, then this would fail.
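To make the difference concrete, here is a rough sketch that tests a path against the two patterns above. The function name is made up for illustration, and a shell `case` is only an approximation of whatever globbing snapcraft does internally.

```shell
#!/bin/sh
# Sketch: succeed if a library path would be caught by the two patterns
# listed above, fail otherwise. librbd is just the example library.
# Note: in a shell `case`, `*` also matches `/`, so this is looser than a
# real filesystem glob for more deeply nested paths.
matches_prime_glob() {
    case "$1" in
        /lib/*/librbd.so.*|/usr/lib/*/librbd.so.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A path such as /usr/lib/x86_64-linux-gnu/librbd.so.1 passes, while /usr/lib/librbd.so.1 or anything under /usr/libexec does not, which is the failure mode described above.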

You were right: for some reason the libs are in different folders. Changing this got me past the weird warnings and the build process completes; however, there appears to be a new issue.

lxc storage create ceph-delete-me ceph
Error: Failed to run: ceph --name client.admin --cluster ceph osd pool create ceph-delete-me 32: Traceback (most recent call last):
  File "/snap/lxd/x5/bin/ceph", line 140, in <module>
    import rados
ModuleNotFoundError: No module named 'rados'

I’m assuming this is related to the snapcraft crash I got on the python2.7 prime entry, so I tried to include other Python versions as well:

      - lib/python2.7/*
      - lib/python3/*
      - lib/python3.6/*

But none of them fixes the issue. Any last-minute ideas before I throw my laptop out and start my life as a potato farmer? :wink:

:slight_smile:

Usually what I do is look at the stage directory in the snapcraft-lxd container.
This shows all the bits that were built and where they are. Anything you see there that you need, you’ll have to add to prime.
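A small helper for that inspection could look like the sketch below. The function name is hypothetical. It lists staged files matching some name patterns, relative to the stage directory, so the results can be compared against the prime entries.

```shell
#!/bin/sh
# Sketch: list staged files whose names match the given patterns.
# Usage: find_staged STAGE_DIR PATTERN...
find_staged() {
    stage_dir="$1"
    shift
    for pattern in "$@"; do
        # Print paths relative to the stage directory, sorted per pattern.
        (cd "$stage_dir" && find . -name "$pattern" | sed 's|^\./||' | sort)
    done
}
```

Inside the build container, something like `find_staged /root/stage 'rados*' 'librbd.so*'` should show where the `rados` Python module and the RBD library ended up. Any directory that appears there but is not covered by prime is a candidate to add.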

Indeed, with newer Ceph you should be able to shed all the python2 bits, as I believe they finally transitioned to python3. An extra benefit is that base python3 is included in the core18 snap, so you won’t have to include the interpreter itself in the snap, which saves quite a bit of space because you only ship the needed modules and libraries.

Sadly I can’t seem to solve this.

If there is any other LXD user out there who is up for the challenge, I will put up a 200 USD reward for a working snap of LXD using the latest Ceph.