LXDRunner: Ephemeral self-hosted runners for GitHub Actions and LXD

Announcing this here as it may be of interest to GitHub/LXD users.

LXDRunner is an experimental server that automatically dispatches self-hosted runners for GitHub Actions using LXD.

For those not familiar, GitHub Actions is GitHub’s own CI/CD service, where jobs run either on GitHub-hosted machines or on self-hosted machines using their runner client. LXDRunner hooks LXD into the self-hosted side, providing a pristine ephemeral worker for every job without intervention.
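To make that concrete, the per-job flow being automated looks roughly like this (an illustrative sketch only: runner-abc123, ORG/REPO and TOKEN are placeholders, and the image is assumed to already contain the unpacked actions-runner client):

```sh
# Launch a throwaway instance; --ephemeral means it is destroyed on stop.
lxc launch ubuntu:22.04 runner-abc123 --ephemeral

# Register against the repo, take exactly one job, then exit.
# TOKEN stands in for a short-lived registration token from the GitHub API.
lxc exec runner-abc123 --env RUNNER_ALLOW_RUNASROOT=1 -- bash -c '
  cd /root/actions-runner &&
  ./config.sh --unattended --url https://github.com/ORG/REPO --token TOKEN &&
  ./run.sh --once'
```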

It’s a bit rough but already usable, and I’m making good use of it running matrix builds in containers and VMs. There are various issues to resolve; check the TODO list if interested.


Nice!

I’ve been looking into this a bit in the past and considered any real work on it to be effectively blocked by https://github.com/actions/runner/issues/510, as without a reliable way to have the runner consume only a single job, we’d run the risk of a second job being partly consumed by the runner, causing random failures.

Isn’t that also a problem with your approach?

Probably, as I’m also using the --once flag and came across that ticket. But I have yet to confirm it, as the majority of my failures have been due to provisioning (shell setup script, error handling, etc.). According to the issue they are working on it, so I am hopeful it will be fixed at some point.

I guess I should come up with a test case to reproduce it, see how it behaves, and see whether it can be worked around. A large queue of jobs with max_workers=1 should do it; something like the sketch below.
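A sketch of how that queue could be generated with the gh CLI (ORG/REPO and build.yml are placeholders, and the workflow has to target the runner’s labels):

```sh
# Kick off enough runs that another job is always waiting
# when the --once runner finishes its first one.
for i in $(seq 1 20); do
  gh workflow run build.yml --repo ORG/REPO
done

# Then watch for jobs that hang or fail because the runner vanished.
gh run list --repo ORG/REPO --limit 20
```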

There are some k8s and Docker projects also using the --once flag, so that needs investigating.

The runner client is also open source, so it might be worth poking around to see if anything could be done from that end.

Did you continue working on this software? I’m thinking about creating a Juju charm for deploying a GitHub runner.

lxdrunner hasn’t moved much in the past few years, but there is garm (https://github.com/cloudbase/garm), a GitHub Actions runners manager that can handle that kind of thing.

I also have my own implementation of a much more limited one that I’ve been using for my package builds, but I’ve not yet had the time to clean it up and push it somewhere; it also needs some stability fixes first.


I’m quite interested in how we would set up that kind of thing for building some large repos, which take forever on the free GitHub-hosted runners and which we would like to use for “snap” building. We have already looked at your “lxd-snap” for inspiration, so we’d be happy to get advice on how to set up our own local runners on LXD/Incus.

So far, we just end up building with “snapcraft --destructive-mode”, which kind of defeats the purpose of snap building, I suppose…

@stgraber, would you be able to let us in on your thoughts about snap building using LXD? snapcraft spawns new containers, which I guess we would somehow need to integrate with GitHub Actions runners, and we don’t have a good picture of how to achieve that. We guess it would involve calling an LXD remote somehow, but we’re not sure how to properly architect it.
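From reading the snapcraft docs, the non-destructive route seems to be giving each runner its own LXD and letting snapcraft drive it; untested on our side, so just a sketch:

```sh
# On the self-hosted runner (or in its provisioning script):
sudo snap install lxd
sudo lxd init --auto             # minimal local LXD for snapcraft to use
sudo usermod -aG lxd "$USER"     # runner user needs LXD access (re-login to apply)

# In the build job, instead of --destructive-mode:
snapcraft --use-lxd              # snapcraft builds inside a fresh container
```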

Hi,

I am one of the authors of GARM, which @stgraber linked to above. I know this is an older post, but it might still be useful to add some info.

That project was initially created to help the Flatcar project integrate their CI workflow into GitHub Actions. They needed a way to build the entire distro and then run integration tests against the resulting images. As you can imagine, the default runners, which have only 2 cores, would take a really long time.

The initial release had only LXD support, but GARM has since evolved to support an array of providers (Incus, LXD, OpenStack, Azure, Amazon EC2, k8s, with more on the way).

For the Flatcar project there are a few bare-metal servers set up with LXD (which will probably migrate to Incus); they spin up large virtual machines and configure them as ephemeral GitHub runners. That means the runners only ever run one job, after which they are automatically removed.

You can create pools of runners that are eagerly spun up, or pools that spin up runners on demand. You can also mix and match pools across multiple infrastructures (some pools on Incus/LXD, others on OpenStack, others on k8s) using the same GARM installation, and target those pools using labels in your GitHub workflows, as in the sketch below.
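As a rough illustration (flag names quoted from memory of the GARM docs, so double-check with garm-cli pool add --help; REPO_ID and the provider name are placeholders), defining such a pool looks something like:

```sh
# Create a pool of ephemeral runners backed by an LXD provider.
garm-cli pool add \
    --repo "$REPO_ID" \
    --provider-name lxd_local \
    --image ubuntu:22.04 \
    --flavor default \
    --min-idle-runners 1 \
    --max-runners 8 \
    --tags ubuntu,lxd-vm \
    --enabled true

# Workflows then pick this pool via its labels:
#   runs-on: [self-hosted, ubuntu, lxd-vm]
```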

If you want to use Incus/LXD and have access to decently sized servers, you should be able to easily create and manage pools of runners.
