[LXD charm] New implementation (Operator Framework)

I’d recommend not using configuration options to pass network configuration. Use Juju spaces instead and query the interface endpoints to get the bound network information. I think I see two ‘provides’ endpoints (https/debug) and one peer endpoint (cluster). Charm users can then bind each endpoint of the LXD charm to a specific space, which supports more complex L3 routed topologies as well as simpler L2-only deployments.

Thanks for the suggestion, I’ve updated the spec accordingly.

Two things I just remembered now that @morphis actively relies on and which should be added:

  • A resource for sideloading a LXD binary
  • A resource for sideloading a full LXD snap

The charm should offer the following endpoints to bind to spaces (covers LXD features to be released this cycle):

  • https (core.https_address)
  • cluster (cluster.https_address)
  • debug (core.debug_address)
  • metrics (core.metrics_address)
  • bgp (core.bgp_address)

By default, all of them sit in the alpha space, in which case no bind happens at all.
The user can then bind them to any space they want; the charm will pick the first address from that space and configure LXD accordingly.

The default port for each endpoint will be used; additional configuration keys may be added later to allow alternative ports.
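A minimal sketch of the endpoint-to-config mapping described above. It assumes the charm has already collected the addresses of each bound endpoint (in an Operator Framework charm these would typically come from `self.model.get_binding(<endpoint>).network`). The helper name `lxd_config_from_bindings` and the shape of its input are illustrative and not taken from the spec.

```python
from typing import Dict, List, Optional

# Endpoint name -> LXD server config key.
ENDPOINT_KEYS: Dict[str, str] = {
    "https": "core.https_address",
    "cluster": "cluster.https_address",
    "debug": "core.debug_address",
    "metrics": "core.metrics_address",
    "bgp": "core.bgp_address",
}


def lxd_config_from_bindings(
    bound: Dict[str, Optional[List[str]]]
) -> Dict[str, str]:
    """Return the LXD config keys to set for endpoints bound to a space.

    `bound` maps an endpoint name to the addresses available in its space,
    or to None / an empty list when the endpoint was left on the default.
    (Hypothetical input shape, chosen for illustration.)
    """
    config: Dict[str, str] = {}
    for endpoint, key in ENDPOINT_KEYS.items():
        addresses = bound.get(endpoint)
        if not addresses:
            continue  # unbound endpoint: leave the LXD key untouched
        # First address of the space; no port is appended so that the
        # default port for that endpoint applies.
        config[key] = addresses[0]
    return config
```

For example, `lxd_config_from_bindings({"https": ["10.0.0.5", "10.0.0.6"]})` only sets `core.https_address`, leaving the other endpoints unconfigured.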

Changes to those bindings will be applied live to the local LXD config.

There are two exceptions to this:

  • When deploying a cluster, cluster.https_address will get automatically set to the machine’s main global address should no space be provided at deploy time.
  • Again for clusters, the cluster endpoint will be effectively read-only. Attempts to modify it will result in the charm getting blocked and asking the user to undo their action.
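The two cluster exceptions can be sketched as a small decision function. This is only an illustration under assumptions: the function name, its arguments and the wording of the message are hypothetical, and in a real charm the returned reason would presumably be surfaced as a blocked unit status.

```python
from typing import Optional, Tuple


def resolve_cluster_address(
    current: Optional[str], bound: Optional[str], fallback: str
) -> Tuple[str, Optional[str]]:
    """Return (address to use, reason to block or None) for a cluster member.

    current:  address already configured in LXD, None before first setup
    bound:    first address of the space bound to the cluster endpoint
    fallback: the machine's main global address
    """
    if current is None:
        # Deploy time: use the bound space if there is one, else fall back.
        return (bound or fallback, None)
    if bound is not None and bound != current:
        # Effectively read-only afterwards: keep the existing value and
        # ask the user to undo the change.
        return (current, "cluster endpoint binding changed; please revert it")
    return (current, None)
```

The existing address is never overwritten once set, which matches the "read-only" behaviour described for the cluster endpoint.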

@sdeziel @stgraber Overall looks pretty good to me.

The most interesting bit for us will be how we let the new lxd charm interface with other charms. We came up with a first version of that in canonical/charm-lxd-integrator on GitHub (a charm to enable LXD integrations via Juju relations), which we use in Anbox Cloud. Being able to reuse that would be nice. However, if it turns out not to be a great fit because of the wider scope, we can see how we deal with that from the AMS charm perspective.

Hey @morphis,

So it looks like the equivalent for us here is the https endpoint. We were indeed planning (though we still need to add it to the spec) to offer an interface on it that allows a related charm to get added to the trust store.

I don’t think we’d use the same protocol as the one in the integrator charm (sorry!) but we would have something close, basically allowing a relation to our https endpoint to specify:

  • X509 certificate to be added
  • List of projects to restrict it to
  • Whether to auto-remove when the relation is removed

In return, we’d provide:

  • List of URLs to reach the LXD server or cluster
  • Current server certificate
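As a rough illustration of the exchange outlined above, the two sides of the relation could be modelled as follows. All field names (`certificate`, `projects`, `autoremove`, `urls`) are assumptions made for this sketch, since the thread does not fix the actual keys. Values are JSON-encoded because Juju relation data only holds strings.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrustRequest:
    """What the related charm writes (hypothetical field names)."""

    certificate: str  # X509 certificate to be added
    projects: List[str] = field(default_factory=list)  # restrict to these
    autoremove: bool = False  # remove the cert when the relation goes away

    def to_relation_data(self) -> Dict[str, str]:
        return {
            "certificate": self.certificate,
            "projects": json.dumps(self.projects),
            "autoremove": json.dumps(self.autoremove),
        }


@dataclass
class TrustResponse:
    """What the LXD charm writes back (hypothetical field names)."""

    urls: List[str]  # URLs to reach the LXD server or cluster
    certificate: str  # current server certificate

    def to_relation_data(self) -> Dict[str, str]:
        return {
            "urls": json.dumps(self.urls),
            "certificate": self.certificate,
        }
```

A requirer would publish `TrustRequest(...).to_relation_data()` on the relation and read back the keys produced by `TrustResponse`.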

I’m a bit confused by trusted_certs_fp. It looks like it returns all the trusted certificates, but I’m not clear why that’s needed, and it feels to me like information leakage, which we’d rather avoid (as we may segment the certificate list in the future).

On the LXD side, we’d load up the certificate into the trust store and set the name property to juju-relation-<model>-<unit> with :autoremove appended if we want the certificate to be removed when the relation is dropped.
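The naming convention can be sketched with two small helpers. The function names are hypothetical; only the `juju-relation-<model>-<unit>` pattern and the `:autoremove` suffix come from the description above.

```python
AUTOREMOVE_SUFFIX = ":autoremove"
NAME_PREFIX = "juju-relation-"


def trust_store_name(model: str, unit: str, autoremove: bool) -> str:
    """Build the trust store entry name for a related unit."""
    name = f"{NAME_PREFIX}{model}-{unit}"
    return name + AUTOREMOVE_SUFFIX if autoremove else name


def should_autoremove(name: str) -> bool:
    """Tell whether an entry was added by a relation with auto-removal."""
    return name.startswith(NAME_PREFIX) and name.endswith(AUTOREMOVE_SUFFIX)
```

When a relation is dropped, the charm could then list the trust store entries and remove those for which `should_autoremove` is true and whose name matches the departing unit.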

@morphis would that cover your use case for AMS? It wouldn’t be backward compatible with what you have right now, but hopefully you can make your charms support either interface?

Our interface will be lxd-https so we’re at least not re-using the same name :slight_smile:


trusted_certs_fp exists to signal to the requirer that its certificate was added and is maintained by the relation. It doesn’t expose any certificates from the LXD trust store, only those that were processed on the relation. It exists to work around the reactive/event nature of charms and relations, so that the requirer side knows it is fine to proceed with using the given connection. There might be better ways nowadays, but this all goes back to when we started charming LXD, when certain now-common features didn’t exist or bugs prevented us from doing it the proper way.

It’s fine to clean things up and start from scratch; it just makes things messier on our end, as we then have to deal with three different LXD interfaces and endpoints in a single charm. Ideally we would clean things up and drop a lot of old things, but I need to see how we can do that while keeping backward compatibility for our customers so that we don’t break any existing deployment.

Ok, so the fact that we’ll be putting out a response with the URL list and current server certificate should be sufficient in this case as it can similarly be used to know that the request was processed.

I think so. If it’s a single write to the relation data after the client certificate was added, and the URL list is not provided before the client certificate is provided, then that can work. If I recall correctly, the order was not predictable back when we did this: you couldn’t tell which side would write first when the relation was joined. So either the provider doesn’t write anything until the client has written its request, or you have something else to signal that the request was processed successfully (e.g. trusted_certs_fp).

Yeah, we wouldn’t have the LXD charm do anything until the connecting charm has provided the data, so you’d only get the connection details once we’ve gotten that data and have put the cert into the trust store.
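The ordering guarantee discussed here can be sketched as a provider-side handler that stays silent until the request is present. This is a sketch under assumptions: the function name, the `certificate` and `urls` keys, and the `add_to_trust_store` callback are all hypothetical.

```python
import json
from typing import Callable, Dict, List, Optional


def handle_relation_changed(
    requirer_data: Dict[str, str],
    add_to_trust_store: Callable[[str], None],
    urls: List[str],
    server_certificate: str,
) -> Optional[Dict[str, str]]:
    """Return the data to publish, or None if nothing should be written yet."""
    certificate = requirer_data.get("certificate")
    if not certificate:
        # The connecting charm has not written its request: publish nothing,
        # so the presence of a response always means "request processed".
        return None
    # Trust the certificate first, then publish the connection details.
    add_to_trust_store(certificate)
    return {"urls": json.dumps(urls), "certificate": server_certificate}
```

Because the response is only written after the certificate has been trusted, the requirer can treat the appearance of the URL list as the signal that it may connect.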

I had another look at the now documented interfaces. Do we really want to leave it with one certificate + projects pair or do we want to allow multiple pairs per relation? If not, we force the relating charm to have an endpoint per certificate it wants to register which might be really specific to how the actual user uses the software (think of a dashboard which has a cert per registered user where each user has access to different projects). Is there a particular reason it is specified this way?

The common case is going to be another application connecting to LXD and wanting API access. Such an application really should have a single certificate and be associated with its model and unit so the audit trail in LXD makes sense.

If an application is going to need to access multiple projects (or all of them), then having a single credential configured as such would be preferable.

In your example of a dashboard acting as multiple users, our recommendation is not to use TLS-based auth, as it doesn’t actually have the concept of users, and to use Candid-based auth instead, which does (possibly adding RBAC on top).

If that’s not an option, then I’d actually prefer that such an application register as an unrestricted API client with LXD and then use its privileges to add additional restricted certificates to the trust store through the LXD API, making sure to annotate those certificates for its users.

In general the plan for this charm is to not re-implement things that LXD itself already provides through its API. So if an application by design needs to do active trust store management operations, we’d much rather it just asks for a full API cred through the charm and then performs the rest through the LXD API.
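To illustrate the "do it through the LXD API" approach, an unrestricted client could build a request body like the one below for each per-user certificate and send it to the certificates endpoint (`POST /1.0/certificates`). This is a sketch: the field names reflect my understanding of the LXD REST API and should be checked against the API documentation for the LXD version in use, and the helper name is hypothetical.

```python
from typing import Any, Dict, List


def restricted_certificate_body(
    name: str, certificate: str, projects: List[str]
) -> Dict[str, Any]:
    """Build the body for adding a project-restricted client certificate.

    `name` is where the application can annotate which of its users the
    certificate belongs to; `certificate` must be encoded as the LXD API
    expects (not handled here).
    """
    return {
        "type": "client",
        "name": name,
        "certificate": certificate,
        "restricted": True,
        "projects": list(projects),
    }
```

The charm itself would only hand out the single unrestricted credential; everything above stays in the application.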

The dashboard was more of a quick but unrealistic example. I agree that generally going via the API is the go-to approach.

Thanks for clarifying this. Approved from my side :white_check_mark:

Current spec looks good to me too. :white_check_mark:


I think generally this spec looks really good. I have one style comment, which you can feel free to ignore: the snap{_config} prefixes on some of the charm options seem a bit superfluous, since the charm only deploys using the snap! Dropping them would also mean less typing when producing bundles and using the CLI.

I also prefer ‘-’ to ‘_’ and that’s something we’ve tried to standardise on in the OpenStack Charms.

The goal with the snap prefix was to clearly separate what’s a LXD config from what’s coming as a snap config (snap set) while also keeping things nicely sortable. This should allow for possible future expansions of charm config without having to ever deal with conflicts or confusing keys :slight_smile:

For the dash vs underscore part, I don’t personally have an opinion on it, so if we’re trying to standardize on dash, we should go with that then!

Spec marked as approved now.

So as it turns out, this isn’t really possible as there’s no easy way to get the full model+machine config from the charm, so we’ll skip this for now.

As discussed, since Juju makes proxy information available in /etc/juju-proxy.conf, I’ve implemented the core.proxy_ initial setting based on it.
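A minimal sketch of how that could look. It assumes the file contains shell-style `export name=value` lines and that the target LXD keys are `core.proxy_http`, `core.proxy_https` and `core.proxy_ignore_hosts`; both the file format and the function name are assumptions for illustration, not a description of the actual implementation.

```python
from typing import Dict

# Environment variable name (lowercased) -> LXD server config key.
PROXY_KEYS: Dict[str, str] = {
    "http_proxy": "core.proxy_http",
    "https_proxy": "core.proxy_https",
    "no_proxy": "core.proxy_ignore_hosts",
}


def proxy_settings_from_juju(text: str) -> Dict[str, str]:
    """Translate the content of the Juju proxy file into LXD settings."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("export "):
            line = line[len("export "):].strip()
        name, sep, value = line.partition("=")
        key = PROXY_KEYS.get(name.strip().lower())
        value = value.strip().strip("\"'")  # drop optional shell quoting
        if sep and key and value:
            settings[key] = value
    return settings
```

The function takes the file content rather than a path, so the caller decides how to read `/etc/juju-proxy.conf` and what to do when it is absent.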

I watched the video of all this, which was, to say the least, IMPRESSIVE!

https://www.youtube.com/watch?v=ix5XMDDkHLA

I’m trying to find out who implemented the Grafana dashboard side of the charm, since I’m developing a charm that does the integration with Grafana automatically.

Truly amazing work on this charm. I will for sure start using this down the road.