Taking boot.autostart|stop.priority into account when stopping and starting containers

vosdev · September 25, 2020, 1:40pm

Hey hey,

running lxd 4.6 latest/stable snap

I am playing around with my 3 node LXD cluster and have given some of my containers a boot.autostart.priority and boot.stop.priority together with a boot.autostart.delay

Unfortunately these options are only taken into account when stopping and starting the LXD daemon.

If I stop or start multiple containers with --all, these priorities are ignored, and my applications are not happy that they are up and running before my databases are.

If I want this functionality without having to restart the LXD daemon, do I have to talk to the API myself to obtain all the priorities and delays, and then start the containers manually?

The priorities come from a profile, like such:

root @ node1 # lxc profile ls | grep prio
| prio.critical     | 1       |
| prio.high         | 3       |
| prio.low          | 5       |
| prio.medium       | 1       |
root @ node1 # lxc profile show prio.low
config:
  boot.autostart.delay: "15"
  boot.autostart.priority: "100"
  boot.stop.priority: "-100"

It’s all in yaml <3 so I could probably write a python script that parses the used_by of the 4 profiles and starts the containers in the right order. But fitting this into an lxc alias would be really exotic! Time to look at the possibilities of aliases.
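A minimal sketch of such a script, under a few assumptions: instead of walking the used_by of each profile, it reads the expanded_config that `lxc list --format json` reports per instance (which already includes keys inherited from profiles). It assumes the server starts the highest boot.autostart.priority first, treats an unset priority or delay as 0, and breaks ties by name; the helper names start_order and start_all are made up for illustration.

```python
import json
import subprocess
import time


def start_order(instances):
    """Return (name, delay_seconds) pairs, highest boot.autostart.priority first."""
    def cfg(inst):
        return inst.get("expanded_config") or {}

    def key(inst):
        # Assumption: unset priority counts as 0, ties are broken by name.
        return (-int(cfg(inst).get("boot.autostart.priority", "0")), inst["name"])

    return [
        (inst["name"], int(cfg(inst).get("boot.autostart.delay", "0")))
        for inst in sorted(instances, key=key)
    ]


def start_all():
    """Start every stopped instance in priority order, honouring the delays."""
    raw = subprocess.run(
        ["lxc", "list", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Only consider instances that are currently stopped.
    stopped = [i for i in json.loads(raw) if i.get("status") == "Stopped"]
    for name, delay in start_order(stopped):
        subprocess.run(["lxc", "start", name], check=True)
        time.sleep(delay)  # wait before starting the next instance
```

Calling start_all() would then stand in for `lxc start --all`. A stop variant could sort on boot.stop.priority in the same way.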

What do you think about an option that keeps the priorities in mind when using --all?

stgraber · September 25, 2020, 1:51pm

Ah yeah, this is a bit awkward as --all is a client-side only thing which sends one query per instance to the server.

To do what you want we’d need to do one of two things:

1. Implement an API way to start a batch of instances, with an option on that to respect/bypass the priorities and delays.
2. Have the client itself pull that configuration for all instances it’s considering and then apply the same logic the server would during startup.

I’m generally not fond of adding that much logic to the client as the client can be on a very different version than the server and so the logic may end up being different.

Server-side batch state changes on instances would be interesting, I’m just not sure how to fit it in the API in a clean way yet.

vosdev · September 25, 2020, 2:08pm

Thanks for the incredibly quick and complete response!

I am not familiar with the codebase, so I am just saying what comes to mind. There is a procedure, called when the daemon stops or starts, that stops or starts the containers in order while keeping the priorities in mind. Is that a function you could reuse?

stgraber · September 25, 2020, 2:14pm

Yeah, we could re-use that logic and share it with the client. The problem is that the client can talk with any LXD version, so we may end up having logic in the client which differs from that on the server.

That’s why I’d prefer it be done server side with a new API to do bulk operations.
The logic of such an API is pretty trivial, figuring out its URL is where I’m getting stuck.

For instances, the state API is at /1.0/instances/<NAME>/state
Ideally, you’d want something like /1.0/instances/state which affects all instances, but that would conflict with an instance named state, so this URL isn’t available to us.

One option may be to implement a PUT operation against /1.0/instances instead, taking a struct similar to what /1.0/instances/<NAME>/state does.
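To make that idea concrete, here is a rough sketch of what a client might send. It is purely illustrative: the thread does not settle the request shape, so the body simply mirrors the action, timeout and force fields that the per-instance state endpoint accepts, and the helper name bulk_state_request is hypothetical.

```python
import json

# Actions accepted by the existing per-instance state endpoint.
ACTIONS = {"start", "stop", "restart", "freeze", "unfreeze"}


def bulk_state_request(action, timeout=30, force=False):
    """Build (method, path, body) for a hypothetical bulk state change."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    # Assumption: same fields as PUT /1.0/instances/<NAME>/state.
    body = {"action": action, "timeout": timeout, "force": force}
    return "PUT", "/1.0/instances", json.dumps(body)
```

With a single request like this, the server could apply its own ordering logic, so client and server versions would not need to agree on it.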

vosdev · September 25, 2020, 2:51pm

Bulk stop/start outside of --all would also allow for a filter option on start/stop/pause/resume like we have on lxc list

stgraber · September 25, 2020, 3:13pm

Yeah, probably not done initially but now that we have server side collection filtering (which we still need lxc list to use…), that API could support it and allow for those filters to be used with start/stop/restart/pause server-side.