LXD API: cannot forcefully stop a container if an SFTP session is open

Hi there,

I’m seeing some unexpected API behaviour when I try to forcefully stop a container instance while an SFTP session for that instance is open: the UpdateInstanceState operation never returns from Wait(). If I close the SFTP session before stopping the container, it works.

On the one hand, that seems reasonable (?) for a non-forceful stop, given that there is an open connection we might not want to touch. However, for a forceful stop I would expect the container to shut down regardless. I’ve created a minimal example to reproduce the issue, see below. LXD comes from “latest/stable” (5.12).

Is this expected behaviour?

package main

import (
	"fmt"

	lxd "github.com/lxc/lxd/client"
	"github.com/lxc/lxd/shared/api"
)

const LxdSock = "/var/snap/lxd/common/lxd/unix.socket"

func main() {
	if srv, err := lxd.ConnectLXDUnix(LxdSock, nil); err == nil {
		imageServer, err := lxd.ConnectSimpleStreams("https://cloud-images.ubuntu.com/releases/", nil)
		if err != nil {
			panic(err)
		}

		alias, _, err := imageServer.GetImageAlias("22.04/amd64")
		if err != nil {
			panic(err)
		}

		image, _, err := imageServer.GetImage(alias.Target)
		if err != nil {
			panic(err)
		}

		req := api.InstancesPost{
			Name: "sftp-issue",
			Type: api.InstanceType("container"),
			Source: api.InstanceSource{
				Type:        "image",
				Fingerprint: image.Fingerprint,
				Project:     "default",
			},
		}
		op, err := srv.CreateInstanceFromImage(imageServer, *image, req)
		if err != nil {
			panic(err)
		}
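		// Wait for the instance to be created and fail loudly if it was not.
		if err := op.Wait(); err != nil {
			panic(err)
		}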

		start := api.InstanceStatePut{
			Action:  "start",
			Timeout: 5,
			Force:   true,
		}

		upd, err := srv.UpdateInstanceState("sftp-issue", start, "")
		if err != nil {
			panic(err)
		}
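		// Wait for the instance to start before opening the SFTP session.
		if err := upd.Wait(); err != nil {
			panic(err)
		}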

		sftp, err := srv.GetInstanceFileSFTP("sftp-issue")
		if err != nil {
			panic(err)
		}
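		// Any SFTP call will do; it just proves the session is open and usable.
		wd, err := sftp.Getwd()
		if err != nil {
			panic(err)
		}
		fmt.Println(wd)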

		stop := api.InstanceStatePut{
			Action:  "stop",
			Timeout: 5,
			Force:   true,
		}

		upd, err = srv.UpdateInstanceState("sftp-issue", stop, "")
		if err != nil {
			panic(err)
		}

		// sftp.Close() <-- uncommenting this call makes Wait() below return
		err = upd.Wait()
		if err != nil {
			panic(err)
		}
		fmt.Print("We'll never be here unless we call sftp.Close() before Wait()")
	}
}

Out of curiosity I had a brief look. It seems that, on stop, the stopForkfile() method cannot acquire the lock, which is held by the SFTP server here: https://github.com/lxc/lxd/blob/master/lxd/instance/drivers/driver_lxc.go#L6634

LXD then kills the operation after 5 minutes, which moves the instance state to STOPPED, but the instance does not seem to stop properly in this case.

I did not dig any further; as a workaround, I make sure the SFTP connection is closed right after every use. I would appreciate any guidance on the issue, thanks!

This seems like a genuine bug. Would you be so kind as to report it at https://github.com/lxc/lxd/issues, please?

Thanks, @sdeziel, posted it here.
