Lxc exec randomly swallows some output

I recently started using lxd 5.0.1 (the default on Ubuntu 22.04) and I noticed that sometimes lxc exec swallows some output. For instance, I run lxc exec somecontainer -- rm -rv /some/dir, but I don’t see any output, even though the command did execute successfully. This is especially annoying when you run vim via exec, as sometimes the screen doesn’t get reset properly after exiting, then I have to issue reset to get my terminal “normal” again. I have never experienced this issue on lxd 3.x on Ubuntu 18.04. Any idea what’s causing this?

Do you see this on lxd 5.0.2 released last week?

I don’t have this update yet. snap refresh --list says “All snaps up to date.” I am on the branch “5.0/stable/ubuntu-22.04”.

Try refreshing to 5.0/candidate channel

Is it safe to switch? Apparently, I would lose the “ubuntu-22.04” branch when going to the candidate channel. I don’t know what the purpose of that branch is.

I’m not sure, I’ve not come across the 5.0/stable/ubuntu-22.04 channel.

@stgraber do you know what this snap channel 5.0/stable/ubuntu-22.04 is? And is it safe/equivalent to 5.0/stable?

It’s the same, yes.

The /ubuntu-XYZ is to comply with Ubuntu’s policy on bundled snaps and is meant as a way to push an emergency snap update to a specific set of Ubuntu users.

1 Like

I have upgraded to version 5.0.2, but the problem still persists.

$ lxc exec testcontainer -- mktemp -d
/tmp/tmp.4NdKcXZ53H
$ lxc exec testcontainer -- rmdir -v /tmp/tmp.4NdKcXZ53H
$

I just did this three times in a row and the third time it didn’t print “rmdir: removing directory, '/tmp/tmp.4NdKcXZ53H”

Edit
One more thing I just noticed: partial output is also possible. I just ran ls -l /tmp to verify that the directory was actually removed and it just printed “total 0” and then it cuts off before the newline even, whereas it should have printed total 0, newline, and then a bunch of lines with other files.

What is the image running in the instance?

I’ve not been able to reproduce this, can you reproduce while at the same time running this in another window and then provide the output:

lxc monitor --type=logging --pretty

It’s a standard Ubuntu image started with lxc launch ubuntu:22.04 testcontainer. I captured a run with missing output as you requested:

DEBUG  [2023-01-24T19:03:07+01:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=redacted
DEBUG  [2023-01-24T19:03:07+01:00] Event listener server handler started         id=e64c290f-cb64-4876-aab7-9476be082c44 local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2023-01-24T19:03:07+01:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0/events username=redacted
DEBUG  [2023-01-24T19:03:07+01:00] Handling API request                          ip=@ method=POST protocol=unix url=/1.0/instances/testcontainer/exec username=redacted
DEBUG  [2023-01-24T19:03:07+01:00] New operation                                 class=websocket description="Executing command" operation=93bb9536-517c-4148-80c1-0bb177dfc866 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Started operation                             class=websocket description="Executing command" operation=93bb9536-517c-4148-80c1-0bb177dfc866 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Waiting for exec websockets to connect
DEBUG  [2023-01-24T19:03:07+01:00] Handling API request                          ip=@ method=GET protocol=unix url="/1.0/operations/93bb9536-517c-4148-80c1-0bb177dfc866/websocket?secret=d5e4ad53680b3b48876d5702b64230037d6591229782373db33ca3f91f833608" username=redacted
DEBUG  [2023-01-24T19:03:07+01:00] Connecting to operation                       class=websocket description="Executing command" operation=93bb9536-517c-4148-80c1-0bb177dfc866 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Connected to operation                        class=websocket description="Executing command" operation=93bb9536-517c-4148-80c1-0bb177dfc866 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Handling API request                          ip=@ method=GET protocol=unix url="/1.0/operations/93bb9536-517c-4148-80c1-0bb177dfc866/websocket?secret=53de15b6ffad68d31224222dc6b590a4563e93701f4a03a4400b9607dc5830e8" username=redacted
DEBUG  [2023-01-24T19:03:07+01:00] Connecting to operation                       class=websocket description="Executing command" operation=93bb9536-517c-4148-80c1-0bb177dfc866 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Connected to operation                        class=websocket description="Executing command" operation=93bb9536-517c-4148-80c1-0bb177dfc866 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0/operations/93bb9536-517c-4148-80c1-0bb177dfc866 username=redacted
DEBUG  [2023-01-24T19:03:07+01:00] Retrieved PID of executing child process      attachedPid=14481 instance=testcontainer instanceType=container project=default
DEBUG  [2023-01-24T19:03:07+01:00] Exec control handler finished                 PID=14481 instance=testcontainer interactive=true project=default
DEBUG  [2023-01-24T19:03:07+01:00] Instance process started                      PID=14481 instance=testcontainer interactive=true project=default
DEBUG  [2023-01-24T19:03:07+01:00] Exec mirror websocket started                 PID=14481 instance=testcontainer interactive=true number=0 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Exec control handler started                  PID=14481 instance=testcontainer interactive=true project=default
DEBUG  [2023-01-24T19:03:07+01:00] Instance process stopped                      PID=14481 exitStatus=0 instance=testcontainer interactive=true project=default
DEBUG  [2023-01-24T19:03:07+01:00] Sending write barrier
DEBUG  [2023-01-24T19:03:07+01:00] Detected poll(POLLIN | POLLPRI | POLLHUP | POLLRDHUP) event
DEBUG  [2023-01-24T19:03:07+01:00] Detected closed channel: exiting
DEBUG  [2023-01-24T19:03:07+01:00] Success for operation                         class=websocket description="Executing command" operation=93bb9536-517c-4148-80c1-0bb177dfc866 project=default
DEBUG  [2023-01-24T19:03:07+01:00] DefaultWriter got error getting next reader   err="websocket: close 1006 (abnormal closure): unexpected EOF"
DEBUG  [2023-01-24T19:03:07+01:00] Exec mirror websocket finished                PID=14481 instance=testcontainer interactive=true number=0 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Updated metadata for operation                class=websocket description="Executing command" operation=93bb9536-517c-4148-80c1-0bb177dfc866 project=default
DEBUG  [2023-01-24T19:03:07+01:00] Event listener server handler stopped         listener=e64c290f-cb64-4876-aab7-9476be082c44 local=/var/snap/lxd/common/lxd/unix.socket remote=@

Two additional observations: the “abnormal closure” message appears regardless of whether the output is complete or not. (I guess this is normal?) Secondly, the machine on which I initially noticed the problem was a bit slow/laggy. I don’t seem to be able to reproduce it on a faster/smoothly running system.

One more observation: it seems that execs with complete output always print “Detected poll(POLLHUP) event: exiting” whereas execs with missing/partial output print “Detected closed channel: exiting” instead.

I am still experiencing this issue occasionally on various servers. Any updates? Does the log reveal anything interesting?

Apologies. I’ve assigned this to me to take a look at when I get a chance. But we are very busy with roadmap work currently.

This PR should fix the issue and will be in LXD 5.14

1 Like

That’s great news! Thank you very much. I trust this will be backported into LXD 5.0 LTS?

Yes it will.