Dpkg hangs trying to configure


(Jim Lynch) #1

This is a Debian 8 container running on a Mint 17 64-bit host. I ran apt-get update followed by apt-get upgrade. It stopped doing anything when it got to setting up “at”. I eventually used ^C to get out, rebooted the container, and ran dpkg --configure -a since I wasn’t able to do anything with apt-get. I attempted to purge at, but that hung too; however, it appears to be purged.
From dpkg -l I see:
pF at 3.1.16-1 amd64 …
Now it is stuck at Setting up exim4-daemon-light (4.84.2-2+deb8u4) forever. Even after rebooting.

I would like to install strace and run it against dpkg to see what’s up, but that’s not possible.

I’ve installed Debian about 3 times now as an LXD container and have had similar problems with apt-get install and apt-get upgrade hanging while setting up various packages.

Does anyone have any ideas?
Thanks,
Jim.


#2

I would gladly try to replicate.

Here is what I tried,

$ lxc launch images:debian/jessie debian
Creating debian
Starting debian    
$ lxc exec debian -- bash
root@debian:~# apt update
...
root@debian:~# apt upgrade
...
The following packages will be upgraded:
  libdns-export100 libirs-export91 libisc-export95 libisccfg-export90
  sensible-utils
...
root@debian:~# exit

Then,

$ lxc image show images:debian/jessie
auto_update: false
properties:
  architecture: amd64
  description: Debian jessie amd64 (20180204_04:26)
  os: Debian
  release: jessie
  serial: "20180204_04:26"
public: true

There is a chance that you got a bad image and that this newer one has fixed the issue.
Therefore, if you still have that bad Debian container, run

$ lxc config show debian
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Debian jessie amd64 (20180204_04:26)
  image.os: Debian
  image.release: jessie
  image.serial: "20180204_04:26"
...

which tells you which image you actually have been using.
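
To compare the two serials without reading the full output, a small helper can pull the value out. This is only a minimal sketch: the function name extract_serial is hypothetical, and it assumes the serial sits on a line containing "serial:" followed by a quoted value, as in both listings above.

```shell
# extract_serial (hypothetical helper): read "lxc image show" or
# "lxc config show" output on stdin and print the first quoted serial.
extract_serial() {
    # Split on double quotes; the serial is the second field of the
    # first line that contains "serial:" (also matches "image.serial:").
    awk -F'"' '/serial:/ { print $2; exit }'
}
```

It could be used as `lxc image show images:debian/jessie | extract_serial` for the remote image and `lxc config show debian | extract_serial` for the container; differing values mean the container was built from an older image.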


(Jim Lynch) #3

I reinstalled and the upgrade worked fine. However, when I went to install the app that I’m trying to run in the container, it hung partway through:

Setting up vim (2:7.4.488-7+deb8u3) ...
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/vim (vim) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/vimdiff (vimdiff) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/rvim (rvim) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/rview (rview) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/vi (vi) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/view (view) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/ex (ex) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/editor (editor) in auto mode
Setting up haveged (1.9.1-1) ...

It’s been setting up haveged for over an hour, so I think I can consider it hung. This is the 4th time an install or upgrade has hung. I have no idea how to recover, and I can’t Control-C out of it.
I brought up another console and see:

root@debian:~# ps -efl
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root         1     0  0  80   0 -  7054 hrtime 11:33 ?        00:22:07 /sbin/init
4 S root        23     0  0  80   0 -  5882 wait   11:33 ?        00:00:00 bash
4 S root        46     1  0  80   0 -  8242 ep_pol 11:33 ?        00:00:09 /lib/systemd/systemd-journald
1 S root        95     1  0  80   0 -  6382 poll_s 11:34 ?        00:00:00 dhclient -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0
4 S root       117     1  0  80   0 -  3560 wait_w 11:34 console  00:00:00 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 linux
0 S root      4532    23  0  80   0 -  1085 wait   11:38 ?        00:00:00 /bin/sh ./install.sh
0 S root      5109  4532  0  80   0 -  1085 wait   11:40 ?        00:00:00 /bin/sh resources/fusionpbx.sh
4 S root      5116  5109  0  80   0 - 17575 poll_s 11:40 ?        00:16:07 apt-get install -y --force-yes vim git dbus haveged ssl-cert
0 S root      5238  5116  0  80   0 -  4927 wait   11:41 pts/0    00:00:00 /usr/bin/dpkg --status-fd 18 --configure libgpm2:amd64 libcap-ng0:amd64 libdbus-1-3:amd64 libhavege1:amd64 vim-common:amd64 dbus:amd64 ssl-cert:al
0 S root      5408  5238  0  80   0 -  1085 wait   11:42 pts/0    00:00:00 /bin/sh /var/lib/dpkg/info/haveged.postinst configure 
0 S root      5438  5408  0  80   0 -  1085 wait   11:42 pts/0    00:00:00 /bin/sh /usr/sbin/invoke-rc.d haveged start
0 S root      5468  5438  0  80   0 -  6044 poll_s 11:42 pts/0    00:00:00 systemctl start haveged.service
0 S root      5469  5468  0  80   0 -  3295 poll_s 11:42 pts/0    00:00:00 /bin/systemd-tty-ask-password-agent --watch
4 S root      5470     0  0  80   0 -  5882 wait   14:38 ?        00:00:00 bash
0 R root      5474  5470  0  80   0 -  5191 -      14:39 ?        00:00:00 ps -efl

I then rebooted the system, ran apt update, and received the message:

E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem. 

And, as in all the other attempts, dpkg hangs forever:

root@debian:~# dpkg --configure -a
Setting up haveged (1.9.1-1) ...

I’ve never been able to recover from this.

root@debian:~# lxc config show debian
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Debian jessie amd64 (20180206_03:08)
  image.os: Debian
  image.release: jessie
  image.serial: "20180206_03:08"
...

(Stéphane Graber) #4
0 S root      5468  5438  0  80   0 -  6044 poll_s 11:42 pts/0    00:00:00 systemctl start haveged.service

So it looks like systemd is stuck trying to start this service.

systemctl status haveged.service

May be useful to see what’s going on. The last few lines of the journal may also help:

journalctl -n 30
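
The checks above can be gathered into one function to run from a second shell inside the container while dpkg is hung. This is a sketch under assumptions: the name diagnose_unit is hypothetical, the default unit is the one from the ps output, and the list-jobs call is an extra step not mentioned above that shows which start jobs systemd still has queued.

```shell
# diagnose_unit (hypothetical helper): collect the state of a systemd
# unit whose start job appears to be blocking dpkg.
diagnose_unit() {
    local unit="${1:-haveged.service}"
    # Jobs systemd currently has queued or running.
    systemctl list-jobs --no-pager
    # Current state of the unit, with its most recent log lines.
    systemctl status "$unit" --no-pager
    # Last 30 journal entries restricted to this unit.
    journalctl -u "$unit" -n 30 --no-pager
}
```

Calling `diagnose_unit haveged.service` prints all three reports in sequence. Note that systemctl status exits non-zero for a unit that is not active, which is expected here.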

(Jim Lynch) #5

OK, the journalctl command told me things weren’t right. When I looked for a solution, I found this command:

lxc config set guest security.privileged true

I ran it on the host. This is Debian running in an LXD container. That fixed the problem.

Thanks,
Jim
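
The workaround described above, followed by the recovery step for a container whose dpkg run was interrupted, might be sketched as follows. The function name is hypothetical, and the restart is an assumption: the privileged setting is expected to take effect only when the container is next started.

```shell
# apply_privileged_workaround (hypothetical helper): run on the LXD
# host, not inside the container. Takes the container name as $1.
apply_privileged_workaround() {
    local container="$1"
    # Switch the container to privileged mode, as in the post above.
    lxc config set "$container" security.privileged true || return 1
    # Assumption: a restart is needed for the new setting to apply.
    lxc restart "$container" || return 1
    # Finish configuring any packages left half-installed by the hang.
    lxc exec "$container" -- dpkg --configure -a
}
```

For the container in this thread that would be `apply_privileged_workaround guest`. A privileged container maps its root user to root on the host, so this trades isolation for compatibility with the systemd units that were failing.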


(Stéphane Graber) #6

Ok, so it’s systemd not being happy with those units in an unprivileged container.

It’d still be useful to know what the error was as @brauner and others have been making some progress fixing such issues in systemd before.


(Jim Lynch) #7

The error I think I searched on to find the solution was “Looping too fast. Throttling execution a little”. There were other errors in the output, but that one was being emitted fairly frequently. I can’t imagine I’m the only one experiencing this. I destroyed and recreated 4 containers before I found the problem, and each time it quit in a different place. The second time, all I did was try to install openssh-server and it hung; I didn’t even get to the step where I was installing my main app.

Jim.
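
For anyone checking whether a hung container shows the same symptom, counting that message in the journal is a quick test. This is a minimal sketch; count_throttle_messages is a hypothetical name, and it reads journal text from stdin so it can be fed from journalctl or from a saved log.

```shell
# count_throttle_messages (hypothetical helper): count journal lines
# containing the throttling message quoted above.
count_throttle_messages() {
    # -F matches the text literally; -c prints the number of matching lines.
    grep -F -c 'Looping too fast. Throttling execution a little'
}
```

Inside the container it could be run as `journalctl -b --no-pager | count_throttle_messages`.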