I want to automate the creation of LXD containers based on Ubuntu images, with custom settings and scripts. Naturally, I tried using the cloud-init settings in LXD profiles, which worked well for ubuntu:18.04.
Unfortunately, with the newer images (ubuntu:18.10, ubuntu-daily:18.10, ubuntu-daily:19.04), cloud-init no longer works (which can be verified, e.g., using these steps).
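One minimal way to check whether cloud-init completed inside a fresh container is to ask it directly (standard lxc and cloud-init commands; the container name test1 is just an example):

```shell
# Launch a container from one of the affected images
lxc launch ubuntu-daily:19.04 test1

# Ask cloud-init for its status inside the container; on the broken
# images this stays at "status: running" indefinitely instead of "done"
lxc exec test1 -- cloud-init status --long
```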
After some investigation, I found that the cloud-init services are blocked by snapd.seeded.service:
$ systemctl list-jobs
JOB UNIT TYPE STATE
122 cloud-config.service start waiting
107 snapd.autoimport.service start waiting
2 multi-user.target start waiting
121 cloud-init.target start waiting
1 graphical.target start waiting
127 cloud-final.service start waiting
86 systemd-update-utmp-runlevel.service start waiting
105 snapd.seeded.service start running
$ less /lib/systemd/system/cloud-config.service
[Unit]
Description=Apply the settings specified in cloud-config
After=network-online.target cloud-config.target
After=snapd.seeded.service
Wants=network-online.target cloud-config.target
[Service]
Type=oneshot
ExecStart=/usr/bin/cloud-init modules --mode=config
RemainAfterExit=yes
TimeoutSec=0
# Output needs to appear in instance console output
StandardOutput=journal+console
[Install]
WantedBy=cloud-init.target
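Assuming the After=snapd.seeded.service ordering really is the blocker, one possible (untested) workaround is a systemd drop-in that resets the ordering dependency; note that dropping it could misbehave if your user-data installs snaps:

```shell
# Inside the container: create a drop-in for cloud-config.service.
# An empty "After=" line resets the ordering list accumulated so far,
# and the next line restores only the non-snapd dependencies.
mkdir -p /etc/systemd/system/cloud-config.service.d
cat > /etc/systemd/system/cloud-config.service.d/override.conf <<'EOF'
[Unit]
After=
After=network-online.target cloud-config.target
EOF
systemctl daemon-reload
```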
Maybe the issue is caused by this bug; I am not sure. I use the latest version of snap (within the container), for which this bug should have been fixed:
$ snap version
snap 2.38+19.04
snapd 2.38+19.04
series 16
ubuntu 19.04
kernel 4.14.98-v7+
Or maybe it is because snap does not work well in unprivileged containers? Is there a workaround? I could manually remove snapd from the container, but I would really like to automate this using profiles.
My first thought is that it is not snap itself that blocks cloud-init, but rather that snapd.seeded.service does not complete, so cloud-init never gets to take over and do its work.
According to this, snapd.seeded is just a flag service that signals when snapd has finished seeding and is ready for use.
From a few threads on Ask Ubuntu, it seems that snapd sometimes does not initialize correctly the first time, and it is necessary either to restart the machine or even to install the hello-world snap so that the core installation finishes. That has never happened to me, but it is not unthinkable.
If the OP doesn't need snapd in his containers, he could just create a new image from a container from which snapd has been uninstalled.
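A rough sketch of that approach (container and alias names are examples):

```shell
# Start from a stock image and strip snapd
lxc launch ubuntu-daily:19.04 base-no-snapd
lxc exec base-no-snapd -- apt-get purge -y snapd

# Stop the container and publish it as a reusable local image
lxc stop base-no-snapd
lxc publish base-no-snapd --alias ubuntu-19.04-no-snapd

# New containers can then be launched from the local image
lxc launch ubuntu-19.04-no-snapd test1
```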
My host is Raspbian Stretch on RPi 3 B+.
I just tried on another machine (Ubuntu 18.04 i386) with ubuntu-daily:19.04, and surprisingly, everything went fine.
$ snap version
snap 2.38+19.04
snapd 2.38+19.04
series 16
ubuntu 19.04
kernel 4.15.0-45-generic
So it could be a platform-specific issue. Maybe the kernel on the RPi is too old and/or lacks some features? Any hints on how to diagnose this?
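As a first diagnostic step, it might help to look at what snapd itself reports inside a stuck container (standard systemd and journal commands; test1 is an example container name):

```shell
# Check whether the seeding service is still running or has failed
lxc exec test1 -- systemctl status snapd.seeded.service

# Inspect snapd's own log for errors, e.g. squashfs mount or AppArmor
# problems, which unprivileged containers on older kernels are prone to
lxc exec test1 -- journalctl -u snapd.service --no-pager
```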
Creating a new image with snapd uninstalled could be a good idea. But would cloud-init work in such an image? In my understanding, it runs only once, when the container is created, and that has already happened for the container from which the image was built.
I have never created images before, so I don't know what part of a container is saved in an image.
cloud-init should somehow remember that it has already run, so that it does not generate, e.g., new SSH keys on every reboot. Would this information be saved in the image?
I just tried creating an image from a container with snapd removed,
but, interestingly, it looks like cloud-init has 2 boot records:
$ lxc exec test-no-snapd -- cloud-init analyze show
...
2 boot records analyzed
Are log files saved in images?
I created another container from the same image, and at least the two containers seem to have different generated SSH keys and MAC addresses. The log file /var/log/cloud-init-output.log starts out identical for both containers (that part is probably inherited from the parent container from which the image was built), but then they diverge. According to these logs, SSH keys and MAC addresses are generated twice. Anyway, it does look like cloud-init works as expected in images created from containers.
Everything is saved; you have to clean up yourself. If you have personal files, unencrypted password files, or credit card numbers in a container, it is best to remove them before turning it into an image for distribution on the internet.
Note that cloud-init is supposed to run once. When you create a new container image from an existing container, you need to clean up the cloud-init files so that cloud-init runs once in the new container as well.
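The clean-up can be done with cloud-init's own subcommand (cloud-init clean is a real subcommand; the --logs flag additionally removes the /var/log/cloud-init*.log files):

```shell
# Inside the container that will be published as an image:
# remove the /var/lib/cloud state and log files, so the next boot
# is treated as a first boot
cloud-init clean --logs
```

After this, stop and publish the container as usual.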
I did not clean up any files, but it seems that cloud-init still did its job. It ran all initialisation scripts (modules) to set up SSH keys, network addresses, etc. It ran my user-data script. And it did not repeat that on the next reboot. It is unclear to me how cloud-init detected that the new container is different from the container from which the image was created, if they are supposed to have the same files.
The service itself is certainly not disabled once it has run; it is easy to see (by looking at syslog) that it runs again when the container is restarted.
Now in /var/log/cloud-init.log:
2019-04-17 12:20:08,859 - util.py[DEBUG]: Read 6 bytes from /var/lib/cloud/data/instance-id
2019-04-17 12:20:08,859 - stages.py[DEBUG]: previous iid found to be test1
2019-04-17 12:20:08,860 - util.py[DEBUG]: Writing to /var/lib/cloud/data/instance-id - wb: [644] 6 bytes
2019-04-17 12:20:08,862 - util.py[DEBUG]: Writing to /run/cloud-init/.instance-id - wb: [644] 6 bytes
2019-04-17 12:20:08,863 - util.py[DEBUG]: Writing to /var/lib/cloud/data/previous-instance-id - wb: [644] 6 bytes
2019-04-17 12:20:08,867 - util.py[DEBUG]: Writing to /var/lib/cloud/instance/obj.pkl - wb: [400] 6526 bytes
2019-04-17 12:20:08,870 - main.py[DEBUG]: [net] init will now be targeting instance id: test1. new=False
So it seems that it simply relies on the host name. That's probably why you can't change it from LXD.
So I'd guess that if you delete or rename your original container and then create a new container from the published image with the same name as the original container, cloud-init will not run.
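This is easy to check by looking at the instance id that cloud-init records (the file paths come from the log above; test1 is an example container name):

```shell
# cloud-init stores the current and previous instance id under /var/lib/cloud
lxc exec test1 -- cat /var/lib/cloud/data/instance-id
lxc exec test1 -- cat /var/lib/cloud/data/previous-instance-id

# If the two differ at boot, cloud-init treats it as a "new instance"
# and re-runs the per-instance modules (SSH keys, user-data, ...)
```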
Indeed, the hostname does not get renamed when renaming the container. Then why/how does it get renamed when the container is created from an image? Presumably /etc/hostname is already present in the image if it was created from an old container.
Looking a bit at cloud-init, it does indeed run at each container start and decides what to do according to some rules. Nothing is very clear to me at the moment, but it seems that the full new-container handling is triggered by detecting a change of the 'iid' (instance id). Whether the iid is just the container name or something else, I am not sure. I have also seen that cloud-init's network handling runs for a new container, but it can also run for certain network configuration changes passed in by the host.
Whatever means the host uses to pass the init files (I have seen references to /run in the log), the handling by cloud-init is nothing magical; it is a bunch of JSON files that are read like any other files.
For anyone interested, the handling is in the Python modules under dist-packages/cloudinit.