Init containers with predefined requirements

Hi,

[LONG FOR CLARITY]

We’ve been using KVM with libvirt to provide virtualization in our dev, tst and stg environments.
While this has proven very stable and has served us for a long time, we would like to improve the speed at which we can
scaffold these environments up and down during our development and testing cycles.

As a start, I am trying to automate the install and config of LXD containers on localhost for testing purposes.

A complete system is far too complicated for any system admin to reconfigure manually in a short space of time.
So Ansible is being used to set up, configure and manage the complexities of the whole system, so that we can reproduce a consistent
and predictable system in a very short space of time.

For clarity I will post all the steps taken below.
While most of the requirements specified come out as we expect, the following two do not:

  1. containers not using the ipv4 address assigned to them
    The ip addresses must be consistent so that:
    a) ansible knows where to find each host
    b) services on different hosts know where to find each other (e.g. clustered services such as rabbit, redis, mariadb)

  2. cloud-init directives not being executed on first launch
    We need to have a user account that ansible can use over ssh

Both of these challenges may just be down to my lack of understanding, but they are still blocking me after many hours of research and testing.
So I am hoping somebody could enlighten me :slight_smile:

Environment

$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04 LTS"

$ uname -a
Linux sean-wks 4.15.0-24-generic #26-Ubuntu SMP Wed Jun 13 08:44:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ ansible --version
ansible 2.5.1
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/sean/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/dist-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.15rc1 (default, Apr 15 2018, 21:51:34) [GCC 7.3.0]

$ lxd --version
3.0.1

$ apt-cache policy lxd
lxd:
  Installed: 3.0.1-0ubuntu1~18.04.1
  Candidate: 3.0.1-0ubuntu1~18.04.1
  Version table:
 *** 3.0.1-0ubuntu1~18.04.1 500
        500 http://za.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     3.0.0-0ubuntu4 500
        500 http://za.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

At the end of the playbook everything passes, so it does not look like an Ansible problem.

PLAY RECAP *********************************************************************************************
localhost                  : ok=12   changed=7    unreachable=0    failed=0   

And we have 10 containers created in under a minute

$ lxc list --fast
+------+---------+--------------+----------------------+----------+------------+
| NAME |  STATE  | ARCHITECTURE |      CREATED AT      | PROFILES |    TYPE    |
+------+---------+--------------+----------------------+----------+------------+
| lxc0 | STOPPED | x86_64       | 2018/07/21 08:47 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc1 | STOPPED | x86_64       | 2018/07/21 08:47 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc2 | STOPPED | x86_64       | 2018/07/21 08:48 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc3 | STOPPED | x86_64       | 2018/07/21 08:48 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc4 | STOPPED | x86_64       | 2018/07/21 08:48 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc5 | STOPPED | x86_64       | 2018/07/21 08:48 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc6 | STOPPED | x86_64       | 2018/07/21 08:48 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc7 | STOPPED | x86_64       | 2018/07/21 08:48 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc8 | STOPPED | x86_64       | 2018/07/21 08:48 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+
| lxc9 | STOPPED | x86_64       | 2018/07/21 08:48 UTC | default  | PERSISTENT |
+------+---------+--------------+----------------------+----------+------------+

The resulting containers look as follows:

$ lxc config show lxc0
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 16.04 LTS amd64 (release) (20180703)
  image.label: release
  image.os: ubuntu
  image.release: xenial
  image.serial: "20180703"
  image.version: "16.04"
  user.user-data: "#cloud-config\n\n# Update apt database on first boot\n# (ie run
    apt-get update)\n\npackage_update: true\n\n# Upgrade the instance on first boot\n#
    (ie run apt-get upgrade)\n\npackage_upgrade: true\n\n# Add netadmin user to the
    system on first boot \n\nusers:\n- name: netadmin\n    groups: [adm, audio, cdrom,
    dialout, floppy, video, plugdev, dip, netdev, sudo]\n    sudo: ALL=(ALL) NOPASSWD:ALL\n
    \   passwd: <snip>hash password</snip>.\n
    \   ssh_authorized_keys: <snip>pub key</snip>\n"
  volatile.apply_template: create
  volatile.base_image: f2228450779fee27020d6024af587379b8f51062c32a335327f2b028c924bfa1
  volatile.eth0.hwaddr: 00:16:3e:4a:83:10
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":231072,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":231072,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":231072,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":231072,"Nsid":0,"Maprange":65536}]'
devices:
  eth0:
    ipv4.address: 10.0.3.10
    nictype: bridged
    parent: lxdbr0
    security.mac_filtering: "true"
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: ""

We start a container to check if everything is as we expect:
Expect:

  • lxc0 to be 10.0.3.10
  • apt update && apt upgrade to have run
  • netadmin exists

Results:

  • FAILED
  • FAILED
  • FAILED

$ lxc start lxc0

$ lxc list
+------+---------+-------------------+------+------------+-----------+
| NAME |  STATE  |       IPV4        | IPV6 |    TYPE    | SNAPSHOTS |
+------+---------+-------------------+------+------------+-----------+
| lxc0 | RUNNING | 10.0.3.249 (eth0) |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc1 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc2 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc3 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc4 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc5 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc6 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc7 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc8 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+
| lxc9 | STOPPED |                   |      | PERSISTENT | 0         |
+------+---------+-------------------+------+------------+-----------+

$ lxc exec lxc0 bash
root@lxc0:~#

root@lxc0:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
15: eth0@if16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:4a:83:10 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.3.249/24 brd 10.0.3.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe4a:8310/64 scope link 
       valid_lft forever preferred_lft forever

root@lxc0:~# tail /var/log/apt/history.log 

Start-Date: 2018-07-03  08:31:33
Commandline: apt-get --purge remove --assume-yes ^linux-.* linux-base+
Purge: linux-headers-generic:amd64 (4.4.0.130.136), linux-headers-4.4.0-130:amd64 (4.4.0-130.156), linux-headers-4.4.0-130-generic:amd64 (4.4.0-130.156), linux-virtual:amd64 (4.4.0.130.136), linux-image-4.4.0-130-generic:amd64 (4.4.0-130.156), linux-headers-virtual:amd64 (4.4.0.130.136), linux-image-virtual:amd64 (4.4.0.130.136)
End-Date: 2018-07-03  08:31:36

Start-Date: 2018-07-03  08:31:38
Commandline: apt-get --purge remove --assume-yes ^grub-.*
Purge: os-prober:amd64 (1.70ubuntu3.3), grub-common:amd64 (2.02~beta2-36ubuntu3.18), grub2-common:amd64 (2.02~beta2-36ubuntu3.18), ubuntu-server:amd64 (1.361.1), grub-legacy-ec2:amd64 (18.2-4-g05926e48-0ubuntu1~16.04.2), grub-pc:amd64 (2.02~beta2-36ubuntu3.18), grub-pc-bin:amd64 (2.02~beta2-36ubuntu3.18), grub-gfxpayload-lists:amd64 (0.7)
End-Date: 2018-07-03  08:31:43

root@lxc0:~# id -u netadmin
id: ‘netadmin’: no such user
root@lxc0:~#
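
The checks above can be scripted so they run the same way against every container. The following is a minimal sketch, not part of the original playbook: the function names `container_ipv4` and `check_container` are made up for illustration, and it assumes that `lxc list <name> --format csv -c 4` prints the IPv4 column as `ADDRESS (IFACE)`, as the table output above suggests.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative and not part of LXD.

# Print the first IPv4 address LXD reports for a container.
# Assumes the csv IPv4 column looks like "10.0.3.10 (eth0)".
# Note: `lxc list NAME` may match more than one container; only the
# first line of output is used here.
container_ipv4() {
  lxc list "$1" --format csv -c 4 | head -n 1 | cut -d' ' -f1
}

# Compare the reported address with the expected one, then check
# that the expected user exists inside the container.
check_container() {
  local name="$1" want_ip="$2" user="$3" got_ip
  got_ip="$(container_ipv4 "$name")"
  if [ "$got_ip" != "$want_ip" ]; then
    echo "FAILED: $name has '$got_ip', expected '$want_ip'"
    return 1
  fi
  if ! lxc exec "$name" -- id -u "$user" >/dev/null 2>&1; then
    echo "FAILED: user '$user' missing in $name"
    return 1
  fi
  echo "OK: $name"
}

# Usage (requires a running container):
#   check_container lxc0 10.0.3.10 netadmin
```

Whether apt ran can still be checked by hand in `/var/log/apt/history.log`, as in the transcript above.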

Ansible tasks

# tasks file for ansible-testnet
---

- name: install lxd
  apt: name={{ item }}
  with_items:
    - lxd
  become: yes
  
# at lxd install the user was added to the lxd group
# change only takes effect after reboot or newgrp lxd is run
- name: ensure user is in group lxd
  command: newgrp lxd
  
- name: check if lxd init already done
  stat: path=/var/lib/lxd/storage-pools/default
  register: lxd_present
  
- name: copy preseed script
  copy: src=preseed.sh dest=/tmp/preseed.sh mode=0755
  when: lxd_present.stat.exists == False
  
- name: execute preseed script
  shell: /tmp/preseed.sh
  when: lxd_present.stat.exists == False

- name: list containers
  command: lxc list --format csv
  register: containers
  
- name: init containers
  command: lxc init {{ item.distro }}:{{ item.release }}/{{ item.arch }} {{ item.hostname }}
  when: item.hostname != "" and item.hostname not in containers.stdout
  with_items: "{{ testnet_hosts }}"
  
- name: attach containers to network
  command: lxc network attach lxdbr0 {{ item.hostname }} eth0
  when: item.hostname != "" and item.hostname not in containers.stdout
  with_items: "{{ testnet_hosts }}"
  
- name: configure containers address
  command: lxc config device set {{ item.hostname }} eth0 ipv4.address {{ item.network }}.{{ item.count }}
  when: item.hostname != "" and item.hostname not in containers.stdout
  with_items: "{{ testnet_hosts }}"
  
- name: configure containers hwaddr
  command: lxc config set {{ item.hostname }} volatile.eth0.hwaddr {{ item.mac }}
  when: item.hostname != "" and item.hostname not in containers.stdout
  with_items: "{{ testnet_hosts }}"

### prevent the container from ever changing its MAC address
### or forwarding traffic for any other MAC address (such as nesting)
- name: configure mac filtering on containers
  command: lxc config device set {{ item.hostname }} eth0 security.mac_filtering true
  when: item.hostname != "" and item.hostname not in containers.stdout
  with_items: "{{ testnet_hosts }}"
  
### use cloud-init directives to run commands or scripts at the 
### first boot cycle when launched 
- name: deploy cloud-init template
  template: src=config.yml.j2 dest=/tmp/config.yml mode=0755
  
- name: provision launch
  shell: lxc config set {{ item.hostname }} user.user-data - < /tmp/config.yml
  when: item.hostname != "" and item.hostname not in containers.stdout
  with_items: "{{ testnet_hosts }}"
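
One caveat with the `when` conditions above: `item.hostname not in containers.stdout` is a substring test against the whole `lxc list` output, so a name such as `lxc1` would also match `lxc10`, or any other column that happens to contain that text. A minimal shell sketch of an exact-name test follows. The function name `container_exists` is hypothetical, and it assumes `lxc list --format csv -c n` prints one container name per line.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if a container with exactly this name exists.
# Assumes `lxc list --format csv -c n` prints one name per line.
container_exists() {
  # -F: fixed string, -x: match the whole line, -q: no output
  lxc list --format csv -c n | grep -Fxq -- "$1"
}

# Usage:
#   container_exists lxc1 || echo "lxc1 still needs to be created"
```

With the ten names used in this thread (`lxc0` to `lxc9`) the substring test happens to be safe; the difference only shows up once names share a prefix.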

preseed.sh

#!/bin/bash
cat <<EOF | lxd init --preseed
config:
  core.https_address: '[::]:8443'
  core.trust_password: "12345678"
cluster: null
networks:
- config:
    ipv4.address: 10.0.3.1/24
    ipv4.nat: "true"
    ipv6.address: none
  description: ""
  managed: false
  name: lxdbr0
  type: ""
storage_pools:
- config:
    size: 100GB
  description: ""
  name: default
  driver: btrfs
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      nictype: bridged
      parent: lxdbr0
      type: nic
    root:
      path: /
      pool: default
      type: disk
  name: default
EOF

cloud-init config.yml

#cloud-config

# Update apt database on first boot
# (ie run apt-get update)

package_update: true

# Upgrade the instance on first boot
# (ie run apt-get upgrade)

package_upgrade: true

# Add netadmin user to the system on first boot 

users:
  - name: {{ users_admin[0].username }}
    groups: [adm, audio, cdrom, dialout, floppy, video, plugdev, dip, netdev, sudo]
    sudo: ALL=(ALL) NOPASSWD:ALL
    passwd: {{ users_admin[0].password }}
    ssh_authorized_keys:
      - {{ users_admin[0].sshkey_pub }}

Further investigation revealed either strange behaviour or a gap in my understanding.
However, I do have a method to get what I want, just not as nice as I would like.
It works … until I can learn what I need to do to make this work as per my original post.

  1. Remove all containers and repeat steps

  2. Before starting a container, check that the bridge dnsmasq has the correct settings for the containers

    cat /var/lib/lxd/networks/lxdbr0/dnsmasq.hosts/lxc*

    00:16:3e:1f:df:d7,10.0.3.10,lxc0
    00:16:3e:d6:f9:54,10.0.3.11,lxc1
    00:16:3e:f6:e1:91,10.0.3.12,lxc2
    00:16:3e:64:f6:ac,10.0.3.13,lxc3
    00:16:3e:c3:3c:8f,10.0.3.14,lxc4
    00:16:3e:6d:84:73,10.0.3.15,lxc5
    00:16:3e:43:90:c2,10.0.3.16,lxc6
    00:16:3e:78:c8:09,10.0.3.17,lxc7
    00:16:3e:75:6e:90,10.0.3.18,lxc8
    00:16:3e:d0:ac:db,10.0.3.19,lxc9

  3. Start a container

$ lxc start lxc0

  1. Check if the ip is correct
    It is not.

    $ lxc list lxc0
    +------+---------+-------------------+------+------------+-----------+
    | NAME |  STATE  |       IPV4        | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+-------------------+------+------------+-----------+
    | lxc0 | RUNNING | 10.0.3.249 (eth0) |      | PERSISTENT | 0         |
    +------+---------+-------------------+------+------------+-----------+

  2. Stop and start the container in the hope that the address will take
    No luck

    $ lxc stop lxc0
    $ lxc start lxc0
    $ lxc list lxc0
    +------+---------+-------------------+------+------------+-----------+
    | NAME |  STATE  |       IPV4        | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+-------------------+------+------------+-----------+
    | lxc0 | RUNNING | 10.0.3.249 (eth0) |      | PERSISTENT | 0         |
    +------+---------+-------------------+------+------------+-----------+

  3. Stop the container, issue the config set command, start the container
    and check if the change took effect
    No luck

    $ lxc stop lxc0
    $ lxc config device set lxc0 eth0 ipv4.address 10.0.3.10
    $ lxc start lxc0
    $ lxc list lxc0
    +------+---------+-------------------+------+------------+-----------+
    | NAME |  STATE  |       IPV4        | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+-------------------+------+------------+-----------+
    | lxc0 | RUNNING | 10.0.3.249 (eth0) |      | PERSISTENT | 0         |
    +------+---------+-------------------+------+------------+-----------+

  4. Issue the command while the container is running and check if the address changes
    No luck

    $ lxc list lxc0
    +------+---------+-------------------+------+------------+-----------+
    | NAME |  STATE  |       IPV4        | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+-------------------+------+------------+-----------+
    | lxc0 | RUNNING | 10.0.3.249 (eth0) |      | PERSISTENT | 0         |
    +------+---------+-------------------+------+------------+-----------+

    $ lxc config device set lxc0 eth0 ipv4.address 10.0.3.10
    $ lxc list lxc0
    +------+---------+-------------------+------+------------+-----------+
    | NAME |  STATE  |       IPV4        | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+-------------------+------+------------+-----------+
    | lxc0 | RUNNING | 10.0.3.249 (eth0) |      | PERSISTENT | 0         |
    +------+---------+-------------------+------+------------+-----------+

  5. Set ipv4.dhcp to false on the bridge to check that the container is getting its ip address from the bridge
    I expect to get no ip address
    And the result is as expected

    $ lxc network set lxdbr0 ipv4.dhcp false
    $ lxc stop lxc0
    $ lxc start lxc0
    $ lxc list lxc0
    +------+---------+------+------+------------+-----------+
    | NAME |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+------+------+------------+-----------+
    | lxc0 | RUNNING |      |      | PERSISTENT | 0         |
    +------+---------+------+------+------------+-----------+

  6. Perform similar tests while ipv4.dhcp is false
    No luck

    $ lxc stop lxc0
    $ lxc start lxc0
    $ lxc list lxc0
    +------+---------+------+------+------------+-----------+
    | NAME |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+------+------+------------+-----------+
    | lxc0 | RUNNING |      |      | PERSISTENT | 0         |
    +------+---------+------+------+------------+-----------+
    $ lxc config device set lxc0 eth0 ipv4.address 10.0.3.10
    $ lxc list lxc0
    +------+---------+------+------+------------+-----------+
    | NAME |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+------+------+------------+-----------+
    | lxc0 | RUNNING |      |      | PERSISTENT | 0         |
    +------+---------+------+------+------------+-----------+
    $ lxc stop lxc0
    $ lxc start lxc0
    $ lxc list lxc0
    +------+---------+------+------+------------+-----------+
    | NAME |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+------+------+------------+-----------+
    | lxc0 | RUNNING |      |      | PERSISTENT | 0         |
    +------+---------+------+------+------------+-----------+
    $ lxc stop lxc0
    $ lxc config device set lxc0 eth0 ipv4.address 10.0.3.10
    $ lxc start lxc0
    $ lxc list lxc0
    +------+---------+------+------+------------+-----------+
    | NAME |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+------+------+------------+-----------+
    | lxc0 | RUNNING |      |      | PERSISTENT | 0         |
    +------+---------+------+------+------------+-----------+

  7. Change ipv4.dhcp back to true and retry the earlier steps
    WOW!!!

    $ lxc network set lxdbr0 ipv4.dhcp true
    $ lxc list lxc0
    +------+---------+------+------+------------+-----------+
    | NAME |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+------+------+------------+-----------+
    | lxc0 | RUNNING |      |      | PERSISTENT | 0         |
    +------+---------+------+------+------------+-----------+
    $ lxc stop lxc0
    $ lxc start lxc0
    $ lxc list lxc0
    +------+---------+------------------+------+------------+-----------+
    | NAME |  STATE  |       IPV4       | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+------------------+------+------------+-----------+
    | lxc0 | RUNNING | 10.0.3.10 (eth0) |      | PERSISTENT | 0         |
    +------+---------+------------------+------+------------+-----------+

  8. Does this behaviour persist?
    IT DOES. WOW!!!

    $ lxc stop lxc0
    $ lxc start lxc0
    $ lxc list lxc0
    +------+---------+------------------+------+------------+-----------+
    | NAME |  STATE  |       IPV4       | IPV6 |    TYPE    | SNAPSHOTS |
    +------+---------+------------------+------+------------+-----------+
    | lxc0 | RUNNING | 10.0.3.10 (eth0) |      | PERSISTENT | 0         |
    +------+---------+------------------+------+------------+-----------+
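
To avoid eyeballing the tables, the static entry that LXD writes for dnsmasq can be compared with the address the container actually received. This is a rough sketch under assumptions: the helper names are hypothetical, the directory is the `dnsmasq.hosts` path shown in step 2 of the first list, and each file is assumed to hold a single `MAC,IP,NAME` line as in that listing.

```shell
#!/bin/bash
# Directory from the listing earlier in this post; override HOSTS_DIR for
# another bridge or install. Reading it may require root.
HOSTS_DIR="${HOSTS_DIR:-/var/lib/lxd/networks/lxdbr0/dnsmasq.hosts}"

# Print the IP that dnsmasq is configured to hand to a container.
# Assumes lines of the form MAC,IP,NAME.
static_lease_ip() {
  awk -F, -v name="$1" '$3 == name { print $2 }' "$HOSTS_DIR"/*
}

# Report whether the running container picked up its static lease.
# Assumes the csv IPv4 column looks like "10.0.3.10 (eth0)".
lease_matches() {
  local name="$1" want got
  want="$(static_lease_ip "$name")"
  got="$(lxc list "$name" --format csv -c 4 | head -n 1 | cut -d' ' -f1)"
  if [ -n "$want" ] && [ "$want" = "$got" ]; then
    echo "$name: static lease $want in use"
  else
    echo "$name: expected '$want', got '$got'"
    return 1
  fi
}

# Usage: lease_matches lxc0
```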

My first suggestion is to put any configurations and output in a Preformatted text environment. Ideally, enclose the text within triplets of backtick characters.

My second suggestion would be to present the configuration in a way that someone who has never used Ansible with LXD can easily reproduce it.

Thanks, noted. Will give it a try in my next post. I have not used BB in years.

The forum software is Discourse, and you compose the messages using markdown.

Okay, so at the start of this I had two problems:

  1. containers not using the ipv4 address assigned to them
  2. cloud-init directives not being executed on first launch

I have overcome both and want to post here so it may help somebody else.

Before I continue, thanks to simos in two ways:

  1. For pointing out the formatting issues with using bbcode. I do apologise.
  2. For his article “How to preconfigure LXD containers with cloud-init”
    https://blog.simos.info/how-to-preconfigure-lxd-containers-with-cloud-init/

The latter gave me a good hint on where I was going wrong with problem 2 or at least how I could take a different approach to the second issue.
As I said before I am not sure if I am encountering strange behaviour or just not understanding as a result of my only working with LXD for 3 days.

For brevity in the post I have put the Ansible tasks in a pastebin
https://paste.ubuntu.com/p/JWFjcBwCrt/

The preseed.sh above remains the same
The config.yml above remains the same

The solution to the first problem:
Pastebin lines 66 - 87

  1. Set the bridge ipv4.dhcp to false
  2. Start, then stop, the containers
  3. Revert the bridge ipv4.dhcp to true
  4. Start the containers again
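
The four steps above, written out as plain `lxc` commands. This is only a sketch of the workaround as described, not a recommended procedure: the function name `reset_static_leases` is hypothetical, and the bridge name `lxdbr0` and the container names are taken from earlier in the thread. It assumes the containers are stopped when it is called.

```shell
#!/bin/bash
# Hypothetical wrapper around the workaround described above.
# Usage: reset_static_leases <bridge> <container>...
reset_static_leases() {
  local bridge="$1" name
  shift

  # 1. Turn DHCP off on the bridge.
  lxc network set "$bridge" ipv4.dhcp false

  # 2. Start and stop each container while DHCP is off.
  for name in "$@"; do
    lxc start "$name"
    lxc stop "$name"
  done

  # 3. Turn DHCP back on.
  lxc network set "$bridge" ipv4.dhcp true

  # 4. Start the containers again. In the thread, lxc0 came up with its
  #    configured address at this point.
  for name in "$@"; do
    lxc start "$name"
  done
}

# Example: reset_static_leases lxdbr0 lxc0 lxc1
```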

Strange, and while I am convinced this is not the way to do it, it works!!
If anyone can advise on the corrections I should make or if this is indeed deviant behavior please let me know.

The solution to the second problem:

Pastebin lines 31 - 37
Don’t target the container user.user-data
Do target the profile user.user-data.
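
For readers without access to the pastebin, here is a minimal sketch of the profile-based approach. It assumes the rendered cloud-init file is at `/tmp/config.yml`, as in the earlier tasks, and that the containers use the `default` profile. The function name `set_profile_user_data` is hypothetical. Since cloud-init normally applies user data only on an instance's first boot, the profile should be set before the containers are started for the first time.

```shell
#!/bin/bash
# Hypothetical helper: store a cloud-init file in a profile's user.user-data key.
# Usage: set_profile_user_data <profile> <file>
set_profile_user_data() {
  local profile="$1" file="$2"
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    return 1
  fi
  # Pass the file contents as the value of the key.
  lxc profile set "$profile" user.user-data "$(cat "$file")"
}

# Example: set_profile_user_data default /tmp/config.yml
# Inspect the result with: lxc profile show default
```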

Hope this helps somebody else and would appreciate any feedback.
