Is there a way to install LXD in centos7 without snap?

lxd

(OH SECHUN) #1

I first installed LXD 2.2.1 with snap on three CentOS 7 nodes.

There were 4 containers on each of the 3 nodes;
one node runs kernel 4.14.15-1.el7.elrepo.x86_64
and the other two run kernel 4.15.0-1.el7.elrepo.x86_64.

Yesterday snap refreshed and updated LXD to version 3.0.

After the update, “sudo lxc list” no longer shows any of the containers created under 2.2.1 on one node (4.14.15-1.el7.elrepo.x86_64), although the container data was not removed.

On the other two nodes, “sudo lxc list” shows the containers, but I cannot exec into them.

I initialized LXD with default values except for the storage backend (changed to dir),
and I run Docker inside LXD, so I set the container’s config with

linux.kernel_modules: bridge,br_netfilter,ip_tables,ip6_tables,netlink_diag,nf_nat,overlay,xt_conntrack
raw.lxc: |-
  lxc.aa_profile = unconfined
  lxc.cgroup.devices.allow = a
  lxc.mount.auto=proc:rw sys:rw
  lxc.cap.drop =
security.privileged: true
security.nesting: true

I suspect this is because of changes to config keys (like aa_profile => apparmor_profile).
Is that right?

In any case, I don’t want unexpected breakage in my container environment.
So, is there a way to install a specific version of LXD on CentOS 7 without snap?
(I could only find packages for Ubuntu.)
Or can I disable automatic snap refreshes on the client side?
(short of blocking them with iptables; just a snap option or something similar)


(Stéphane Graber) #2

Yes, the reason your container wouldn’t start is that lxc.aa_profile is now lxc.apparmor.profile. Not that this matters on CentOS, as AppArmor isn’t available there, so not setting that key at all should work just as well.
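For reference, here is the same raw.lxc block with just that key renamed (on CentOS you could equally drop the apparmor line entirely):

```yaml
# LXD 3.0-style raw.lxc: lxc.aa_profile has been renamed
raw.lxc: |-
  lxc.apparmor.profile = unconfined
  lxc.cgroup.devices.allow = a
  lxc.mount.auto=proc:rw sys:rw
  lxc.cap.drop =
```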

Now that 3.0 is out, I’d recommend you switch to the 3.0 track, which will then keep you on LXD 3.0 and only pull in the bugfix updates for it. You can do so with snap refresh lxd --channel=3.0/stable.
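On the question of controlling refreshes: one way to reduce surprise updates is to pin the channel and restrict when snapd is allowed to refresh at all (a sketch; refresh.timer requires a reasonably recent snapd, and the window below is just an example):

```shell
# Pin LXD to the 3.0 track so refreshes only bring 3.0.x bugfixes
sudo snap refresh lxd --channel=3.0/stable

# Optionally restrict when snapd may refresh
# (example window: Saturdays between 02:00 and 04:00)
sudo snap set core refresh.timer=sat,02:00-04:00

# Verify which channel is now tracked
snap list lxd
```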


(Stéphane Graber) #3

The node that doesn’t show any containers running is odd. Could you e-mail me (stgraber at ubuntu dot com) the following files:

  • /var/snap/lxd/common/lxd/lxd.db
  • /var/snap/lxd/common/lxd/lxd.db.bak
  • /var/snap/lxd/common/lxd/logs/lxd.log
  • (output of) journalctl -u snap.lxd.daemon

I suspect it’s the DB conversion to raft having gone bad. That’s not something we’ve seen outside of the early betas, so it would be good to understand what happened. If that was in fact the issue, it should be pretty simple to revert to the database backup and upgrade again, hopefully successfully this time.
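If the database conversion was indeed the problem, the recovery would look roughly like this (a sketch; stop the daemon before touching its database files):

```shell
# Stop the LXD daemon first
sudo snap stop lxd

# Restore the pre-upgrade backup, then let the daemon retry the
# schema upgrade on its next start
sudo cp /var/snap/lxd/common/lxd/lxd.db.bak /var/snap/lxd/common/lxd/lxd.db
sudo snap start lxd
```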


(OH SECHUN) #4

Because of the sudden crash, I reinstalled LXD immediately (snap remove and snap install), so I cannot send those files, sorry.

I tried to reproduce the crash in a dev environment:

  1. snap install lxd --channel=2.0
  2. create a container with the configs above
  3. snap refresh lxd --channel=3.0
  4. lxc list
    I could not reproduce the crash (the containers missing from the list).

But I cannot exec into my container:
$ lxc exec my-container -- /bin/bash
-> Error: Error opening startup config file: "loading config file for the container failed"
Error: EOF

I tried to edit the container’s config:
$ lxc config edit my-container
lxc.aa_profile = unconfined => lxc.apparmor.profile = unconfined

but I cannot save it; it fails with the message below.

Config parsing error: Update will cause the container to rely on a profile’s root disk device but none was found.
Press enter to start the editor again

and I cannot create a new container either:
$ sudo lxc launch images:centos/7/amd64 my-container2 -c security.privileged=true -c security.nesting=true
=> Error: Failed container creation: No root device could be found.

Now I’m using LXD 3.0 as you said (“snap install lxd --channel=3.0/stable”),
but I don’t know why it does not work properly.


(Stéphane Graber) #5

Sounds like you’re missing a root disk in your default profile. That’s normally set up for you when you first run lxd init.

What do the following show?

  • lxc profile show default
  • lxc storage list
  • lxc network list

(OH SECHUN) #6

Oh, I’m sorry.
I didn’t know I had to init LXD again after refreshing.
But even after init (steps below), I still cannot access my container.

$ sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]:
Do you want to configure a new storage pool? (yes/no) [default=yes]: no
Would you like to connect to a MAAS server? (yes/no) [default=no]:
Would you like to create a new network bridge? (yes/no) [default=yes]: no
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: yes
Name of the existing bridge or host interface: lxdbr0
Is this interface connected to your MAAS server? (yes/no) [default=yes]: no
Would you like LXD to be available over the network? (yes/no) [default=no]:
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] no
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:
$ sudo lxc exec my-container -- /bin/bash
Error: Error opening startup config file: "loading config file for the container failed"
Error: EOF
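Incidentally, answering yes to the last lxd init question prints a YAML preseed that can later be piped back into lxd init --preseed for a non-interactive setup. A rough sketch of what the answers above would correspond to (the key layout here is an assumption; check it against the preseed your own lxd prints):

```yaml
# Assumed preseed sketch matching the interactive answers above:
# no new storage pool, no new bridge, reuse the existing lxdbr0
config: {}
networks: []
storage_pools: []
profiles:
- name: default
  config: {}
  devices:
    eth0:
      name: eth0
      nictype: bridged
      parent: lxdbr0
      type: nic
```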

My container’s config is below.

$ sudo lxc config show my-container
architecture: x86_64
config:
  security.nesting: "true"
  security.privileged: "true"
  volatile.base_image: 42417946b8bb0ee9b9b9b048d43d72067629d3e61c411503576604a1ff62d2f5
  volatile.eth0.hwaddr: 00:16:3e:da:dd:00
  volatile.idmap.base: "0"
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
devices:
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

From here on is some information you may want to know.

$ lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
name: default
used_by:
- /1.0/containers/my-container
$ sudo lxc storage list
+---------+-------------+--------+------------------------------------------------+---------+
|  NAME   | DESCRIPTION | DRIVER |                     SOURCE                     | USED BY |
+---------+-------------+--------+------------------------------------------------+---------+
| default |             | dir    | /var/snap/lxd/common/lxd/storage-pools/default | 2       |
+---------+-------------+--------+------------------------------------------------+---------+
$ sudo lxc network list
+---------+----------+---------+-------------+---------+
|  NAME   |   TYPE   | MANAGED | DESCRIPTION | USED BY |
+---------+----------+---------+-------------+---------+
| bond0   | bond     | NO      |             | 0       |
+---------+----------+---------+-------------+---------+
| docker0 | bridge   | NO      |             | 0       |
+---------+----------+---------+-------------+---------+
| eth0    | physical | NO      |             | 0       |
+---------+----------+---------+-------------+---------+
| eth1    | physical | NO      |             | 0       |
+---------+----------+---------+-------------+---------+
| eth2    | physical | NO      |             | 0       |
+---------+----------+---------+-------------+---------+
| eth3    | physical | NO      |             | 0       |
+---------+----------+---------+-------------+---------+
| lxdbr0  | bridge   | NO      |             | 1       |
+---------+----------+---------+-------------+---------+

Please tell me what is wrong.


(Stéphane Graber) #7

Looks like some of the 2.0 to 3.0 migration didn’t work too well… To fix things, I suspect you’ll want to run:

lxc profile device add default root disk path=/ pool=default
ip link del lxdbr0
lxc network create lxdbr0

I’d also recommend a reboot of the system at that point to make sure everything is clean.
Launching new containers should then work fine, and hopefully interacting with existing ones will be fixed too (though I don’t see a good reason for that exec error you’re getting…).
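After those commands and a reboot, a quick sanity check might look like this (a sketch; the container name is just an example):

```shell
# The default profile should now show a root disk device,
# and lxdbr0 should be listed as managed
lxc profile show default
lxc network list

# Creating a fresh container exercises both the pool and the bridge
lxc launch images:centos/7/amd64 test-c1
```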

If the exec error continues, please post the output of snap list and snap changes, as well as lxc info --show-log NAME; that should help.


(OH SECHUN) #8
$ sudo lxc profile device add default root disk path=/ pool=default
Error: The device already exists

It looks like the default root device already exists.
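Indeed, the lxc config show output earlier shows the root disk attached directly to the container rather than coming from the profile, which would explain why the profile add fails. That can be double-checked with (a sketch):

```shell
# Where is the root disk defined? Directly on the container...
lxc config device show my-container
# ...or inherited from the default profile?
lxc profile device show default
```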

$ sudo snap list
Name  Version    Rev   Developer  Notes
core  16-2.31.2  4206  canonical  core
lxd   3.0.0      6649  canonical  -

and

$ sudo lxc info --show-log my-container
Name: my-container
Remote: unix://
Architecture: x86_64
Created: 2018/04/09 02:37 UTC
Status: Running
Type: persistent
Profiles: default
Pid: 63145
Ips:
  lo:   inet    127.0.0.1
  lo:   inet6   ::1
  eth0: inet6   fe80::216:3eff:feda:dd00
Resources:
  Processes: 9
  CPU usage:
    CPU usage (in seconds): 3
  Memory usage:
    Memory (current): 36.39MB
    Memory (peak): 48.27MB
  Network usage:
    eth0:
      Bytes received: 470.83kB
      Bytes sent: 28.03kB
      Packets received: 3512
      Packets sent: 367
    lo:
      Bytes received: 35.81kB
      Bytes sent: 35.81kB
      Packets received: 103
      Packets sent: 103

Log:

      lxc 20180409023759.250 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
      lxc 20180409023759.250 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
      lxc 20180409023759.359 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
      lxc 20180409023759.359 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.

So, consequently, I cannot live-migrate my containers with version 3.0.