Hello there,
The other day I started setting up a new home server for myself and thought about using LXD/LXC to manage my services (as I already do on my webhost) and to experiment with things without messing up the host installation.
Unfortunately, I am now at a point where I’m honestly thinking about reverting the whole machine to a clean install, because I can’t figure out how to use a manually configured network bridge with LXD. It should have the same functionality as the bridge LXD creates and manages itself, but with dnsmasq running on the host. I need this setup because dnsmasq also has to serve my local network (DHCP and local DNS server). Yes, I read that you can run your own dnsmasq and the one for the LXD bridge in parallel, but I couldn’t find anything on how much that would complicate things.
What I did: I removed netplan (following this procedure) and went back to /etc/network/interfaces, because I wanted to create a bridge without an attached hardware interface, which is afaik not possible with netplan. I tested various things and followed a dozen or so tutorials and guides, but none of them gave me a real hint on how to do what I want. I want the LXD containers in their own private network. For this to work, dnsmasq needs a separate interface to run the DHCP server on, so I wanted something similar to the bridge that LXD creates. In the interfaces file I then created a bridge with bridge_ports none (that’s the “no attached hardware interface” part of the story). This works as far as the host’s network interfaces are concerned, and my whole configuration looks something like this:
auto lo
iface lo inet loopback

auto enp3s0
iface enp3s0 inet static
    address 192.168.178.103
    netmask 255.255.255.0
    broadcast 192.168.178.255
    gateway 192.168.178.1
    dns-nameservers 192.168.178.1 1.1.1.1

# lxd bridge
auto bridge0
iface bridge0 inet static
    bridge_ports none
    bridge_fd 0
    bridge_maxwait 0
    bridge_stp no
    address 10.254.116.1
    netmask 255.255.255.0
    up iptables -t nat -A POSTROUTING -o enp3s0 -j MASQUERADE
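
Since the point of bridge0 is to give the host's dnsmasq its own interface for the containers, here is a minimal, untested sketch of what that part of the dnsmasq configuration could look like. Only bridge0 and 10.254.116.1 come from the interfaces file above. The output path (and the DNSMASQ_SNIPPET variable), the lease range and the 12h lease time are assumptions made up for illustration.

```shell
# Hypothetical output path; override with DNSMASQ_SNIPPET if needed.
conf="${DNSMASQ_SNIPPET:-./bridge0-dnsmasq.conf}"

cat > "$conf" <<'EOF'
# Serve DHCP/DNS on the container bridge
interface=bridge0
# Bind only to the listed interfaces instead of the wildcard address
bind-interfaces
# Lease pool inside 10.254.116.0/24 (range and lease time are assumptions)
dhcp-range=10.254.116.50,10.254.116.200,255.255.255.0,12h
# Hand out the bridge address as default gateway and DNS server
dhcp-option=option:router,10.254.116.1
dhcp-option=option:dns-server,10.254.116.1
EOF
```

If dnsmasq is installed, the generated file can be syntax-checked with dnsmasq --test --conf-file="$conf" before it is copied anywhere. Serving the LAN on enp3s0 as well would need its own interface and dhcp-range lines, which this sketch leaves out.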
After configuring the interfaces, rebooting and all that, lxc network list shows the bridge:
+---------+----------+---------+-------------+---------+
| NAME | TYPE | MANAGED | DESCRIPTION | USED BY |
+---------+----------+---------+-------------+---------+
| bridge0 | bridge | NO | | 3 |
+---------+----------+---------+-------------+---------+
| enp3s0 | physical | NO | | 0 |
+---------+----------+---------+-------------+---------+
So I added it to the default profile and the containers, thinking that I would configure DHCP later (I also need to find out how to make the host dnsmasq honor the ipv4.address part of the container config). But then, disaster: none of my containers with this bridge attached will start anymore. For example, the config of a test container is as follows:
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20200317)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20200317"
  image.type: squashfs
  image.version: "18.04"
  volatile.base_image: 98e43d99d83ef1e4d0b28a31fc98e01dd98a2dbace3870e51c5cb03ce908144b
  volatile.bridge0.hwaddr: 00:16:3e:a8:61:b3
  volatile.bridge0.name: eth0
  volatile.eth0.hwaddr: 00:16:3e:29:7a:97
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: STOPPED
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: bridge0
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: ""
Before I attached bridge0 as eth0, the container was able to start. Now if I try lxc start test it gives me an error, and if I open the log I get this:
Name: test
Location: none
Remote: unix://
Architecture: x86_64
Created: 2020/04/08 00:23 UTC
Status: Stopped
Type: container
Profiles: default
Log:
lxc test 20200408162411.148 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1143 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.monitor.test"
lxc test 20200408162411.148 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1143 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.payload.test"
lxc test 20200408162411.149 ERROR utils - utils.c:lxc_can_use_pidfd:1834 - Kernel does not support pidfds
lxc test 20200408162411.231 ERROR network - network.c:lxc_network_move_created_netdev_priv:3129 - File exists - Failed to move network device "veth79e2d470" with ifindex 6 to network namespace 6911
lxc test 20200408162411.231 ERROR start - start.c:lxc_spawn:1751 - Failed to create the network
lxc test 20200408162411.234 WARN network - network.c:lxc_delete_network_priv:3213 - Failed to rename interface with index 0 from "eth0" to its initial name "vethfa99e2b8"
lxc test 20200408162411.237 WARN network - network.c:lxc_delete_network_priv:3213 - Failed to rename interface with index 0 from "eth0" to its initial name "veth79e2d470"
lxc test 20200408162411.237 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:852 - Received container state "ABORTING" instead of "RUNNING"
lxc test 20200408162411.238 ERROR start - start.c:__lxc_start:1953 - Failed to spawn container "test"
lxc test 20200408162411.238 WARN start - start.c:lxc_abort:1030 - No such process - Failed to send SIGKILL to 6911
lxc 20200408162411.613 WARN commands - commands.c:lxc_cmd_rsp_recv:122 - Connection reset by peer - Failed to receive response for command "get_state"
This happens to every container as soon as I attach the bridge. If I detach/remove it, then the container is able to start again.
I searched here and in the git repos for the network-related errors, but I really could not find anything useful about them. I’m also wondering why eth0 is listed twice after the failed move. I have a feeling that I did something horribly wrong with the bridge, but I have absolutely no clue how to fix this error, get my containers starting again and keep going.
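
One thing that might be worth checking, given the volatile.bridge0.name: eth0 entry in the container config: whether the default profile and the container each define a NIC that asks for the interface name eth0. This is a guess, not a confirmed diagnosis. The helper below is a hypothetical sketch (dup_nic_names is a made-up name) that reads the output of lxc config show --expanded on stdin and prints every interface name claimed by more than one device. It assumes the usual two-space YAML indentation of that output.

```shell
# Print interface names that appear as "name:" in more than one device.
# Assumes YAML as printed by "lxc config show <container> --expanded".
dup_nic_names() {
  awk '
    /^devices:/       { in_dev = 1; next }   # entering the devices section
    in_dev && /^[^ ]/ { in_dev = 0 }         # next top-level key ends it
    in_dev && /^    name: / { print $2 }     # per-device "name:" property
  ' | sort | uniq -d
}

# Usage (needs a host with LXD, so it is left commented out here):
#   lxc config show test --expanded | dup_nic_names
```

If this prints eth0, comparing lxc profile show default with the container's own devices section should show which two devices collide.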