When running multiple containers on the same bridge, do you need to assign different NIC device names?

pgoetz · May 1, 2019, 10:58pm

I thought the answer to this was obvious, but I recently added a second container with a static IP address to a host, and this new container appears to be having intermittent network issues.

Suppose I already have one container with networking assigned like this:

lxc network attach lxdbr0 my_container1 eth0
lxc config device set my_container1 eth0 ipv4.address 10.248.83.3

Is it OK to set up networking on a second container using the same device?

lxc network attach lxdbr0 my_container2 eth0
lxc config device set my_container2 eth0 ipv4.address 10.248.83.4

I assumed the answer was obviously yes, since the device is internal to the container, but now that I’m having intermittent network issues I’ve realized I don’t actually know what these commands do.
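
For context on what the two commands do, going by the expanded configuration posted later in this thread: lxc network attach adds a nic device entry (here named eth0) to that one container's configuration, and lxc config device set stores a key on that entry. The device name is scoped to the container, so two containers can each have a device called eth0. A minimal sketch for reading the stored value back; nic_address is a hypothetical helper name, and defaulting to eth0 is an assumption taken from the commands above.

```shell
# Hypothetical helper: print the static IPv4 address recorded on a container's NIC device.
# $1 = container name, $2 = device name (assumed to be eth0 when omitted)
nic_address() {
    lxc config device get "$1" "${2:-eth0}" ipv4.address
}
```

Given the commands above, nic_address my_container1 and nic_address my_container2 should report two different addresses even though both devices are named eth0.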

pgoetz · May 2, 2019, 9:43am

Following up on my own post: running apt commands regularly stalls and times out. If I press Ctrl-C when this happens and re-run the command, it usually works. Something is definitely wrong with networking; I’m just not sure what.

gpatel-fr · May 2, 2019, 10:23am

Does running ‘ip address’ and ‘ip route’ on both containers provide any interesting insight? If not, you can always post the results in case someone has an idea.

pgoetz · May 2, 2019, 10:34am

Here are the outputs. It looks very standard and sane to me:
Container1 is called archon:

root@archon:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
16: eth0@if17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:16:3e:14:09:ee brd ff:ff:ff:ff:ff:ff
    inet 10.248.83.3/24 brd 10.248.83.255 scope global eth0
       valid_lft forever preferred_lft forever
root@archon:~# ip route
default via 10.248.83.1 dev eth0  metric 100 
10.248.83.0/24 dev eth0  proto kernel  scope link  src 10.248.83.3

Container2 is called atom:

root@atom:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
40: eth0@if41: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:a5:a2:e6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.248.83.4/24 brd 10.248.83.255 scope global eth0
       valid_lft forever preferred_lft forever
root@atom:~# ip route
default via 10.248.83.1 dev eth0 
10.248.83.0/24 dev eth0  proto kernel  scope link  src 10.248.83.4

The interface numbering looks a little strange (i.e. is non-sequential); I wonder why that is, but am certain it’s not relevant.
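
On the numbering: in ip addr output, a name like eth0@if17 means the interface is one end of a veth pair whose other end has interface index 17 in another network namespace (here, presumably the host). Indexes are handed out as interfaces are created, so gaps are expected once containers have been created and destroyed. A small sketch that pulls the index pairs out of the output; veth_peers is a hypothetical helper name.

```shell
# Hypothetical helper: read `ip addr` or `ip link` output on stdin and print
# "<ifindex> <name> <peer-ifindex>" for every interface shown as name@ifN.
veth_peers() {
    sed -n 's/^\([0-9][0-9]*\): \([^@:]*\)@if\([0-9][0-9]*\):.*/\1 \2 \3/p'
}
```

Fed the atom output above (for instance via lxc exec atom -- ip addr | veth_peers), this prints "40 eth0 41"; running ip link on the host should then show interface 41 as the matching veth end.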

gpatel-fr · May 2, 2019, 11:25am

The link-netnsid 0 on atom’s eth0 is the only obvious difference between your 2 containers. Unfortunately the “Output format” documentation, for example, knows nothing about it. I’d assume it is related to containers (ns is probably for namespace), but can having one container without this mysterious namespace ID (or something like that) cause trouble for other, better-behaved containers?
That’s definitely not for me to say; it’s far beyond my ken.

What you could try, however, if no better help comes your way, is to recreate your first container so that it gets a netnsid too. Maybe it could be your salvation.

Or possibly it’s another problem entirely with your general network (an IPv4/IPv6 problem? That can be debugged with iptables logs).

pgoetz · May 2, 2019, 11:44am

Here’s an example of what happens nearly every time:

root@atom:~# apt update
Hit:1 http://archive.ubuntu.com/ubuntu xenial InRelease
Get:2 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
Get:3 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]  
Get:4 http://archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]    
Get:5 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages [947 kB]
Get:6 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages [747 kB]
0% [Connecting to nginx.org (95.211.80.227)] [Connecting to packages.elasticsearch.org(151.101.50.217)]

followed by

Err:7 http://packages.elasticsearch.org/elasticsearch/1.7/debian stable InRelease                       
  Could not connect to packages.elasticsearch.org:80 (151.101.50.217), connection timed out
Err:8 http://nginx.org/packages/mainline/ubuntu xenial InRelease                                       
 Could not connect to nginx.org:80 (62.210.92.35), connection timed out [IP: 62.210.92.35 80]

Then if I just re-run the command a couple of times, it makes the connection and downloads the package lists. If I ping the respective hosts involved, the same thing happens.
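
One way to tell a container problem from an upstream one is to run the same request repeatedly from the container and from the host and compare how often each fails. A rough sketch; count_failures is a hypothetical helper, and the curl invocations in the comments are assumptions (curl may not be installed in the container).

```shell
# Hypothetical helper: run a command N times and report how many runs failed.
# $1 = number of attempts; remaining arguments = the command to run each time
count_failures() {
    attempts="$1"; shift
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures/$attempts failed"
}

# Possible usage (not run here):
#   count_failures 20 lxc exec atom -- curl -sS -o /dev/null --max-time 5 http://nginx.org/
#   count_failures 20 curl -sS -o /dev/null --max-time 5 http://nginx.org/
```

If the host fails about as often as the container, the bridge and the container's NIC configuration are unlikely to be the cause.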

gpatel-fr · May 2, 2019, 1:35pm

Does it always fail on the nginx and elasticsearch packages and never with Ubuntu? The obvious difference between Ubuntu and the others is that Ubuntu has a considerably larger TTL.

pgoetz · May 3, 2019, 12:41am

It’s failed a couple of times with Ubuntu, too, but yes, consistently with the PPAs.

gpatel-fr · May 3, 2019, 8:43am

I decided to bite the bullet and try it (I usually never deviate from the default configuration unless I have a need for it).

What I did:
created a container
lxc launch 35f6bff57c25 -p dev
(35f6bff57c25 is an Ubuntu 18.04 LTS image; dev is a copy of the default profile with the same devices as the original)
-> created ‘advanced-mole’ container
lxc config edit advanced-mole

devices:
  ens1:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  eth0:
    type: none

I then used your command line
lxc config device set advanced-mole ens1 ipv4.address 10.10.0.241
It was accepted all right; I started the container and it got the requested address.
Looking into how it’s done, one thing changed in the container’s dnsmasq config file; cat /var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts/advanced-mole:
00:16:3e:c8:5b:2a,10.10.0.241,advanced-mole
The rest of the containers have the MAC address and the container name but not the IP address.
Perusing the docs, nothing says that setting an IP address does anything else, unless one uses something called MAAS.
Needless to say, I see no network problem.
Did you follow a similar routine to set up your containers?
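
The dnsmasq entry quoted above suggests the static address is implemented as a DHCP reservation: dnsmasq is told to hand 10.10.0.241 to that MAC address, and the container still requests its address over DHCP. A small sketch that lists which containers have a reservation; dhcp_reservations is a hypothetical helper, the three-field layout is taken from the quoted line, and the two-field layout (MAC, name) for the other containers is an assumption based on the description.

```shell
# Hypothetical helper: read dnsmasq.hosts entries on stdin and print
# "<container> <address>", or "<container> dynamic" when no address is pinned.
dhcp_reservations() {
    awk -F, '
        NF == 3 { print $3, $2 }          # MAC,IP,name: fixed address
        NF == 2 { print $2, "dynamic" }   # MAC,name (assumed layout): address chosen by dnsmasq
    '
}

# Possible usage on a snap install (the path is the one quoted above and differs on other installs):
#   cat /var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts/* | dhcp_reservations
```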

pgoetz · May 3, 2019, 9:40am

This is the recipe I follow to set up containers, and I’m willing to accept that my problem here is just intermittent poor network performance, perhaps due to an ISP upstream. I always want them with static IP addresses that are proxied to, since they all so far have involved web applications that only run properly on Ubuntu 12|14|16.04. Also note that my base OS is Arch Linux, so things are slightly different than they would be if I were using Ubuntu as my base; e.g. I have to install LXD from the Arch AUR. On init, I just accept all the defaults save for not configuring IPv6.

lxc image copy ubuntu:16.04 local: --alias ubuntu16
lxc init ubuntu16 my_container -c security.privileged=true
lxc network attach lxdbr0 my_container eth0
lxc config device set my_container eth0 ipv4.address 10.248.83.4
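
The per-container part of this recipe can be wrapped in a function so that the name and address are the only things that change. This is only a sketch of the commands above; new_static_container is a hypothetical name, and it assumes the ubuntu16 image alias has already been copied locally.

```shell
# Hypothetical wrapper around the recipe above.
# $1 = container name, $2 = static IPv4 address on lxdbr0
# Assumes the ubuntu16 image alias already exists locally.
new_static_container() {
    name="$1"
    addr="$2"
    lxc init ubuntu16 "$name" -c security.privileged=true &&
    lxc network attach lxdbr0 "$name" eth0 &&
    lxc config device set "$name" eth0 ipv4.address "$addr"
}
```

Each call needs an address that no other container on the bridge already uses; the device name eth0 can stay the same for every container.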

I probably should just learn to work directly with the YAML files the way you are doing.

gpatel-fr · May 3, 2019, 10:26am

I used YAML only because I needed to block inheritance of the eth0 device from the profile. Otherwise I never bother with that. I’m curious what you get from

lxc profile show <the-profile-for-your-containers>

pgoetz · May 3, 2019, 10:39am

How do I find the name of the profile for my container?

gpatel-fr · May 3, 2019, 10:49am

lxc config show advanced-mole
should get you a profiles: entry listing the profile(s) for your container

pgoetz · May 3, 2019, 11:06am

[pgoetz@erap-atx ~]$ lxc config show --expanded atom
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 16.04 LTS amd64 (release) (20190424)
  image.label: release
  image.os: ubuntu
  image.release: xenial
  image.serial: "20190424"
  image.version: "16.04"
  security.privileged: "true"
  volatile.base_image: f32f9de84a9e70b23f128f909f72ba484bc9ea70c69316ea5e32fb3c11282a34
  volatile.eth0.hwaddr: 00:16:3e:a5:a2:e6
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[]'
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    ipv4.address: 10.248.83.4
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

According to this the profiles only live in the database, so I’m still not sure how to show what’s in the default profile.

gpatel-fr · May 3, 2019, 12:53pm

The LXD API is really quite regular, and it’s usually easy to guess the right syntax (except for snapshots :-/)

lxc profile show default

pgoetz · May 3, 2019, 1:18pm

Yep, I love LXD and have been perfectly happy with it so far. The clean and logical syntax is the best part.

[pgoetz@erap-atx ~]$ lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
name: default
used_by:
- /1.0/containers/archon1404
- /1.0/containers/archon
- /1.0/containers/archon/archon_snap_20190316
- /1.0/containers/atom
- /1.0/containers/nginxtest

gpatel-fr · May 3, 2019, 1:59pm

I’m baffled by this, to be frank. It’s not how I understand the docs. You are inheriting a device from a profile and yet you are redefining it without blocking inheritance. This is troubling.

pgoetz · May 3, 2019, 3:55pm

I’m not following what your concern is. How is it that I’m redefining the device, and what’s the evidence that inheritance isn’t being blocked?

Learning how these complex systems work is always an iterative process, at least for me.

gpatel-fr · May 3, 2019, 6:22pm

Redefining the device = assigning a fixed IP address; by default LXD makes containers obtain an address through DHCP.

Not blocking inheritance: from /doc/containers.md you can see that:

## Device types
LXD supports the following device types:

ID (database)   | Name                              | Description
:--             | :--                               | :--
0               | [none](#type-none)                | Inheritance blocker
1               | [nic](#type-nic)                  | Network interface
(...)
### Type: none
A none type device doesn't have any property and doesn't create anything inside the container.

Its only purpose is to stop inheritance of devices coming from profiles.

To do so, just add a none type device with the same name of the one you wish to skip inheriting.
It can be added in a profile being applied after the profile it originated from or directly on the container.
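
A simplified model of the rule quoted above, assuming (this is not stated in the excerpt) that a device defined directly on the container replaces a profile device of the same name as a whole. That assumption at least matches the outputs in this thread: the expanded eth0 on atom has no name key, while the profile's eth0 does. expand_devices is a hypothetical helper, not LXD code.

```shell
# Hypothetical model of device expansion.
# stdin: "<device-name> <type>" lines, profile devices first, container devices last.
# A later entry replaces an earlier one with the same name; type "none" drops the device.
expand_devices() {
    awk '{ type[$1] = $2 }
         END { for (n in type) if (type[n] != "none") print n, type[n] }' | sort
}

# Possible usage, modelling the advanced-mole configuration shown earlier:
#   printf 'eth0 nic\nroot disk\nens1 nic\neth0 none\n' | expand_devices
```

In this model the none entry is only needed when the container wants to get rid of an inherited device, as with ens1 replacing eth0; redefining eth0 under the same name simply overrides it.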

Let me see: it should be possible to access this documentation on linuxcontainers.org. Well, I have to admit I’m not sure it has made it into the man pages. If it hasn’t, that’s a pity. Still, it’s on GitHub (of course, I read the docs on my disk, having git cloned the lxd repo).

github.com

lxc/lxd/blob/master/doc/containers.md

# Container configuration
## Properties
The following are direct container properties and can't be part of a profile:

 - `name`
 - `architecture`

Name is the container name and can only be changed by renaming the container.

Valid container names must:

 - Be between 1 and 63 characters long
 - Be made up exclusively of letters, numbers and dashes from the ASCII table
 - Not start with a digit or a dash
 - Not end with a dash

This requirement is so that the container name may properly be used in
DNS records, on the filesystem, in various security profiles as well as
the hostname of the container itself.


pgoetz · May 8, 2019, 4:00pm

This has been an interesting discussion, but I think my issue was some kind of intermittent network problem upstream. I’ve added a couple of additional test containers and haven’t seen the same problem again.