Can't ping anything on the internet from inside containers anymore (since 3.9 I believe)

Hmm, nothing changed that would explain this, sounds like you’ve got a routing or firewalling issue going on here.

Can you show iptables -L -n -v and ip6tables -L -n -v?

iptables -L -n -v

Chain INPUT (policy ACCEPT 454K packets, 601M bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     tcp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:53 /* generated for LXD network lxdbr0 */
   89  5792 ACCEPT     udp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:53 /* generated for LXD network lxdbr0 */
   23  7544 ACCEPT     udp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:67 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:53
    0     0 ACCEPT     tcp  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:53
    0     0 ACCEPT     udp  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:67
    0     0 ACCEPT     tcp  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:67

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    8   672 DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    4   336 ACCEPT     all  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */
    4   336 ACCEPT     all  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */
    0     0 DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker_gwbridge !docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      virbr0  0.0.0.0/0            192.168.122.0/24     ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  virbr0 *       192.168.122.0/24     0.0.0.0/0           
    0     0 ACCEPT     all  --  virbr0 virbr0  0.0.0.0/0            0.0.0.0/0           
    0     0 REJECT     all  --  *      virbr0  0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable
    0     0 REJECT     all  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable
    0     0 DROP       all  --  docker_gwbridge docker_gwbridge  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 315K packets, 153M bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     tcp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            tcp spt:53 /* generated for LXD network lxdbr0 */
   89  6076 ACCEPT     udp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            udp spt:53 /* generated for LXD network lxdbr0 */
   21  7056 ACCEPT     udp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            udp spt:67 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp  --  *      virbr0  0.0.0.0/0            0.0.0.0/0            udp dpt:68

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-ISOLATION-STAGE-2  all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 DOCKER-ISOLATION-STAGE-2  all  --  docker_gwbridge !docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    8   672 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0

This looks a bit suspicious:

    0     0 REJECT     all  --  *      virbr0  0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable
    0     0 REJECT     all  --  virbr0 *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable

ip6tables -L -n -v (which I am not knowingly using)

Chain INPUT (policy ACCEPT 72398 packets, 73M bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     tcp      lxdbr0 *       ::/0                 ::/0                 tcp dpt:53 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp      lxdbr0 *       ::/0                 ::/0                 udp dpt:53 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp      lxdbr0 *       ::/0                 ::/0                 udp dpt:547 /* generated for LXD network lxdbr0 */

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     all      *      lxdbr0  ::/0                 ::/0                 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     all      lxdbr0 *       ::/0                 ::/0                 /* generated for LXD network lxdbr0 */

Chain OUTPUT (policy ACCEPT 53366 packets, 7829K bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     tcp      *      lxdbr0  ::/0                 ::/0                 tcp spt:53 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp      *      lxdbr0  ::/0                 ::/0                 udp spt:53 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp      *      lxdbr0  ::/0                 ::/0                 udp spt:547 /* generated for LXD network lxdbr0 */

So I’m not actually noticing anything particularly wrong above, assuming your containers are all on lxdbr0 and not on virbr0 or docker0.

Can you also show iptables -t nat -L -n -v, maybe the problem is on the masquerading side?

Can you also show cat /proc/sys/net/ipv4/ip_forward for good measure?

thanks Stephane,

Chain PREROUTING (policy ACCEPT 30087 packets, 8308K bytes)
 pkts bytes target     prot opt in     out     source               destination         
   69 11390 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 485 packets, 123K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 12768 packets, 817K bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 12763 packets, 816K bytes)
 pkts bytes target     prot opt in     out     source               destination         
    5   623 MASQUERADE  all  --  *      *       10.19.225.0/24      !10.19.225.0/24       /* generated for LXD network lxdbr0 */
    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0           
    1    76 MASQUERADE  all  --  *      !docker_gwbridge  172.18.0.0/16        0.0.0.0/0           
    2   160 RETURN     all  --  *      *       192.168.122.0/24     224.0.0.0/24        
    0     0 RETURN     all  --  *      *       192.168.122.0/24     255.255.255.255     
    0     0 MASQUERADE  tcp  --  *      *       192.168.122.0/24    !192.168.122.0/24     masq ports: 1024-65535
    0     0 MASQUERADE  udp  --  *      *       192.168.122.0/24    !192.168.122.0/24     masq ports: 1024-65535
    0     0 MASQUERADE  all  --  *      *       192.168.122.0/24    !192.168.122.0/24    

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  docker_gwbridge *       0.0.0.0/0            0.0.0.0/0

furthermore …

@ debian  ~
└─ $ ▶ sudo cat /proc/sys/net/ipv4/ip_forward
1

All of this does not really tell me much. And all containers are on the default profile and therefore go via the lxdbr0 interface.
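
A quick way to confirm that from the host (just a sketch; <container> stands for any container name):

lxc profile show default                  # the eth0 device should point at lxdbr0
lxc config show <container> --expanded    # shows the effective network device for one container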

I can ping 8.8.8.8, 8.8.4.4 from the containers, so it smells to me like a DNS problem of some sort.
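
One way to narrow that down from inside a container (a sketch; <container> is a placeholder, and dig needs to be installed in the container):

lxc exec <container> -- ping -c 1 8.8.8.8                     # raw connectivity (already known to work)
lxc exec <container> -- getent hosts cloud-images.ubuntu.com  # resolution via the container's configured resolver
lxc exec <container> -- dig @8.8.8.8 ubuntu.com               # bypasses the local resolver entirely

If the last one answers while the getent lookup fails, the problem is on the resolver side rather than routing.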

EDIT: I have also noticed that launching a new container (from an image I do not have locally) fails with some sort of connection problem:

lxc launch ubuntu:18.04 ubn1804
Creating ubn1804
Error: Failed container creation: Get https://cloud-images.ubuntu.com/releases/streams/v1/index.json: lookup cloud-images.ubuntu.com on [::1]:53: read udp [::1]:48690->[::1]:53: read: connection refused

I can ping cloud-images.ubuntu.com from my host though.

Did a sudo snap disable lxd && sudo snap enable lxd (which I had tried before without any effect) and all of a sudden the network problem seems to have disappeared.

thanks @stgraber for the efforts and sorry for the interruption.

Today, after a restart of my host, I see the same problem from within containers.

snap disable lxd  
snap enable lxd

resolves this issue once again.

I think this may be related to the issue I reported recently : https://github.com/lxc/lxd-pkg-snap/issues/32

Just wanted to update that this problem is now constant.

When my host (a Debian 9 laptop) starts, I cannot reach anything on the internet from within any container (LXD and the containers were auto-started on system boot).
I can reliably get past this with a sudo snap disable lxd && sudo snap enable lxd command. I don't believe this is how it is supposed to work.

So, I had this problem running LXD 3.9 on Arch Linux with an Ubuntu 14.04 container and reported it in another ticket in General (Automatic name resolution within an Ubuntu 14.04 container?). The solution for this particular container was to provide it with an appropriate DNS server IP (in my case, 192.168.1.1):

$ lxc exec my_container bash

Then, inside the container, edit /etc/resolvconf/resolv.conf.d/base so it contains:

nameserver 192.168.1.1

and regenerate resolv.conf as root:

# resolvconf -u
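
Roughly the same thing as a single command from the host (just a sketch, reusing the my_container name and the 192.168.1.1 DNS IP from above):

lxc exec my_container -- bash -c 'echo "nameserver 192.168.1.1" >> /etc/resolvconf/resolv.conf.d/base && resolvconf -u'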

What I couldn’t and still don’t understand is why this works automatically for an Ubuntu 16.04 host / Ubuntu 14.04 container, but not when the host machine is Arch Linux.

For this kind of problem, the first thing to do is to check whether dnsmasq is running. If it is not, look for syslog error messages to see why it fails to start; if it is, check whether it is bound to the proper interface (by default lxdbr0).
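
For example, something along these lines on the host (just a sketch; the exact log location depends on the distribution):

ps aux | grep dnsmasq              # is the LXD-managed dnsmasq running at all?
journalctl -b | grep -i dnsmasq    # or: grep dnsmasq /var/log/syslog -- any startup errors?
sudo ss -ulpn | grep dnsmasq       # which addresses/interfaces is it actually bound to?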

I had this exact problem using Ubuntu 18.04. I upgraded to 18.10 and everything worked again.

Thanks for sharing. In my case this is Debian 9, so there isn't an option to upgrade to Ubuntu 18.10. Nor do I believe everybody running an LTS Ubuntu production server would necessarily want to upgrade to 18.10.

So I still hope for a solution to come up. Till then I’ll live with my workaround (sudo snap disable lxd && sudo snap enable lxd)

Is there any difference in the output of
sudo ss -tapnu | grep LISTEN | grep 53
when it works and when it does not work?

on an empty ubuntu:18.04 container, no

working

lxc exec ubn1804 -- sudo ss -tapnu | grep LISTEN | grep 53
tcp  LISTEN   0      128     127.0.0.53%lo:53          0.0.0.0:*       users:(("systemd-resolve",pid=112,fd=13))

not working

lxc exec ubn1804 -- sudo ss -tapnu | grep LISTEN | grep 53
tcp  LISTEN   0      128     127.0.0.53%lo:53          0.0.0.0:*       users:(("systemd-resolve",pid=112,fd=13))

Err, I meant running ss on the host, not in a container; I was not precise enough.

on the debian9 host

1. While the network inside the container is not working

('unknown host' or 'Temporary failure in name resolution')

$ ▶ sudo ss -tapnu | grep LISTEN | grep 53
[sudo] password for manolo: 
tcp    LISTEN     0      5      10.19.225.1:53                    *:*                   users:(("dnsmasq",pid=2709,fd=9))
tcp    LISTEN     0      5      192.168.122.1:53                    *:*                   users:(("dnsmasq",pid=2405,fd=6))
tcp    LISTEN     0      5         fd42:2122:86a:2b30::1:53                   :::*                   users:(("dnsmasq",pid=2709,fd=13))
tcp    LISTEN     0      5      fe80::74e3:bff:fee3:1197%lxdbr0:53                   :::*                   users:(("dnsmasq",pid=2709,fd=11))

2. After “sudo snap disable lxd && sudo snap enable lxd”

(so, ping is working inside containers)

─ $ ▶ sudo ss -tapnu | grep LISTEN | grep 53
tcp    LISTEN     0      5      10.19.225.1:53                    *:*                   users:(("dnsmasq",pid=9148,fd=9))
tcp    LISTEN     0      5      192.168.122.1:53                    *:*                   users:(("dnsmasq",pid=2405,fd=6))
tcp    LISTEN     0      128      :::80                   :::*                   users:(("apache2",pid=7356,fd=4),("apache2",pid=7355,fd=4),("apache2",pid=7354,fd=4),("apache2",pid=7353,fd=4),("apache2",pid=7352,fd=4),("apache2",pid=1625,fd=4))
tcp    LISTEN     0      5         fd42:2122:86a:2b30::1:53                   :::*                   users:(("dnsmasq",pid=9148,fd=13))
tcp    LISTEN     0      5      fe80::2ca4:aff:fe64:e46%lxdbr0:53                   :::*                   users:(("dnsmasq",pid=9148,fd=11))

well, nothing to see here. All seems normal when you have this problem.
What about
systemd-resolve --status
in the container when you have the problem?
And while I am at it,
ip addr
and
dig @10.19.225.1 ubuntu.com
as well, run in the same container while it has the problem.
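
If it is easier, these can all be run from the host (a sketch, assuming the container from earlier is named ubn1804):

lxc exec ubn1804 -- systemd-resolve --status
lxc exec ubn1804 -- ip addr
lxc exec ubn1804 -- dig @10.19.225.1 ubuntu.com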

Global
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 21 (eth0)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 10.19.225.1
                      fd42:2122:86a:2b30::1
                      fe80::a49d:b2ff:fe77:45aa
          DNS Domain: lxd

This does not tell me much, to be honest.

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
21: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:40:2c:9c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.19.225.31/24 brd 10.19.225.255 scope global dynamic eth0
       valid_lft 2928sec preferred_lft 2928sec
    inet6 fd42:2122:86a:2b30:216:3eff:fe40:2c9c/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 3545sec preferred_lft 3545sec
    inet6 fe80::216:3eff:fe40:2c9c/64 scope link 
       valid_lft forever preferred_lft forever

; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.19.225.1 ubuntu.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 61342
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ubuntu.com.			IN	A

;; Query time: 0 msec
;; SERVER: 10.19.225.1#53(10.19.225.1)
;; WHEN: Mon Mar 11 07:46:56 UTC 2019
;; MSG SIZE  rcvd: 28

What does all this say?
The first result says that the container's DNS resolution setup is fine, as it should be.
The second could have been interesting if the first one had been negative, but it doesn't bring much in this case.
The third says that the container's resolver, the specialized dnsmasq instance running on the host, is indeed replying to the container, but it is telling it to get lost (status: REFUSED). Obviously there is something wrong with dnsmasq.

Why dnsmasq is being so difficult, I have no idea. It's not a common problem often seen on the internet.

I see two ways of going forward:

On the host, run ps aux | grep dnsmasq (when the problem happens). Maybe this will show something obvious.

Or turn logging on. This can be done by editing the network (lxc network edit lxdbr0) and adding a raw.dnsmasq key to define an additional config file, such as:

config:
(…)
raw.dnsmasq: conf-file=/media/root/rawdnsmasq-lxd

You have to create the file and add dnsmasq directives like this:

log-queries
log-async
log-facility=/var/log/dnsmasq-lxd.log

I am setting this file under /media/root because you can't use anything under /etc with the snap LXD; it's the first place I have found that escapes the evil claws of snap.
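
Put together, the whole setup looks roughly like this (a sketch, using the same paths as above):

# create the extra dnsmasq config file
sudo mkdir -p /media/root
sudo tee /media/root/rawdnsmasq-lxd > /dev/null <<'EOF'
log-queries
log-async
log-facility=/var/log/dnsmasq-lxd.log
EOF

# equivalent to adding the raw.dnsmasq key via "lxc network edit lxdbr0"
lxc network set lxdbr0 raw.dnsmasq conf-file=/media/root/rawdnsmasq-lxd

# next time the problem shows up, watch the queries
sudo tail -f /var/log/dnsmasq-lxd.log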