Hmm, nothing changed that would explain this; it sounds like you’ve got a routing or firewalling issue going on here.
Can you show the output of iptables -L -n -v and ip6tables -L -n -v?
iptables -L -n -v
Chain INPUT (policy ACCEPT 454K packets, 601M bytes)
pkts bytes target prot opt in out source destination
0 0 ACCEPT tcp -- lxdbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53 /* generated for LXD network lxdbr0 */
89 5792 ACCEPT udp -- lxdbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53 /* generated for LXD network lxdbr0 */
23 7544 ACCEPT udp -- lxdbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67 /* generated for LXD network lxdbr0 */
0 0 ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53
0 0 ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53
0 0 ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67
0 0 ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
8 672 DOCKER-USER all -- * * 0.0.0.0/0 0.0.0.0/0
4 336 ACCEPT all -- * lxdbr0 0.0.0.0/0 0.0.0.0/0 /* generated for LXD network lxdbr0 */
4 336 ACCEPT all -- lxdbr0 * 0.0.0.0/0 0.0.0.0/0 /* generated for LXD network lxdbr0 */
0 0 DOCKER-ISOLATION-STAGE-1 all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- * docker0 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
0 0 DOCKER all -- * docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- * docker_gwbridge 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
0 0 DOCKER all -- * docker_gwbridge 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker_gwbridge !docker_gwbridge 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- * virbr0 0.0.0.0/0 192.168.122.0/24 ctstate RELATED,ESTABLISHED
0 0 ACCEPT all -- virbr0 * 192.168.122.0/24 0.0.0.0/0
0 0 ACCEPT all -- virbr0 virbr0 0.0.0.0/0 0.0.0.0/0
0 0 REJECT all -- * virbr0 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
0 0 REJECT all -- virbr0 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
0 0 DROP all -- docker_gwbridge docker_gwbridge 0.0.0.0/0 0.0.0.0/0
Chain OUTPUT (policy ACCEPT 315K packets, 153M bytes)
pkts bytes target prot opt in out source destination
0 0 ACCEPT tcp -- * lxdbr0 0.0.0.0/0 0.0.0.0/0 tcp spt:53 /* generated for LXD network lxdbr0 */
89 6076 ACCEPT udp -- * lxdbr0 0.0.0.0/0 0.0.0.0/0 udp spt:53 /* generated for LXD network lxdbr0 */
21 7056 ACCEPT udp -- * lxdbr0 0.0.0.0/0 0.0.0.0/0 udp spt:67 /* generated for LXD network lxdbr0 */
0 0 ACCEPT udp -- * virbr0 0.0.0.0/0 0.0.0.0/0 udp dpt:68
Chain DOCKER (2 references)
pkts bytes target prot opt in out source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
pkts bytes target prot opt in out source destination
0 0 DOCKER-ISOLATION-STAGE-2 all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
0 0 DOCKER-ISOLATION-STAGE-2 all -- docker_gwbridge !docker_gwbridge 0.0.0.0/0 0.0.0.0/0
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-ISOLATION-STAGE-2 (2 references)
pkts bytes target prot opt in out source destination
0 0 DROP all -- * docker0 0.0.0.0/0 0.0.0.0/0
0 0 DROP all -- * docker_gwbridge 0.0.0.0/0 0.0.0.0/0
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-USER (1 references)
pkts bytes target prot opt in out source destination
8 672 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
This looks a bit suspicious:
0 0 REJECT all -- * virbr0 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
0 0 REJECT all -- virbr0 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
ip6tables -L -n -v
(which I am not knowingly using)
Chain INPUT (policy ACCEPT 72398 packets, 73M bytes)
pkts bytes target prot opt in out source destination
0 0 ACCEPT tcp lxdbr0 * ::/0 ::/0 tcp dpt:53 /* generated for LXD network lxdbr0 */
0 0 ACCEPT udp lxdbr0 * ::/0 ::/0 udp dpt:53 /* generated for LXD network lxdbr0 */
0 0 ACCEPT udp lxdbr0 * ::/0 ::/0 udp dpt:547 /* generated for LXD network lxdbr0 */
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 ACCEPT all * lxdbr0 ::/0 ::/0 /* generated for LXD network lxdbr0 */
0 0 ACCEPT all lxdbr0 * ::/0 ::/0 /* generated for LXD network lxdbr0 */
Chain OUTPUT (policy ACCEPT 53366 packets, 7829K bytes)
pkts bytes target prot opt in out source destination
0 0 ACCEPT tcp * lxdbr0 ::/0 ::/0 tcp spt:53 /* generated for LXD network lxdbr0 */
0 0 ACCEPT udp * lxdbr0 ::/0 ::/0 udp spt:53 /* generated for LXD network lxdbr0 */
0 0 ACCEPT udp * lxdbr0 ::/0 ::/0 udp spt:547 /* generated for LXD network lxdbr0 */
So I’m not actually noticing anything particularly wrong above, assuming your containers are all on lxdbr0 and not on virbr0 or docker0.
Can you also show iptables -t nat -L -n -v? Maybe the problem is on the masquerading side.
Can you also show cat /proc/sys/net/ipv4/ip_forward for good measure?
Thanks Stephane,
Chain PREROUTING (policy ACCEPT 30087 packets, 8308K bytes)
pkts bytes target prot opt in out source destination
69 11390 DOCKER all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT 485 packets, 123K bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 12768 packets, 817K bytes)
pkts bytes target prot opt in out source destination
0 0 DOCKER all -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT 12763 packets, 816K bytes)
pkts bytes target prot opt in out source destination
5 623 MASQUERADE all -- * * 10.19.225.0/24 !10.19.225.0/24 /* generated for LXD network lxdbr0 */
0 0 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
1 76 MASQUERADE all -- * !docker_gwbridge 172.18.0.0/16 0.0.0.0/0
2 160 RETURN all -- * * 192.168.122.0/24 224.0.0.0/24
0 0 RETURN all -- * * 192.168.122.0/24 255.255.255.255
0 0 MASQUERADE tcp -- * * 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-65535
0 0 MASQUERADE udp -- * * 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-65535
0 0 MASQUERADE all -- * * 192.168.122.0/24 !192.168.122.0/24
Chain DOCKER (2 references)
pkts bytes target prot opt in out source destination
0 0 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0
0 0 RETURN all -- docker_gwbridge * 0.0.0.0/0 0.0.0.0/0
furthermore …
@ debian ~
└─ $ ▶ sudo cat /proc/sys/net/ipv4/ip_forward
1
None of this really tells me anything. All containers are on the default profile and therefore go via the lxdbr0 interface.
I can ping 8.8.8.8, 8.8.4.4 from the containers, so it smells to me like a DNS problem of some sort.
EDIT: I have also noticed that launching a new container (from an image I do not have locally) fails with some sort of connection problem:
lxc launch ubuntu:18.04 ubn1804
Creating ubn1804
Error: Failed container creation: Get https://cloud-images.ubuntu.com/releases/streams/v1/index.json: lookup cloud-images.ubuntu.com on [::1]:53: read udp [::1]:48690->[::1]:53: read: connection refused
I can ping cloud-images.ubuntu.com from my host though.
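The "ping by IP works, names do not" test described above can be scripted roughly as follows. This is only a sketch: classify and check_net are made-up helper names (not part of LXD), and it assumes ping and getent are installed inside the container.

```shell
#!/bin/sh
# Sketch: tell "no connectivity at all" apart from "DNS is broken"
# inside a container. Helper names are hypothetical.

classify() {
    # $1 = exit status of pinging a raw IP address
    # $2 = exit status of a hostname lookup
    if [ "$1" -ne 0 ]; then
        echo "no-connectivity"
    elif [ "$2" -ne 0 ]; then
        echo "dns-broken"
    else
        echo "ok"
    fi
}

check_net() {
    # $1 = container name
    lxc exec "$1" -- ping -c 1 -W 2 8.8.8.8 >/dev/null 2>&1
    ip_rc=$?
    lxc exec "$1" -- getent hosts cloud-images.ubuntu.com >/dev/null 2>&1
    dns_rc=$?
    classify "$ip_rc" "$dns_rc"
}
```

Running check_net ubn1804 on the host would print dns-broken for the situation described here (IP ping succeeds, name lookup fails).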
I did a sudo snap disable lxd and sudo snap enable lxd (which I had tried before without any effect) and all of a sudden the network problem seems to have disappeared.
thanks @stgraber for the efforts and sorry for the interruption.
Today, after a restart of my host, I see the same problem from within containers.
snap disable lxd
snap enable lxd
resolves this issue once again.
I think this may be related to the issue I reported recently : https://github.com/lxc/lxd-pkg-snap/issues/32
Just wanted to update that this problem is now constant.
When my host (debian9 laptop) starts, I cannot reach the internet from within any container (LXD and the containers are autostarted on system boot).
I reliably get over this with a sudo snap disable lxd && sudo snap enable lxd command. I don’t believe this is how it is supposed to work.
So, I had this problem running LXD 3.9 on Arch Linux with an Ubuntu 14.04 container and reported in another ticket in General (Automatic name resolution within an Ubuntu 14.04 container?). The solution for this particular container was to provide it with an appropriate DNS server IP (in my case, 192.168.1.1):
$ lxc exec my_container bash
- Edit /etc/resolvconf/resolv.conf.d/base:
nameserver 192.168.1.1
# resolvconf -u
What I couldn’t and still don’t understand is why this works automatically for an Ubuntu 16.04 host / Ubuntu 14.04 container, but not when the host machine is Arch Linux.
For this kind of problem, the first thing to do is to check whether dnsmasq is running. If it is not, look for syslog error messages to see why it fails to start; if it is, check that it is bound to the proper interface (lxdbr0 by default).
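A rough sketch of that check on the host, assuming the bridge address is 10.19.225.1 as in the output earlier in this thread. dnsmasq_listens_on is a made-up helper name; it only filters ss output and handles IPv4 addresses.

```shell
#!/bin/sh
# Sketch: read "ss -lntup" output on stdin and succeed if a dnsmasq
# process is listening on port 53 of the given IPv4 address.
# The helper name is hypothetical.

dnsmasq_listens_on() {
    # $1 = IPv4 address of the bridge, e.g. 10.19.225.1
    grep -F '"dnsmasq"' | grep -qF " $1:53 "
}

# Usage on the host:
#   pgrep -a dnsmasq || echo "dnsmasq is not running"
#   sudo ss -lntup | dnsmasq_listens_on 10.19.225.1 && echo "bound to lxdbr0 address"
```

If pgrep prints nothing, something like grep dnsmasq /var/log/syslog may show why it failed to start (that log path is the usual Debian/Ubuntu default, which is an assumption here).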
I had this exact problem using Ubuntu 18.04. I upgraded to 18.10 and everything worked again.
thanks for sharing. In my case this is debian9 so there isn’t an option to upgrade to ubuntu 18.10. Neither do I believe everybody running an LTS ubuntu production server would want to upgrade to 18.10 necessarily.
So I still hope for a solution to come up. Till then I’ll live with my workaround (sudo snap disable lxd && sudo snap enable lxd).
is there any difference in the output of
sudo ss -tapnu | grep LISTEN | grep 53
when it works and when it does not work ?
on an empty ubuntu:18.04 container, no
working
lxc exec ubn1804 -- sudo ss -tapnu | grep LISTEN | grep 53
tcp LISTEN 0 128 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=112,fd=13))
not working
lxc exec ubn1804 -- sudo ss -tapnu | grep LISTEN | grep 53
tcp LISTEN 0 128 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=112,fd=13))
err, it was about launching ss on the host, not in a container, I was not precise enough.
on the debian9 host
(containers report ‘unknown host’ or ‘Temporary failure in name resolution’)
$ ▶ sudo ss -tapnu | grep LISTEN | grep 53
[sudo] password for manolo:
tcp LISTEN 0 5 10.19.225.1:53 *:* users:(("dnsmasq",pid=2709,fd=9))
tcp LISTEN 0 5 192.168.122.1:53 *:* users:(("dnsmasq",pid=2405,fd=6))
tcp LISTEN 0 5 fd42:2122:86a:2b30::1:53 :::* users:(("dnsmasq",pid=2709,fd=13))
tcp LISTEN 0 5 fe80::74e3:bff:fee3:1197%lxdbr0:53 :::* users:(("dnsmasq",pid=2709,fd=11))
(so, ping is working inside containers)
─ $ ▶ sudo ss -tapnu | grep LISTEN | grep 53
tcp LISTEN 0 5 10.19.225.1:53 *:* users:(("dnsmasq",pid=9148,fd=9))
tcp LISTEN 0 5 192.168.122.1:53 *:* users:(("dnsmasq",pid=2405,fd=6))
tcp LISTEN 0 128 :::80 :::* users:(("apache2",pid=7356,fd=4),("apache2",pid=7355,fd=4),("apache2",pid=7354,fd=4),("apache2",pid=7353,fd=4),("apache2",pid=7352,fd=4),("apache2",pid=1625,fd=4))
tcp LISTEN 0 5 fd42:2122:86a:2b30::1:53 :::* users:(("dnsmasq",pid=9148,fd=13))
tcp LISTEN 0 5 fe80::2ca4:aff:fe64:e46%lxdbr0:53 :::* users:(("dnsmasq",pid=9148,fd=11))
well, nothing to see here. All seems normal when you have this problem.
What about systemd-resolve --status in the container when you have the problem?
And while I am at it, ip addr and dig @10.19.225.1 ubuntu.com as well, in the container with the problem.
Global
DNSSEC NTA: 10.in-addr.arpa
16.172.in-addr.arpa
168.192.in-addr.arpa
17.172.in-addr.arpa
18.172.in-addr.arpa
19.172.in-addr.arpa
20.172.in-addr.arpa
21.172.in-addr.arpa
22.172.in-addr.arpa
23.172.in-addr.arpa
24.172.in-addr.arpa
25.172.in-addr.arpa
26.172.in-addr.arpa
27.172.in-addr.arpa
28.172.in-addr.arpa
29.172.in-addr.arpa
30.172.in-addr.arpa
31.172.in-addr.arpa
corp
d.f.ip6.arpa
home
internal
intranet
lan
local
private
test
Link 21 (eth0)
Current Scopes: DNS
LLMNR setting: yes
MulticastDNS setting: no
DNSSEC setting: no
DNSSEC supported: no
DNS Servers: 10.19.225.1
fd42:2122:86a:2b30::1
fe80::a49d:b2ff:fe77:45aa
DNS Domain: lxd
does not tell me much to be honest
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
21: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:16:3e:40:2c:9c brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.19.225.31/24 brd 10.19.225.255 scope global dynamic eth0
valid_lft 2928sec preferred_lft 2928sec
inet6 fd42:2122:86a:2b30:216:3eff:fe40:2c9c/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 3545sec preferred_lft 3545sec
inet6 fe80::216:3eff:fe40:2c9c/64 scope link
valid_lft forever preferred_lft forever
; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.19.225.1 ubuntu.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 61342
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;ubuntu.com. IN A
;; Query time: 0 msec
;; SERVER: 10.19.225.1#53(10.19.225.1)
;; WHEN: Mon Mar 11 07:46:56 UTC 2019
;; MSG SIZE rcvd: 28
What does all this say?
The first result says that the container’s DNS configuration is fine: it points at 10.19.225.1, as it should.
The second could have been interesting if the first one had been negative, but it doesn’t bring much in this case.
The third says that the container’s resolver, the specialized dnsmasq instance running on the host, is indeed replying to the container, but it is telling it to get lost (status: REFUSED). Obviously there is something wrong with dnsmasq.
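To compare the working and non-working states quickly, the status field can be pulled out of the dig output. A small sketch; dig_status is a made-up helper name and it simply parses the HEADER line shown above.

```shell
#!/bin/sh
# Sketch: read dig output on stdin and print the response status
# (NOERROR, REFUSED, SERVFAIL, ...) from the HEADER line.
# The helper name is hypothetical.

dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Usage from the host:
#   lxc exec ubn1804 -- dig @10.19.225.1 ubuntu.com | dig_status
```

In the failing state this prints REFUSED; after the snap disable/enable workaround one would expect NOERROR. One possible cause of REFUSED, not confirmed anywhere in this thread, is dnsmasq having no usable upstream servers.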
Why dnsmasq is being so difficult, I have no idea. It’s not a problem seen often on the internet.
I see 2 ways of going forward:
on the host, ps aux | grep dnsmasq
(when the problem happens)
Maybe this will show something obvious.
Or turn logging on; this can be done by editing the network (lxc network edit lxdbr0) and adding a raw.dnsmasq key that points to an additional config file, such as:
config:
(…)
raw.dnsmasq: conf-file=/media/root/rawdnsmasq-lxd
You have to create the file and add dnsmasq directives like these:
log-queries
log-async
log-facility=/var/log/dnsmasq-lxd.log
I am putting this file under /media/root because you can’t use anything under /etc with the snap version of LXD; it’s the first place I found that escapes the evil claws of snap.
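The two steps above could be scripted roughly like this. It is a sketch only: write_dnsmasq_logging_conf is a made-up helper name, and using lxc network set instead of the interactive lxc network edit is my assumption of an equivalent way to set the key.

```shell
#!/bin/sh
# Sketch: write the extra dnsmasq config file with the three logging
# directives from this post. The helper name is hypothetical.

write_dnsmasq_logging_conf() {
    # $1 = path of the extra config file
    cat > "$1" <<'EOF'
log-queries
log-async
log-facility=/var/log/dnsmasq-lxd.log
EOF
}

# Usage on the host (as root):
#   write_dnsmasq_logging_conf /media/root/rawdnsmasq-lxd
#   lxc network set lxdbr0 raw.dnsmasq "conf-file=/media/root/rawdnsmasq-lxd"
```

Once the key is set, queries from the containers should start showing up in /var/log/dnsmasq-lxd.log, which makes it easier to compare the failing state with the state after the workaround.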