Problems resolving www.iana.org / ianawww.vip.icann.org

Hi,

Is it just me, or is Unbound 1.4.7 not able to resolve www.iana.org /
ianawww.vip.icann.org right now?

It seems to be stuck on vip.icann.org, although it did eventually resolve
after a really long time:

[1308386009] libunbound[6072:0] info: iterator operate: query
<vip.icann.org. DNSKEY IN>
[1308386009] libunbound[6072:0] info: processQueryTargets:
<vip.icann.org. DNSKEY IN>
[1308386009] libunbound[6072:0] info: sending query: <vip.icann.org.
DNSKEY IN>
[1308386009] libunbound[6072:0] debug: sending to target:
<vip.icann.org.> 2620:0:2830:296::252#53
[1308386009] libunbound[6072:0] debug: outnetudp udp too short
[1308386010] libunbound[6072:0] debug: iterator[module 1] operate:
extstate:module_wait_reply event:module_event_noreply

http://dnsviz.net/d/www.iana.org/dnssec/
http://dnsviz.net/d/ianawww.vip.icann.org/dnssec/
http://dnssec-debugger.verisignlabs.com/www.iana.org
http://dnssec-debugger.verisignlabs.com/ianawww.vip.icann.org

These all seem to be fine, although DNSViz did take a long time.

Maybe some of the anycast servers serving the Netherlands (where I am)
are down?

I've also saved a pcap file and the whole output from unbound-host, just
in case anyone wants to have a look.

Have a nice day,
    Leen.

Hi,

Leen Besselink wrote:

> Is it just me, or is Unbound 1.4.7 not able to resolve www.iana.org /
> ianawww.vip.icann.org right now?

Unbound with DNSSEC validation is not able to resolve www.iana.org.
BIND9 manages to do it, but it takes a long time because of many timeouts.

It seems that all of the NSes for vip.icann.org return a broken response
to the DNSKEY query over UDP. BIND9 retries the query over TCP and gets
the complete DNSKEY RRset, but Unbound does not.

Even though the vip.icann.org NSes are broken, is Unbound's behavior
correct?

Hi Leen, Daisuke,

Leen Besselink wrote:

> Is it just me, or is Unbound 1.4.7 not able to resolve www.iana.org /
> ianawww.vip.icann.org right now?

The responses for this query, the DNSKEY and the A responses, are over 3
KB. You likely have path MTU trouble. Something is wrong with your
fragments. Perhaps your own firewall is set to stop UDP fragments?

There is the OARC reply size tester to help with that.
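
For example, a TXT query through your resolver against DNS-OARC's reply
size test service reports how large a reply can actually reach you:

dig +short rs.dns-oarc.net TXT

If the reported reply size limit is well below the ~3 KB these responses
need, fragments are being lost somewhere on the path.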

The error you see in your logs (I saw your attachments earlier, Leen) is
that the system returns very short (0 byte?) UDP datagrams to Unbound.
This is likely because of the UDP fragmentation issues, or, less likely,
because of a server error at the icann.org nameservers.

Unbound with DNSSEC validation not able to resolve www.iana.org.
BIND9 manages to do it but takes long time because of many timeouts.

All that time is spent because of the PMTU trouble. To the server it
seems as if the packet has disappeared, and after a while BIND and
Unbound attempt to use smaller packets. Where BIND does EDNS@512 (and
thus TCP, and it works), Unbound does not implement EDNS@512 (it is
against the standard and people oppose it) and thus turns off EDNS
altogether, gets the response but without the DNSSEC information, and
thus returns SERVFAIL because validation fails.
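
You can see the difference with dig, something like:

dig @gtm1.lax.icann.org vip.icann.org DNSKEY +dnssec +bufsize=512

(the reply comes back truncated, TC=1, and dig, like BIND, retries over
TCP and succeeds)

dig @gtm1.lax.icann.org vip.icann.org DNSKEY +noedns

(no EDNS means no DO bit, so even when an answer arrives it carries no
RRSIGs and cannot validate)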

> It seems that all of the NSes for vip.icann.org return a broken
> response to the DNSKEY query over UDP. BIND9 retries the query over
> TCP and gets the complete DNSKEY RRset, but Unbound does not.

Yes, because of the different probe.

> Even though the vip.icann.org NSes are broken, is Unbound's behavior
> correct?
>
> ------------------
> dig @gtm1.lax.icann.org vip.icann.org DNSKEY +dnssec
>
>   <snip>
> ;; connection timed out; no servers could be reached
>
> dig @gtm1.lax.icann.org vip.icann.org DNSKEY +tcp +dnssec
>
> <very large DNSKEY RRSet and RRSIG>
> ------------------

It is not really possible for Unbound to probe for the PMTU trouble
everywhere; it is not DNS-OARC. If you really have to, you can configure
a workaround: set edns-buffer-size in unbound.conf to 1280 or so, and
then you have less PMTU trouble. It is better for the internet if you
fix the PMTU trouble (on your firewall, or with your provider).
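
In unbound.conf that looks like:

server:
     edns-buffer-size: 1280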

Best regards,
   Wouter

Hi Wouter,

It is actually my normal home ISP connection plus an HE tunnel, and
Unbound is on the firewall itself; it still took a lot of tries/time.

I have to say I was trying to 'debug' the problem with unbound-host from
behind the firewall, which was probably not that smart. I've learned my
lesson. :-)

The ISP connection can usually get a 'normal' 1500 MTU; the HE tunnel
obviously might not be able to get that.

So over IPv4, the Unbound on the firewall should normally not have any
PMTU issues on the first (few) hops.

I'll at least know what to look for when it happens again; maybe disable
or take a look at some firewall rules to see if that is the cause.

Thanks for taking a look,
    Leen.

Hi Leen,

* W. C. A. Wijngaards:

> Commonly, people block ICMP, and over IPv6 this blocks fragments
> because the ICMP Path MTU Discovery messages need to traverse the
> firewall. Some firewalls do not support UDP connection tracking with
> fragmentation on IPv6 (such as pf). These are random IPv6 hints ... :-)

For IPv6, the DNS server must fragment to about 1200 bytes per packet,
or cap EDNS0 buffer sizes at about 1150 bytes. I'm not sure how many
servers get this right. I'm not even sure if there's a suitable kernel
interface to achieve that.

The equivalent problem in IPv4 land has been solved, although there are
some DNS hosts who still do not get it right. But IPv4 is much, much
easier because most systems can just send DF=0 packets.

Hi Wouter, thanks for the reply.

> The responses for this query, the DNSKEY and the A responses, are over
> 3 KB. You likely have path MTU trouble. Something is wrong with your
> fragments. Perhaps your own firewall is set to stop UDP fragments?

You are right -- my firewall (modem) handles fragments incorrectly.

It seems that my firewall drops all fragments until the first fragment
(offset=0) arrives. Most of the time the first fragment from
vip.icann.org does not arrive first at my network. I don't know why,
but the packets always seem to be reordered...
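
If you want to watch this on the wire, a capture filter along these
lines shows only the IPv4 fragments and the order they arrive in (eth0
is just an example interface):

tcpdump -ni eth0 'ip[6:2] & 0x3fff != 0'

It matches packets with the MF flag set or a nonzero fragment offset,
i.e. every fragment including the first.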

> There is the OARC reply size tester to help with that.

I wasn't able to find the problem with it, because it sends me the first
fragment first.

Older versions of the Linux kernel used to deliberately send fragments in reverse order. There are some (not very compelling) arguments that this is optimal, but it was uncommon, so it was changed in kernel 2.4, IIRC.

Regardless, the firewall is of course broken.

Hi,

Sorry, I didn't report how to fix it (at least my problem).
Wouter's workaround in the Unbound configuration fixed the problem. Thank you.

server:
     edns-buffer-size: 1280

This avoids UDP fragmentation and makes Unbound fall back to TCP for
large responses. Yes, it would be better if I fixed my (broken)
firewall, but I can't, because it's the firewall built into my modem.
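
To check it, assuming Unbound listens on 127.0.0.1:

dig @127.0.0.1 www.iana.org +dnssec

now returns the answer with the AD flag set instead of SERVFAIL.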

Best Regards,

Should edns-buffer-size: be split in two options, one for IPv6 and one
for IPv4? With the IPv6 one using a default of 1150?

Paul

> Commonly, people block ICMP, and over IPv6 this blocks fragments
> because the ICMP Path MTU Discovery messages need to traverse the
> firewall. Some firewalls do not support UDP connection tracking with
> fragmentation on IPv6 (such as pf). These are random IPv6 hints ... :-)
>
> Best regards,
>    Wouter

As you mentioned pf specifically, I just re-checked some things: it
works with a newer version of pf.

The older version on my home system seems to drop these packets, and it
works with pf turned off.

So now I know where the real problem is. Thank you.

* Paul Wouters:

>> For IPv6, the DNS server must fragment to about 1200 bytes per packet,
>> or cap EDNS0 buffer sizes at about 1150 bytes. I'm not sure how many
>> servers get this right. I'm not even sure if there's a suitable kernel
>> interface to achieve that.
>
> Should edns-buffer-size: be split in two options, one for IPv6 and one
> for IPv4?

I don't think this is needed. In any case, it's more important to avoid
fragmentation over IPv4. 8-/

> With the IPv6 one using a default of 1150?

I pulled those numbers out of thin air. I checked more carefully, and
1280 bytes for the entire IPv6 packet (including all IPv6 headers) is
allowed. EDNS0 buffer sizes which are guaranteed to avoid fragmentation
are a bit smaller: 40 bytes for the IPv6 header, and 8 bytes for the UDP
header, plus a variable amount of IPv6 extension headers (which should
not happen in practice). RFC 3226 requires an advertised buffer size of
at least 1220 bytes, which seems to result in packets smaller than the
minimum IPv6 MTU, so that's probably the number that should be the
default.
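
Spelled out:

    1280 bytes   minimum IPv6 MTU
  -   40 bytes   IPv6 header
  -    8 bytes   UDP header
  ----------
    1232 bytes   largest EDNS0 payload guaranteed not to fragment

so the RFC 3226 minimum of 1220 stays safely below that limit.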

But maybe we can get authoritative servers to fragment IPv6 responses to
1280 bytes. Then no resolver changes would be needed.