Faa.gov is not resolvable using DNSSEC resolver

Hi,

www.faa.gov resolves to 2.20.116.95 through a non-DNSSEC resolver. However, I failed to resolve this domain with a DNSSEC-validating Unbound-1.4.10 resolver.
The attached trace is the log of “dig localhost faa.gov” at debug level 5 (verbosity).

Do you have any idea why this domain is not resolvable using DNSSEC?

Thanks in advance.

Jamal Bouzeryouh

System Engineer OPS-Data
T-Mobile Netherlands BV

(attachments)

DNS_trace.txt (50.7 KB)

Hi Jamal,

Your trace shows that unbound thinks the connection drops MTU 1500+
packets. Faa.gov uses large keys and has a lot of answers above 1480 -
i.e. DNSKEY, NXDOMAIN answers. Thus your trouble likely stems from
fragmentation issues. Your server cannot receive UDP DNS responses that
are larger than 1480 or so.
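
As a rough illustration of the size limit above: on a path with a
1500-byte MTU, a 20-byte IPv4 header and an 8-byte UDP header leave 1472
bytes for the DNS message, and anything larger has to be fragmented. The
Python sketch below only does that arithmetic. The function names are
made up for illustration, and the header sizes assume no IP options and
no IPv6 extension headers.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
IPV6_HEADER = 40  # bytes, assuming no extension headers
UDP_HEADER = 8    # bytes


def max_unfragmented_dns_payload(mtu: int = 1500, ipv6: bool = False) -> int:
    """Largest DNS message that fits in one UDP datagram without fragmentation."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - UDP_HEADER


def needs_fragmentation(dns_message_size: int, mtu: int = 1500,
                        ipv6: bool = False) -> bool:
    """True if a DNS message of this size cannot travel in a single packet."""
    return dns_message_size > max_unfragmented_dns_payload(mtu, ipv6)


if __name__ == "__main__":
    print(max_unfragmented_dns_payload())   # 1472
    print(needs_fragmentation(1280))        # False
    print(needs_fragmentation(1600))        # True
```

A response that stays under the limit, such as one capped at 1280 bytes,
never depends on fragments getting through.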

A simple dig @..faaserver faa.gov DNSKEY +dnssec run from the server
will likely show the timeout it produces.

The best solution is to fix the path that is dropping UDP fragments.
Fix your firewall, upgrade it, change Cisco router rules on old
equipment. It must be close to your end, because I can get the
fragments just fine. This is the best fix, because it allows your
server to run better with large responses, and generally cleans up your
network.

The workaround is edns-buffer-size: 1280 in unbound.conf.

A code fix is in the svn trunk development version of unbound. That
version should fall back to a smaller EDNS size automatically for you.

And there are useful MTU size test sites out there too.

Best regards,
   Wouter

I'm seeing the same issues with faa.gov. I had similar issues with .gov addresses a few months ago; the problem was an ACL rule dropping fragmented packets. I removed that rule and things started working again. I do not see any other MTU or fragment issues on our network, yet we cannot resolve faa.gov.

I do not see any other MTU or fragment issues on our network, yet we
cannot resolve faa.gov.

My unbound resolver (svn rev. 2502) servfails faa.gov, too, and so does
DNS-OARC's:

dig +dnssec faa.gov dnskey @149.20.64.21
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 45179

I think this might be a case of Unbound still being too strict on the
algorithm selection. OTOH, it really looks like a downgrade attack:

The DS records chain faa.gov to KSKs 28521 (NSEC3RSASHA1) and 4837
(RSASHA256). The DNSKEY RRSet is signed only by the "weaker" KSK 28521
(and ZSK 26230), not KSK 4837.

So, Unbound doesn't accept the DNSKEY RRSet:

info: Did not match a DS to a DNSKEY, thus bogus.
info: Could not establish a chain of trust to keys for faa.gov. DNSKEY IN
info: validation failure <faa.gov. DNSKEY IN>: signature missing from 162.58.35.104 for key faa.gov. while building chain of trust

The KSK signature also looks a bit odd. You'll see it if you query the
servers with different case. The KSK RRSIG is returned in all-lowercase:

dig +dnssec +norec FaA.GOV dnskey @204.108.10.2
[...]

FaA.GOV. DNSKEY 256 3 7 ; ZSK; alg = NSEC3RSASHA1; key id = 26230
FaA.GOV. DNSKEY 257 3 8 ; KSK; alg = RSASHA256; key id = 4837
FaA.GOV. DNSKEY 257 3 7 ; KSK; alg = NSEC3RSASHA1; key id = 28521
FaA.GOV. RRSIG DNSKEY 7 2 600 20120105145312 20111007145312 26230
faa.gov. RRSIG DNSKEY 7 2 600 20120105145312 20111007145312 28521

Detailed unbound-host log here:
https://www.hauke-lampe.de/temp/unbound-faa-debuglog.txt

BIND however resolves the query and sets "AD" in the answer.

Hauke.

Here unbound logs:

Oct 10 23:20:31 [unbound] [1461:0] info: reply from <faa.gov.> 155.178.206.21#53
Oct 10 23:20:31 [unbound] [1461:0] info: query response was ANSWER
Oct 10 23:20:31 [unbound] [1461:0] info: Did not match a DS to a DNSKEY, thus bogus.
Oct 10 23:20:31 [unbound] [1461:0] info: Could not establish a chain of trust to keys for faa.gov. DNSKEY IN
Oct 10 23:20:31 [unbound] [1461:0] info: validation failure www.faa.gov. A IN

-JimC

Hello,

I would like to ask how to handle such problems on a production resolver.
If a domain is unresolvable, common reasons are
- the remote site does not handle capitalisation correctly
- DNSSEC is broken
- a bug in unbound

The first can only be fixed by the remote site. If they don't, the domain
stays unresolvable. Usually my users complain "at home it works!"
Of course: at home they do not use unbound ...

The second case could be an MTU problem at the local site or misconfigured
DNSSEC at the remote site.

A bug must be found and fixed. After that, a new version must be tested at
the local site and production systems must be updated.

That may take days or weeks. Meanwhile the end user cannot access the domain.

I suggest a lookup table inside unbound to disable the functions that make
a domain unresolvable. The lookup key could be a domain or a server. The lookup
result could be a list of disabled functions:
- do not use capitalisation
- do not use edns
- do not use tcp
- treat domain as unsigned

The last one is implemented with the "domain-insecure" statement.
But for all other problems I have no solution today.
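
To make the proposal concrete, here is a minimal Python sketch of such a
lookup table keyed by domain name. It is purely hypothetical: unbound has
no such table, and the feature names, the table entries and the function
names are invented for illustration. The lookup returns the entry of the
closest enclosing domain.

```python
from typing import Dict, FrozenSet

# Hypothetical feature names, one per item in the proposed list.
NO_CAPS = "no-caps"
NO_EDNS = "no-edns"
NO_TCP = "no-tcp"
TREAT_UNSIGNED = "treat-unsigned"

# Hypothetical table; the entries are illustrative, not recommendations.
QUIRKS: Dict[str, FrozenSet[str]] = {
    "faa.gov.": frozenset({TREAT_UNSIGNED}),
    "example.com.": frozenset({NO_CAPS, NO_EDNS}),
}


def normalize(name: str) -> str:
    """Lowercase the name and make sure it ends with the root dot."""
    name = name.lower()
    return name if name.endswith(".") else name + "."


def disabled_features(qname: str) -> FrozenSet[str]:
    """Return the disabled features of the closest enclosing table entry."""
    labels = normalize(qname).rstrip(".").split(".")
    # Try the full name first, then strip one leading label at a time.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:]) + "."
        if candidate in QUIRKS:
            return QUIRKS[candidate]
    return frozenset()
```

With this table, disabled_features("www.faa.gov") returns the entry for
faa.gov., and a name with no matching entry returns an empty set. A table
keyed by server would have the same shape with addresses as keys.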

Hi,

I do not see any other MTU or fragment issues on our network, yet we
cannot resolve faa.gov.

Indeed, the MTU is not the only problem. The zone is in an algorithm rollover
from 7 to 8 (the algorithm 8 KSK is prepublished), but the rollover is botched.

I think this might be a case of Unbound still being too strict on the
algorithm selection. OTOH, it really looks like a downgrade attack:

This is correct. An algorithm rollover has failed (presumably).
Hosts that allow SHA256 but deny SHA1 fail to validate the zone.
The exact downgrade from SHA256 to SHA1 happens here.

The KSK signature also looks a bit odd. You'll see it if you query the
servers with different case. The KSK RRSIG is returned in all-lowercase:

The case issue is not a problem (not even for unbound's 0x20 - because
the first one is fine). I guess they have an offline signer or
something like that (excellent!).

It is one in a string of failed algorithm rollovers in .gov.

BIND however resolves the query and sets "AD" in the answer.

It accepts the algorithm downgrade from the RSASHA256 advertised in the
DS record to RSASHA1 in the zone.

Best regards,
   Wouter
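
The difference between the two validators described above can be modeled
in a few lines of Python. This is a simplified sketch of the behaviour as
reported in this thread, not code from unbound or BIND. The key tags and
algorithm numbers are the ones quoted for faa.gov; the function names are
invented, and every listed signature is assumed to verify
cryptographically.

```python
from typing import List, Set, Tuple

KeyRef = Tuple[int, int]  # (key tag, algorithm number)

# Data as reported in this thread for faa.gov.
DS_SET: List[KeyRef] = [(28521, 7), (4837, 8)]          # DS records at the parent
DNSKEY_RRSIGS: List[KeyRef] = [(26230, 7), (28521, 7)]  # signers of the DNSKEY RRset


def validated_algorithms(ds_set: List[KeyRef], rrsigs: List[KeyRef]) -> Set[int]:
    """Algorithms for which a DS-referenced key signed the DNSKEY RRset."""
    ds = set(ds_set)
    return {alg for (tag, alg) in rrsigs if (tag, alg) in ds}


def strict_ok(ds_set: List[KeyRef], rrsigs: List[KeyRef]) -> bool:
    """Strict model: every algorithm announced in the DS set must be validated."""
    needed = {alg for (_tag, alg) in ds_set}
    return needed <= validated_algorithms(ds_set, rrsigs)


def lenient_ok(ds_set: List[KeyRef], rrsigs: List[KeyRef]) -> bool:
    """Lenient model: one chain through any DS-referenced key is enough."""
    return bool(validated_algorithms(ds_set, rrsigs))
```

With the data above, strict_ok() is False because nothing covers
algorithm 8, while lenient_ok() is True because KSK 28521 covers
algorithm 7. Adding a signature by KSK 4837 would satisfy both.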

Jamal,

Another option is to temporarily put it in the cache manually,
so you do not have to change unbound's config file and restart:

unbound-control local_data www.faa.gov. A 88.221.124.95

(I tried using unbound-control local_data www.faa.gov. CNAME www.faa.gov.edgekey.net.
  but that seems to not work?)

Paul

I would like to ask how to handle such problems on a production resolver.
If a domain is unresolvable, common reasons are
- the remote site does not handle capitalisation correctly
- DNSSEC is broken
- a bug in unbound

The first can only be fixed by the remote site. If they don't, the domain
stays unresolvable. Usually my users complain "at home it works!"
Of course: at home they do not use unbound ...

You can set use-caps-for-id: no
In fact, for Fedora and RHEL/EPEL I had to do this since GoDaddy broke the
caps draft a few months ago.
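
For readers unfamiliar with what use-caps-for-id does: the resolver
randomizes the case of the letters in the query name and expects the
reply to echo exactly that case in its question section. The Python
sketch below is a simplified model of that idea, not unbound's
implementation; the function names are invented for illustration.

```python
import random


def randomize_case(qname: str, rng: random.Random) -> str:
    """Set each letter to upper or lower case at random (the 0x20 bit)."""
    return "".join(
        (c.upper() if rng.random() < 0.5 else c.lower()) if c.isalpha() else c
        for c in qname
    )


def reply_acceptable(sent_qname: str, echoed_qname: str) -> bool:
    """Accept only if the question section echoes the exact case that was sent."""
    return sent_qname == echoed_qname


if __name__ == "__main__":
    rng = random.Random()
    sent = randomize_case("www.faa.gov.", rng)
    print(sent, reply_acceptable(sent, sent))
```

A server that lowercases the name in its reply fails this comparison
whenever the query contained an uppercase letter, which is why such
domains only resolve once the check is turned off.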

A bug must be found and fixed. After that, a new version must be tested at
the local site and production systems must be updated.

Testing of a new unbound version or configuration can easily be staged,
though. If you want to keep configurations as similar as you can,
then you can fix individual domain issues using explicit unbound-control
commands to override or feed/clean the cache.

I suggest a lookup table inside unbound to disable the functions that make
a domain unresolvable. The lookup key could be a domain or a server. The lookup
result could be a list of disabled functions:
- do not use capitalisation
- do not use edns
- do not use tcp
- treat domain as unsigned

The last one is implemented with the "domain-insecure" statement.
But for all other problems I have no solution today.

The problem with the first three is that they are not "domain specific"
but "nameserver specific", and could involve the parent name server too.

With unbound 1.4.13, pretty much all EDNS issues are fixed, unless you
yourself are on a fragment dropping network, and even then you can
resolve this using edns-buffer-size: 1480

Paul