Effect of val-bogus-ttl

I've run the following test with Unbound 1.3.4:

  1. Set up Unbound with the IANA DNSSEC testbed root and appropriate
     trust anchors.

  2. Check that the AD bit is set for some secure RRsets (to verify
     that step 1 has been implemented correctly).

  3. Modify the system such that queries for all name servers of a
     certain domain (let's say "example.com") are answered by a name
     server which has got an A wildcard at the root, does not
     implement DNSSEC, and returns NOERROR for all the DNSSEC RR
     types. (Except for the IPv6 servers, which are not reachable
     anyway despite IPv6 support being enabled in Unbound because the
     kernel does not support IPv6. Disabling IPv6 altogether does not
     seem to make a difference.)

  4. Send a query for www.example.com to the Unbound resolver. As
     expected, it results in a validation failure and a SERVFAIL
     response.

  5. Send another query for www.example.com. It results in an immediate
     SERVFAIL response from Unbound. Also expected.

  5. Wait (longer than the configured val-bogus-ttl value).

  6. Send a query for www.example.com. Again, an immediate SERVFAIL
     response is sent by Unbound (and nothing is logged).

The result of step 6 is not what I would expect. I think there should
be a fresh upstream transaction. If I lift the redirection
established in step 3, queries for non-cached names are still
answered with SERVFAIL responses, after an upstream transaction.
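
To make the expectation concrete, here is a toy model of the behaviour I
would expect from the steps above. It is only a sketch and not Unbound's
actual code: the ToyResolver class, its fields, and the 60-second
val-bogus-ttl value are all made up for illustration, and the simulated
upstream answer always fails validation.

```python
class ToyResolver:
    """Toy model of the expected behaviour; NOT Unbound's implementation."""

    def __init__(self, val_bogus_ttl):
        self.val_bogus_ttl = val_bogus_ttl
        self.bogus_until = {}      # qname -> time at which the bogus entry expires
        self.upstream_queries = 0  # number of simulated upstream transactions

    def query(self, qname, now):
        expires = self.bogus_until.get(qname)
        if expires is not None and now < expires:
            return "SERVFAIL"      # answered immediately from the bogus cache
        self.upstream_queries += 1  # fresh upstream transaction
        # Assumption: the upstream answer always fails validation in this model.
        self.bogus_until[qname] = now + self.val_bogus_ttl
        return "SERVFAIL"


resolver = ToyResolver(val_bogus_ttl=60)    # hypothetical setting
resolver.query("www.example.com", now=0)    # step 4: goes upstream
resolver.query("www.example.com", now=1)    # step 5: served from the cache
resolver.query("www.example.com", now=61)   # step 6: expected to go upstream again
print(resolver.upstream_queries)
```

In this model the third query causes a second upstream transaction, which
is what I expected to see in step 6 and did not.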

Looking at the log for the cache-miss case, it seems that Unbound
still caches the NODATA DNSKEY response for example.com:

[1256554227] unbound[28667:0] info: Missing DNSKEY RRset in response to DNSKEY query.

Could Unbound lower the TTL on the DNSKEY RRset in such cases?
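
As a sketch of what I mean by lowering the TTL: when a cached response is
involved in a validation failure, its lifetime could be capped at
val-bogus-ttl. The function below is hypothetical (cache_ttl and its
parameters are my own names, not anything in Unbound) and only shows the
arithmetic.

```python
def cache_ttl(rrset_ttl, validation_failed, val_bogus_ttl):
    """Return the lifetime (in seconds) to use when caching a response.

    Hypothetical helper: caps the TTL at val_bogus_ttl when the response
    was involved in a validation failure, and keeps it unchanged otherwise.
    """
    if validation_failed:
        return min(rrset_ttl, val_bogus_ttl)
    return rrset_ttl


# A NODATA DNSKEY response with a one-hour TTL, val-bogus-ttl assumed to be 60:
print(cache_ttl(3600, validation_failed=True, val_bogus_ttl=60))
```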

Hi Florian,

Could you retry this with current svn trunk of unbound?
The retry logic in case of dnssec failures should blacklist
the cached missing-dnskey-response and try to go to the
network again at step 6.

Best regards,
    Wouter

* W. C. A. Wijngaards:

> Hi Florian,
>
> Could you retry this with current svn trunk of unbound?
> The retry logic in case of dnssec failures should blacklist
> the cached missing-dnskey-response and try to go to the
> network again at step 6.

Oh, sorry, I assumed you had released the trunk as 1.3.4, that's why I
retested that version! The trunk almost behaves as I would expect.

I know I'm asking the impossible here (protection against spoofing vs
avoiding becoming a DoS amplifier), but would it be possible to make
Unbound somehow back off when it receives a REFUSED response (with the
proper question section)? I ran basically the same test as before,
this time redirecting DNS traffic to a server which just returned
REFUSED (as a properly configured authoritative name server should when
receiving unexpected requests, IMHO). Unbound seems to adhere to
val-bogus-ttl as well (which is great!), but sends a few more upstream
queries than for the unsigned answer case.

I understand that there is a very difficult trade-off in the current
protocol framework, and I have no good suggestions here. Currently,
name servers under in-protocol reflective attacks typically stop
sending REFUSED responses and let resolvers time out instead, in the
hope that the resolvers will run into load issues (because the maximum
number of parallel client transactions is exceeded) and their clients
eventually get cleaned up. I don't like this, but as I've said, I
haven't got an idea how to deal with this.

Perhaps you can treat a REFUSED response as covering the whole subtree
and all (non-magic) QTYPEs if a secure answer is expected? But this
opens a trivial way to deny resolution to mis-served signed zones.
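
A rough sketch of the subtree idea, under my own assumptions: a REFUSED
response for a zone is remembered, and later query names at or below that
zone are considered covered. The function name and the set of refused
zones are hypothetical, and QTYPE handling and expiry are left out.

```python
def covered_by_refusal(qname, refused_zones):
    """Return True if qname is at or below a zone remembered as REFUSED.

    Hypothetical helper: refused_zones is a set of lower-case zone names
    without a trailing dot. Matching is done on whole labels only.
    """
    labels = qname.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in refused_zones:
            return True
    return False


refused = {"example.com"}   # assumed: a REFUSED response was seen for this zone
print(covered_by_refusal("www.example.com", refused))
print(covered_by_refusal("www.example.org", refused))
```

The same lookup also shows the downside: a single REFUSED response for
example.com, spoofed or not, would suppress queries for every name below
it until the entry expires.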