Timeout semantics of Unbound differ radically from Bind 9

Hello,

I operate a Tor Exit relay and was initially using Unbound as the
caching DNS resolver. A few days ago the relay failed due to an
interaction between the Tor relay daemon and the request timeout
behavior of Unbound. The only solution was to switch to using Bind 9
as the DNS resolver.

While I appreciate the elegance and persistence of Unbound's timeout
scheme, it breaks Tor and probably breaks other high-volume DNS
requesters that expect the simple ten-second timeout behavior of
'named'.

I suggest a configurable compatibility feature be added to Unbound to
emulate Bind timeout behavior while preserving the Unbound timeout
regime. Unbound would reply to DNS queries with an appropriate
SERVFAIL message after ten seconds while continuing with the usual
persistent effort to resolve the record and then cache the result if
successful.

An open Tor ticket providing details of the aforementioned failure is found at

Tor #18580: exit relay fails with 'unbound' DNS resolver when lots of
requests time-out
https://trac.torproject.org/projects/tor/ticket/18580

Sincerely

> While I appreciate the elegance and persistence of Unbound's timeout
> scheme, it breaks Tor and probably breaks other high-volume DNS
> requesters that expect the simple ten-second timeout behavior of
> 'named'.

Under the covers, Tor uses eventdns. Looking at the eventdns source
(https://github.com/torproject/tor/blob/master/src/ext/eventdns.c), it
appears that by default it times out after 5 seconds, and considers the
nameserver to be down if it gets 3 timeouts in a row.

If it's down, it blocks all new requests (not just for that domain) and
tries to use the nameserver again after 10, 60, 300, 900, and 3600 seconds.
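
A rough sketch of that bookkeeping, as I read it. This is not the eventdns code: the class and method names are made up for illustration, and the constants are just the numbers quoted above.

```python
from dataclasses import dataclass

# Values as read from the eventdns source discussed above (assumed defaults).
TIMEOUT_SECONDS = 5                      # per-request timeout
MAX_TIMEOUTS = 3                         # timeouts in a row before "down"
RETRY_DELAYS = [10, 60, 300, 900, 3600]  # seconds between probes once down


@dataclass
class Nameserver:
    """Hypothetical model of the per-nameserver state."""
    consecutive_timeouts: int = 0
    is_down: bool = False
    failed_probes: int = 0

    def on_reply(self):
        # Any reply brings the nameserver back into service.
        self.consecutive_timeouts = 0
        self.is_down = False
        self.failed_probes = 0

    def on_timeout(self):
        """Record one timeout.

        Returns the number of seconds to wait before the next probe,
        or None if the nameserver is still considered up.
        """
        if self.is_down:
            # A probe of a down nameserver failed: back off further.
            self.failed_probes += 1
        else:
            self.consecutive_timeouts += 1
            if self.consecutive_timeouts < MAX_TIMEOUTS:
                return None
            self.is_down = True
        index = min(self.failed_probes, len(RETRY_DELAYS) - 1)
        return RETRY_DELAYS[index]
```

With a single nameserver configured, three calls to on_timeout() in a row are enough to stop all lookups for at least 10 seconds.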

> Unbound would reply to DNS queries with an appropriate SERVFAIL message
> after ten seconds while continuing with the usual persistent effort to
> resolve the record and then cache the result if successful.

Answering with something within 15 seconds does seem important for eventdns.

However, eventdns also only allows 64 requests to be in flight at once. If
all of those are trying to query domains that are timing out, all other
requests will just wait. So it would actually be better for eventdns if
unbound would answer SERVFAIL immediately if unbound believes all of the
nameservers for a domain are broken and it won't be retrying soon.
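
Back-of-the-envelope arithmetic for the slot limit, as a sketch. It assumes every failing query holds its slot for the full time until an answer or timeout arrives; the function names are made up, and the 0.25 second figure is only a stand-in for a prompt SERVFAIL.

```python
def inflight_needed(failing_queries_per_second, seconds_until_answer):
    """Slots occupied in steady state (arrival rate times holding time)."""
    return failing_queries_per_second * seconds_until_answer


def saturating_rate(max_inflight, seconds_until_answer):
    """Failing queries per second that keep every slot occupied."""
    return max_inflight / seconds_until_answer


# 64 slots, each held for the 5 second timeout mentioned above.
print(saturating_rate(64, 5))
# Same 64 slots if the resolver answered SERVFAIL after 0.25 seconds
# (hypothetical figure).
print(saturating_rate(64, 0.25))
```

Under those assumptions, about 13 failing lookups per second fill all 64 slots with a 5 second timeout, while a prompt SERVFAIL raises that threshold to a few hundred per second.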

                                     -- Aaron

> Under the covers, Tor uses eventdns. Looking at the eventdns source
> (https://github.com/torproject/tor/blob/master/src/ext/eventdns.c), it
> appears that by default it times out after 5 seconds, and considers the
> nameserver to be down if it gets 3 timeouts in a row.
>
> If it's down, it blocks all new requests (not just for that domain) and
> tries to use the nameserver again after 10, 60, 300, 900, and 3600 seconds.

Thank you for the rapid and insightful analysis. I was just beginning
to think I should mention the eventdns usage in Tor as it might be
relevant. Clearly it is.

However please note that Tor project has modified the eventdns code so
it may not match exactly the behavior of the generic version.

>> Unbound would reply to DNS queries with an appropriate SERVFAIL message
>> after ten seconds while continuing with the usual persistent effort to
>> resolve the record and then cache the result if successful.
>
> Answering with something within 15 seconds does seem important for eventdns.
>
> However, eventdns also only allows 64 requests to be in flight at once. If
> all of those are trying to query domains that are timing out, all other
> requests will just wait. So it would actually be better for eventdns if
> unbound would answer SERVFAIL immediately if unbound believes all of the
> nameservers for a domain are broken and it won't be retrying soon.

Interesting! This explains why Tor relay DNS completely seizes up
when GoDaddy null-routes a relay running Unbound.

Now I have to look into whether that 64 in-flight limit might be a
performance constraint for fast exit relays. Might want a tunable to
increase the limit.

If the Unbound team decides to create an eventdns / Tor daemon
compatibility feature please let me know via this thread.

Regards

On Sun, 10 Apr 2016, Dhalgren Tor wrote:

> However please note that Tor project has modified the eventdns code so
> it may not match exactly the behavior of the generic version.

The generic version seems to have 3 retries at 3 seconds each, before
considering the nameserver dead. So Tor's version is more generous with the
timing.

> Interesting! This explains why Tor relay DNS completely seizes up
> when GoDaddy null-routes a relay running Unbound.

It's worse than that, I think. My read of this code suggests that if
unbound fails to answer for any single query 3 times in a row, eventdns
marks that copy of unbound as dead for at least 10 seconds and starts
exponentially backing off use of it, up to an hour.

This is a desirable characteristic if one of your 3 nameservers is broken;
you'll stop sending it requests and your users won't keep waiting on
responses that will never come. Or if you are querying some recursive
nameserver who doesn't want traffic from you and blackholes you, you'll stop
throwing them a large volume of unwanted traffic.

According to https://www.unbound.net/documentation/info_timeout.html,
unbound should already be returning SERVFAIL immediately if it believes all
servers are dead. And SERVFAIL should also be returned after all servers are
queried (and timed out) 5 times. I suspect that can take more than 15
seconds, though, and I don't see a way to put an upper bound on that.

> Now I have to look into whether that 64 in-flight limit might be a
> performance constraint for fast exit relays. Might want a tunable to
> increase the limit.

It looks like eventdns will respect an /etc/resolv.conf entry for
"max-inflight: 1000" or similar. If you are limited by inflight requests,
this could be an easy workaround.

                                     -- Aaron

Thank you for your help Aaron.

Please correct me if I'm wrong, but my understanding is now

1) 'named' responds with a timeout SERVFAIL after 10 seconds while
Unbound does not. This difference is the trigger of the problem that
occurred with the relay. I verified the 'named' behavior with

time dig +retry=0 +time=20 @127.0.0.1 <unresolvable_domain>
# SERVFAIL after 10 seconds

but have not gone back to Unbound with this command.

2) The DNS lockup experienced on the Tor relay was in effect a DOS
where a Tor ab/user was requesting large numbers of unresolvable (due
to a null-route) GoDaddy domains. This request flood was exhausting
the 64 in-flight eventdns slots and also triggering eventdns to mark
the single resolv.conf entry pointing to Unbound "down".

3) The reason that it works with 'named' under the DOS described in
(2) is the 'named' 10-second SERVFAIL timeout response not present
when Unbound is the resolver. This is enough to tip the balance away
from a DOS state.

4) Proper way to prevent the DOS in the Unbound resolver scenario is
to tune eventdns with a resolv.conf line similar to

   options timeout:5 attempts:1 max-inflight:4096 max-timeouts:100

Where timeout:5 is the usual value appropriate for a Tor daemon (Tor
clients shift to another relay and retry on DNS failures). Attempts:1
assumes that the resolver is a local Unbound instance where Unbound
will handle all timeout retry processing and no UDP loss is possible
between the 'tor' process and the local Unbound, so it's best to give
up directly after five seconds. Max-inflight:4096 both mitigates the
DOS scenario experienced and maximizes DNS performance of the exit
relay. Max-timeouts:100 should prevent eventdns from marking the
dedicated local resolver as "down" unless it really is down. Perhaps
max-timeouts:1000000 is better in order to completely inhibit the
timed-out "down resolver" logic.

5) Seems to me that 'named' does not retry standard resolve requests
at all, assuming that the client application (usually glibc
libresolv.so) will handle this function. Therefore a reasonable
resolv.conf line for running a 'tor' daemon with 'named' would be

   options timeout:5 attempts:2 max-inflight:4096 max-timeouts:100
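
To compare the two option lines in (4) and (5), here is a small sketch that parses them and works out how long one unanswered query would tie up a slot. The parser and function names are hypothetical, and it assumes a query is held for timeout times attempts seconds, which I have not confirmed against the Tor copy of eventdns.

```python
def parse_resolv_options(line):
    """Parse a resolv.conf 'options' line into a dict of name -> value."""
    words = line.split()
    if not words or words[0] != "options":
        raise ValueError("not an options line: %r" % line)
    options = {}
    for word in words[1:]:
        name, _, value = word.partition(":")
        # Options without a value (e.g. 'rotate') are stored as True.
        options[name] = int(value) if value else True
    return options


def worst_case_hold_seconds(options):
    """Assumed slot holding time for a query that never gets an answer."""
    return options["timeout"] * options["attempts"]


UNBOUND_LINE = "options timeout:5 attempts:1 max-inflight:4096 max-timeouts:100"
NAMED_LINE = "options timeout:5 attempts:2 max-inflight:4096 max-timeouts:100"

for line in (UNBOUND_LINE, NAMED_LINE):
    opts = parse_resolv_options(line)
    print(opts["attempts"], worst_case_hold_seconds(opts))
```

If that assumption holds, the 'named' line keeps a dead query in its slot twice as long as the Unbound line, so max-inflight matters more there.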

If you see anything mistaken in any of the above please let me know.

Thanks

I can't comment on the unbound/Tor specifics, but note that it's totally possible to get bind into a state where it won't respond at all in the cited time windows, if upstream connectivity is broken. I've seen this on a number of occasions.

Am I correct in understanding that the Tor/eventdns code is, effectively, self-DoS'ing by marking all of its nameservers dead for increasingly long backoff times?

Regards,
Phil