Use of unbound's interface-automatic option

I've got a setup where I think the "experimental" interface-automatic
option might prove useful, or even necessary.

However I have been unable to get it to work.

The first hurdle was finding that the following patch is necessary
(against 1.2.1) (FYI, I have do-ip4 but not do-ip6 set):

--- services/listen_dnsport.c.orig 2008-09-10 11:23:01.000000000 -0400
+++ services/listen_dnsport.c 2009-02-13 18:38:33.000000000 -0500
@@ -602,8 +602,8 @@
   if(!do_ip4 && !do_ip6) {
     return NULL;
   }
- if(do_auto && (!do_ip4 || !do_ip6)) {
- log_warn("interface_automatic option does not work when either do-ip4 or do-ip6 is not enabled. Disabling option.");
+ if (do_auto && !(do_ip4 || do_ip6)) {
+ log_warn("interface-automatic option does not work when either do-ip4 or do-ip6 is not enabled. Disabling option.");
     do_auto = 0;
   }
   /* create ip4 and ip6 ports so that return addresses are nice. */

(If I'm not mistaken this is the second similar logic bug I've found in
the unbound code -- I'd go looking for more, but I'm as apt to make
identical mistakes, so someone better at reading logic expressions than
I and the authors should have a closer look at the rest of the code!)

The second hurdle is that I'm running on NetBSD-4.x and the first thing
in the log is a stream of errors starting with:

Feb 13 18:47:38 once unbound: [18958:0] info: start of service (unbound 1.2.1).
Feb 13 18:47:43 once unbound: [18958:0] notice: sendmsg failed: Invalid argument
Feb 13 18:47:43 once unbound: [18958:0] notice: remote address is 204.92.254.2 port 62639

I'm guessing this is because either the necessary support for this
option isn't really in NetBSD-4, or it hasn't been ported to work with
NetBSD yet.

So, perhaps I should describe my situation a little more and see if this
is something I should pursue trying to fix myself or not.

I've got a setup with a pair of NetBSD hosts on a small public subnet,
one being a firewall with NAT for an RFC 1918 office LAN (192.168.0/24),
and the other being a public Internet-facing server. Both servers are
connected to the private LAN, and both run unbound, nearly identically
configured except for interface addresses, as a service to the private
LAN. There's also a second RFC 1918 private LAN (192.168.255/24)
between the hosts used for backups and administrative purposes.

They both also run nsd, the instance on the public server answering
queries for public DNS zones, and the instance on the firewall
listening only on a second address (192.168.0.254) on the private LAN
interface and answering queries only for the private zones.

The private zones are configured in each unbound instance as stubs like
this:

  stub-zone:
          name: "company.example"
          stub-addr: N.N.N.N # ns.company.example, for example

  stub-zone:
          name: "office.company.example"
          stub-addr: 192.168.0.254

  stub-zone:
          name: "backups.company.example"
          stub-addr: 192.168.0.254

  stub-zone:
          name: "168.192.in-addr.arpa"
          stub-addr: 192.168.0.254

  stub-zone:
          name: "0.168.192.in-addr.arpa"
          stub-addr: 192.168.0.254

  stub-zone:
          name: "255.168.192.in-addr.arpa"
          stub-addr: 192.168.0.254

The firewall's rules do address-spoofing protection, amongst other
things of course, and I'm having trouble getting queries to go from the
server to the nsd instance on the firewall because the firewall's
anti-spoofing rules are blocking unbound's queries due to them having
the wrong source address for the LAN segment they're going out on:

  bge0 @350:49 b [192.168.255.2],28073 -> [192.168.0.254],domain PR udp len 20 81 IN

I.e. here was a query from unbound on the server being sent to nsd on
the firewall (e.g. a recursive look-up for office.company.example, as
per the stub-addr configuration for that zone) using the
wrong source address. Sometimes it'll be the admin LAN address,
sometimes the public address, and (I think) sometimes the correct
address on the office LAN.

I'm not sure I understand the intent of the "interface-automatic"
option, but I do know that unbound isn't using the right source address
for the subnet it's sending to.

Now I think, IIUC, if I could allow it to use the wild-card address for
the outgoing interface then the OS would set the source address correctly.

At the moment I have unbound using the following interfaces on the
server:

  interface: 127.0.0.1
  # XXX we cannot listen here -- nsd listens here!
  ##XXX##interface: N.N.N.N # public IP
  interface: 192.168.0.2
  interface: 192.168.255.2

  outgoing-interface: N.N.N.N # public IP
  outgoing-interface: 192.168.0.2
  outgoing-interface: 192.168.255.2

I'm not sure what trying the wildcard address for outgoing-interface
will do though because of the sentence in unbound.conf(5) under
"outgoing-interface" which says "Outgoing queries are sent via a random
outgoing interface to counter spoofing."

The described functionality is completely unhelpful unless you happen to
be in the very rare situation where global network reachability is
available on all interfaces. In the much more common scenario where all
but one interface serves to connect only to a limited set of subnets,
the right things must be done to ensure queries always have the correct
source address matching the interface they are transmitted from.

Now, if I remember correctly it is possible to set BIND-8 up and running
in a similar configuration and it will always get the source address
"right", but I'm not prepared to re-do everything and try. :-) I'd very
much rather get unbound and nsd working.

Hi Greg,

Let me state up front that this and also the previous logic you found
was exactly the way I meant it.

Interface automatic is only meant for when you enable both ip4 and ip6.

You have to enable IPv6 to have interface automatic work. It uses IPv6
socket options. This may be why you see the NetBSD-4 errors - if IPv6
is disabled it may reject the IPv6 options.

Below, I'll go about helping you :-)

Greg A. Woods wrote:

I've got a setup where I think the "experimental" interface-automatic
option might prove useful, or even necessary.

However I have been unable to get it to work.

So, perhaps I should describe my situation a little more and see if this
is something I should pursue trying to fix myself or not.

Interesting.

I've got a setup with a pair of NetBSD hosts on a small public subnet,
one being a firewall with NAT for an RFC 1918 office LAN (192.168.0/24),
and the other being a public Internet-facing server. Both servers are
connected to the private LAN, and both run unbound, nearly identically
configured except for interface addresses, as a service to the private
LAN. There's also a second RFC 1918 private LAN (192.168.255/24)
between the hosts used for backups and administrative purposes.

A note on the side. You can use interface automatic with IPv6 enabled
on the box but not on the network. Unbound needs to talk to the IPv6
(dual stack) network stack in the kernel for it ... Simply set do-ip6
to yes (and it'll figure out that the ipv6 network is not reachable).

They both also run nsd, the instance on the public server answering
queries for public DNS zones, and the instance on the firewall
listening only on a second address (192.168.0.254) on the private LAN
interface and answering queries only for the private zones.

The private zones are configured in each unbound instance as stubs like
this:

Aha, you could perhaps use the private-domain: "company.example" and
private-address: 192.168.0.0/16 statements for extra protection.
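That is, something like (a sketch only, reusing the names from your
mail):

  server:
          private-domain: "company.example"
          private-address: 192.168.0.0/16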

  stub-zone:
          name: "company.example"
          stub-addr: N.N.N.N # ns.company.example, for example

  stub-zone:
          name: "office.company.example"
          stub-addr: 192.168.0.254

  stub-zone:
          name: "backups.company.example"
          stub-addr: 192.168.0.254

  stub-zone:
          name: "168.192.in-addr.arpa"
          stub-addr: 192.168.0.254

  stub-zone:
          name: "0.168.192.in-addr.arpa"
          stub-addr: 192.168.0.254

  stub-zone:
          name: "255.168.192.in-addr.arpa"
          stub-addr: 192.168.0.254

The firewall's rules do address-spoofing protection, amongst other

Thank you for doing BCP 38. :-)

things of course, and I'm having trouble getting queries to go from the
server to the nsd instance on the firewall because the firewall's
anti-spoofing rules are blocking unbound's queries due to them having
the wrong source address for the LAN segment they're going out on:

  bge0 @350:49 b [192.168.255.2],28073 -> [192.168.0.254],domain PR udp len 20 81 IN

I.e. here was a query from unbound on the server being sent to nsd on
the firewall (e.g. a recursive look-up for office.company.example, as
per the stub-addr configuration for that zone) using the
wrong source address. Sometimes it'll be the admin LAN address,
sometimes the public address, and (I think) sometimes the correct
address on the office LAN.

You can set outgoing-interface, but I see that below, I'll keep reading.

I'm not sure I understand the intent of the "interface-automatic"
option, but I do know that unbound isn't using the right source address
for the subnet it's sending to.

Ah! Interface automatic uses the same interface for *replies* to user
queries. It does not affect the outgoing queries to authority servers.

On your machine I think you should not set the outgoing interface (use
default, which is the wildcard address). And fix the route table on the
machine to use the correct interface for the correct LAN (if it is wrong).

Now I think, IIUC, if I could allow it to use the wild-card address for
the outgoing interface then the OS would set the source address correctly.

You can:
  outgoing-interface: 0.0.0.0

[For the unbound-users public: ::0 is the wildcard address for IPv6.]

At the moment I have unbound using the following interfaces on the
server:

  interface: 127.0.0.1
  # XXX we cannot listen here -- nsd listens here!
  ##XXX##interface: N.N.N.N # public IP
  interface: 192.168.0.2
  interface: 192.168.255.2

  outgoing-interface: N.N.N.N # public IP
  outgoing-interface: 192.168.0.2
  outgoing-interface: 192.168.255.2

I'm not sure what trying the wildcard address for outgoing-interface
will do though because of the sentence in unbound.conf(5) under
"outgoing-interface" which says "Outgoing queries are sent via a random
outgoing interface to counter spoofing."

Well, with the above setup, it will choose a random outgoing interface
to send every query from. It basically expects every outgoing interface
to be able to reach all of the world.

But this is not true for you, because of your firewall rules. So, I
would simply put the wildcard and let the OS decide. Or only have the
public IP - which the firewall might let through its rules too?

The described functionality is completely unhelpful unless you happen to
be in the very rare situation where global network reachability is
available on all interfaces. In the much more common scenario where all

Yes. If you commission extra IP addresses you can get extra bits of
protection against spoofing. It is uncommon, but the reason I made this
is that it is one of the very few means to get extra anti-spoofing
randomness. That is why support is there - for the paranoid (or people
with IP addresses to burn).

but one interface serves to connect only to a limited set of subnets,
the right things must be done to ensure queries always have the correct
source address matching the interface they are transmitted from.

Right. This is really the OS route table job. Unbound lets the OS
choose, using a wildcard address.

Now, if I remember correctly it is possible to set BIND-8 up and running
in a similar configuration and it will always get the source address
"right", but I'm not prepared to re-do everything and try. :-) I'd very
much rather get unbound and nsd working.

OK, I would rather not make 'routing' decisions in unbound. Does this
email help you get things working; or do you need more support from the
unbound code?

Best regards,
   Wouter

Let me state up front that this and also the previous logic you found
was exactly the way I meant it.

Interface automatic is only meant for when you enable both ip4 and ip6.

Hmm, OK, well I think a better way to express that so that it cannot be as easily misunderstood would be to write:

  if (do_auto && !(do_ip4 && do_ip6)) {
    print error...
    return ...;
  }

and omit the (!do_ip4 && !do_ip6) check above it.

That way it reads naturally in English at least: if do_auto and not both do_ip4 and do_ip6.

I also still don't think the expression you had written would do what you want either.

I further guessed that either ip4 _or_ ip6 would work because of surrounding logic.

The statement above checks that at least one of the two options is enabled, suggesting that either will suffice. The one above that even ensures do_ip6 is zero if INET6 is not defined.

Then the statements below the one I changed do different things depending on which option was enabled.

All of that suggested to me that interface-automatic will work for both ip4 and ip6, or with just either one. There's no hint whatsoever that it will only work with IPv6.

I now see code in another function called by the one I patched that'll try to complain if it thinks IPv6 is not available; however, that apparently doesn't fail on my system even though the kernel does not have IPv6 enabled.

You have to enable IPv6 to have interface automatic work. It uses IPv6
socket options.

Well, the logic right below the statement I referred to would use _either_ IPv6 options _or_ plain IPv4 options, depending on which options are enabled.

Also, the logic in set_recvpkt() will work for both AF_INET6 and AF_INET, and I didn't observe any errors from that function, especially not the one indicating that I would have to disable interface-automatic.

(That function really would be better written as a set of variant functions, only one of which might be compiled and linked, thus avoiding all the messy #ifdefs)
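Roughly what I have in mind, as an untested sketch (not a patch):

  #include <sys/socket.h>
  #include <netinet/in.h>

  /* One small variant per address family, chosen at the call site,
   * so each body deals only with its own family's option names. */
  static int set_recvpkt6(int s)
  {
      int on = 1;
  #if defined(IPV6_RECVPKTINFO)
      return setsockopt(s, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));
  #elif defined(IPV6_PKTINFO)
      return setsockopt(s, IPPROTO_IPV6, IPV6_PKTINFO, &on, sizeof(on));
  #else
      (void)s; (void)on;
      return -1;
  #endif
  }

  static int set_recvpkt4(int s)
  {
      int on = 1;
  #if defined(IP_PKTINFO)
      return setsockopt(s, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
  #elif defined(IP_RECVDSTADDR)
      return setsockopt(s, IPPROTO_IP, IP_RECVDSTADDR, &on, sizeof(on));
  #else
      (void)s; (void)on;
      return -1;
  #endif
  }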

So, I'm still confused -- it looks like there is enough support for this feature to work for IPv4 interfaces alone, at least in the code in unbound.

This may be why you see the NetBSD-4 errors - if IPv6
is disabled it may reject the IPv6 options.

Well, my kernel doesn't have IPv6 compiled into it, but it would appear NetBSD's setsockopt() does at least have the right #defines to support this feature.

There would need to be more debugging support code added for me to figure out what the error message I noted from the logs actually means.

A note on the side. You can use interface automatic with IPv6 enabled
on the box but not on the network. Unbound needs to talk to the IPv6
(dual stack) network stack in the kernel for it ... Simply set do-ip6
to yes (and it'll figure out that the ipv6 network is not reachable).

My kernels don't support IPv6 at all, and normally I don't even compile applications with INET6 or USE_INET6 defined, but in this case the config.h for unbound doesn't really give me any choice. My headers and libraries do appear to support IPv6, but I don't want it in my kernels when I have no IPv6 network to use -- it just confuses the sysadmins.

Aha, you could perhaps use the private-domain: "company.example" and
private-address: 192.168.0.0/16 statements for extra protection.

Yes, I have those configuration items as well. I guess I'll have to restrict queries to the cache too since it will contain both public data and private data. With BIND-8 I used to use ACLs to prevent queries of the private data from all but known addresses. I prefer to leave the public data in the cache publicly accessible even if that also gives the bad guys a bit more of an edge (debugging is still more important to me). However without per-zone ACLs that won't be possible.

Ah! Interface automatic uses the same interface for *replies* to user
queries. It does not affect the outgoing queries to authority servers.

On your machine I think you should not set the outgoing interface (use
default, which is the wildcard address). And fix the route table on the
machine to use the correct interface for the correct LAN (if it is wrong).

Ah, OK I see! The wild-card outgoing interface should work then.

You can:
  outgoing-interface: 0.0.0.0

Is that going to be any different than not specifying any outgoing-interface?

Yes. If you commission extra IP addresses you can get extra bits of
protection against spoofing. It is uncommon, but the reason I made this
is that it is one of the very few means to get extra anti-spoofing
randomness. That is why support is there - for the paranoid (or people
with IP addresses to burn).

Ah, OK, that makes sense too -- though of course normally one would probably only use a single physical and logical interface but one with multiple IP address aliases configured on it. Otherwise the physical networking gets a bit more complex, and there's also the problem of having multiple Ethernet MACs for the same host (or not being able to, depending on one's hardware).

OK, I would rather not make 'routing' decisions in unbound.

Agreed -- now in the light of a new day I see there's a lot more complication in trying to write portable code to determine the correct source address to use depending on the destination for a query. One would have to interface to the kernel's routing socket to learn which interface would be used, and then effectively cache a representation of the routing table in order to keep performance from sucking too badly, though it still would be far more expensive and complex than just letting the kernel do the source-address assignment.

Does this
email help you get things working; or do you need more support from the
unbound code?

Yes, I've configured the errant server to use 0.0.0.0 as the outgoing interface and all seems to be working properly now.

Thanks very much for your reply!

A first step towards what I really would like best, I think, would be the ability to control whether or not recursion is allowed for queries coming from specified addresses while still allowing answers to be given from the cache. Such answers would have the recursion-available bit turned off too, of course.

That way I could configure my cache servers such that recursion would only be done for my own known networks, and I would still allow any other site to query the cache for debugging purposes.

Perhaps the next step would be to never return any records for any domain names containing RFC 1918 data (A RRs or PTR RRs or any other RRs associated with RRs containing A or PTR RRs referring to RFC 1918 data) whenever recursion is not allowed for the query. Some private data might still leak with such a rule, but never enough to give away internal network topology. Alternately maybe all RRs returned in answer to queries sent to private addresses should be flagged to remain private.

I.e. anyone can see anything in my cache except my private data, but they wouldn't be able to force me to try to load anything into my cache. Only clients sending queries from locally "trusted" networks would get full recursion and caching services.

Personally I also think this should be the only way any DNS cache should work -- i.e. it should be the only mode of operation. Public (DNS) data should remain public no matter where it is stored.

Does this all make sense to anyone? Does anyone else want such functionality too?

I.e. anyone can see anything in my cache except my private data, but they wouldn't be able to force me to try to load anything into my cache. Only clients sending queries from locally "trusted" networks would get full recursion and caching services.

Cache snooping lets anyone see who you've been talking to, when you looked
it up, and when the cache will expire. This can aid many different attacks;
for a cliched example, would you knowingly publish a list of which financial
institutions your users are logged into at any given time? Can you see how
doing so might aid social engineering, phishing, or cross-site-scripting
attacks?

It also complicates the end-user experience. If someone hardcodes my DNS
servers into their machine and moves off of my network, lookups of popular,
cached RRs will mostly work and other lookups will mysteriously fail,
perhaps a week in the future after they've forgotten what they've done. It
seems much more clear to just have nothing work until they fix their config.

Personally I also think this should be the only way any DNS cache should work -- i.e. it should be the only mode of operation. Public (DNS) data should remain public no matter where it is stored.

The fact that it is in a cache or not and when it was retrieved is the
sensitive data, not the public data that was retrieved.

BIND allowing cache snooping when you have recursion disabled is a bug, not
a feature. It shouldn't be pushed into other servers.

                                     -- Aaron

Aaron Hopkins wrote:

Cache snooping lets anyone see who you've been talking to, when you looked
it up, and when the cache will expire.

cache snooping can also facilitate amplification attacks, see RFC 5358.

I prefer to leave the public data in the cache publicly accessible even
if that also gives the bad guys a bit more of an edge (debugging is still
more important to me). However without per-zone ACLs that won't be
possible.

Use unbound-control dump_cache for debugging.
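For example (assuming the remote-control section is enabled in your
unbound.conf):

  unbound-control dump_cache | grep office.company.example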

Perhaps the next step would be to never return any records for any domain
names containing RFC 1918 data (A RRs or PTR RRs or any other RRs associated
with RRs containing A or PTR RRs referring to RFC 1918 data) whenever
recursion is not allowed for the query. Some private data might still leak
with such a rule, but never enough to give away internal network topology.
Alternately maybe all RRs returned in answer to queries sent to private
addresses should be flagged to remain private.

Sorry, but your implication RFC1918 => internal network and its reverse doesn't
really work, and should not be hardcoded anywhere.

I.e. anyone can see anything in my cache except my private data, but they
wouldn't be able to force me to try to load anything into my cache. Only
clients sending queries from locally "trusted" networks would get full
recursion and caching services.

As Aaron and Robert said before me, this is a really bad idea. Also, when
your cache is open to anyone, anybody can see the TTLs of cached records
and adjust their attack window with one-second precision.

Personally I also think this should be the only way any DNS cache should
work -- i.e. it should be the only mode of operation. Public (DNS) data
should remain public no matter where it is stored.

Hope not.

Does this all make sense to anyone?

Sorry, but no.

Does anyone else want such functionality too?

No, and I am strongly against adding this type of functionality anywhere.

Ondrej.

Hmm, OK, well I think a better way to express that so that it cannot be as
easily misunderstood would be to write:

       if (do_auto && !(do_ip4 && do_ip6)) {
               print error...
               return ...;
       }

I don't know about what is more natural in English, but it seems to me
that original said:

IF (AUTO_is_enabled AND (IPv4_is_disabled OR IPv6_is_disabled))

and you changed this to:

IF (AUTO_is_enabled AND NOT (IPv4_is_enabled AND IPv6_is_enabled))

which is much more confusing for me.
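For the record, by De Morgan's law the two forms are equivalent; a
quick truth table (1 = enabled, assuming do_auto is set):

  do_ip4  do_ip6 | (!do_ip4 || !do_ip6) | !(do_ip4 && do_ip6)
    0       0    |          1           |          1
    0       1    |          1           |          1
    1       0    |          1           |          1
    1       1    |          0           |          0

So the warning fires in exactly the same cases; the difference is only
in how it reads.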

Ondrej

I prefer to leave the public data in the cache publicly accessible even
if that also gives the bad guys a bit more of an edge (debugging is still
more important to me). However without per-zone ACLs that won't be
possible.

Use unbound-control dump_cache for debugging.

You can also use an ACL with "allow_snoop"
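For example (a sketch; the netblocks are placeholders for your own
trusted networks):

  server:
          # trusted hosts may also send nonrecursive (snooping) queries
          access-control: 192.168.0.0/16 allow_snoop
          # everyone else is refused (which is also the default)
          access-control: 0.0.0.0/0 refuse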

Perhaps the next step would be to never return any records for any domain
names containing RFC 1918 data (A RRs or PTR RRs or any other RRs associated
with RRs containing A or PTR RRs referring to RFC 1918 data) whenever
recursion is not allowed for the query. Some private data might still leak
with such a rule, but never enough to give away internal network topology.
Alternately maybe all RRs returned in answer to queries sent to private
addresses should be flagged to remain private.

Sorry, but your implication RFC1918 => internal network and its reverse doesn't
really work, and should not be hardcoded anywhere.

And there is a mechanism to tune this already too, using the "private-domain"
and "private-address" options. No need for hardcoding policy anywhere.

I.e. anyone can see anything in my cache except my private data

You want them to not "use" the cache, but allow them to "debug" the cache.
To me, "debug" is a higher privilege than "using".

wouldn't be able to force me to try to load anything into my cache. Only
clients sending queries from locally "trusted" networks would get full
recursion and caching services.

As Aaron and Robert said before me, this is a really bad idea. Also, when
your cache is open to anyone, anybody can see the TTLs of cached records
and adjust their attack window with one-second precision.

Personally I also think this should be the only way any DNS cache should
work -- i.e. it should be the only mode of operation. Public (DNS) data
should remain public no matter where it is stored.

There is no reason for enforcing your policy or ideas in a hardcoded way
onto others.

Hope not.

Does this all make sense to anyone?

Sorry, but no.

Does anyone else want such functionality too?

No, and I am strongly against adding this type of functionality anywhere.

I second that. And I applaud the unbound team for adding options that
allow everyone to decide their own policy in a very fine-grained manner.

Paul

Cache snooping lets anyone see who you've been talking to, when you
looked it up, and when the cache will expire.

cache snooping can also facilitate amplification attacks, see RFC 5358.

No, not without recursion enabled it can't.

Yes, it can. Just spoof a query for something which is already in the
cache (like the root servers).

O.

RFC 5358 describes an attack which effectively requires the nameserver to perform a recursive lookup for the queries that are part of the attack. To quote the RFC:

  "DNS authoritative servers that do not provide recursion to clients
    can also be used as amplifiers; however, the amplification potential
    is greatly reduced when authoritative servers are used."

  "This document's recommendations are
    concerned with recursive nameservers only."

I.e. if recursion is _not_ performed for any "foreign" queries then nobody outside of the networks "trusted" by the caching nameserver can succeed at this attack any more than they could succeed at using _any_ and _every_ authoritative nameserver "normally".

I guess what I'm suggesting is something like this, which of course is not quite possible yet with unbound:

  # "trusted" networks can do recursive and non-recursive queries
  access-control: 127/8 allow_snoop
  access-control: 10/8 allow_snoop
  access-control: 172.16/12 allow_snoop
  access-control: 192.168/16 allow_snoop
  access-control: N.N.N.N/24 allow_snoop # site's public IP space

  # everyone else can only do non-recursive queries of "public" data
  access-control: 0/0 snoop_public

I.e. anyone can see anything in my cache except my private data

You want them to not "use" the cache, but allow them to "debug" the cache.

Yes, exactly. Well, at least the current cache contents. I've long ago given up on the desire to allow full testing of a DNS caching resolver so that the tester can see how it recursively resolves answers to new queries. My experience now shows that it is the current cache contents that are the most important for debugging and testing from remote sites.

To me, "debug" is a higher privilege than "using".

While that is certainly true for some meanings of "debug", in this case the person doing the debugging may very well "own" the data that is in the remote DNS cache, or they may be answering support queries for people who are at remote sites, etc., etc., etc.

In fact I end up having to debug other people's cache data on an almost daily basis. In recent years I almost always have to gain access to a system on a network their caching nameserver(s) trust in order to do such debugging, and that's not always easy, but it is almost always possible in one way or another. Caches almost never manage to protect their copy of my data from my view anyway -- they just make it very annoying to get at.

Even more hypocritical are those large access providers who might think they are gaining some security advantage by preventing the half of the world they don't provide access to from querying the caching nameservers used by the half of the world they do provide access to. 99.999% of the time the most worrying attacks will come from the networks they "trust" even if they don't provide access to half the world. Sure it might help that they have contractual relationships with the customers who own machines that might attack them, but in practice they almost never exercise the management level controls they could use in order to kick offending customers off their networks (Cogeco being one recent example I know of to the contrary).

While I definitely do worry about attacks that can abuse caching nameservers, I have a very strong desire to keep the public data in them publicly available.

Cache snooping lets anyone see who you've been talking to, when you looked
it up, and when the cache will expire. This can aid many different attacks;
for a cliched example, would you knowingly publish a list of which financial
institutions your users are logged into at any given time? Can you see how
doing so might aid social engineering, phishing, or cross-site-scripting
attacks?

I'm not convinced making some tiny form of this information available from the local DNS cache is of any more value to an attacker than the myriad of other ways they can learn the same information.

Perhaps if attackers have time machines and they can go back to the moment just before the user triggered a connection to some site of interest then I might be more worried?

Most importantly I will claim for the moment that these kinds of attacks cannot be eliminated by simply preventing cache snooping. They are indicative of flaws in other areas and while they may be mitigated slightly in the near term by preventing cache snooping, they can only be prevented by correcting other flaws.

It also complicates the end-user experience. If someone hardcodes my DNS
servers into their machine and moves off of my network, lookups of popular,
cached RRs will mostly work and other lookups will mysteriously fail,
perhaps a week in the future after they've forgotten what they've done. It
seems much more clear to just have nothing work until they fix their config.

I'm not really concerned at all about such issues. Perhaps it is sad for me to say so, but they are inevitably someone else's problem, not mine.

The fact that it is in a cache or not and when it was retrieved is the
sensitive data, not the public data that was retrieved.

That information is not really any more sensitive than anything else done on a _public_ network.

If anyone can show me any real (i.e. no hand waving or ranting!) attacks where cache snooping is a very important contributor that cannot be replaced by other mechanisms then I'll certainly pay attention.

In the last month, there've been a number of multi-day amplification attacks
using spoofed "NS ." queries to ~750,000 nameservers. The requests were 45
bytes and the responses were ~500 bytes, an 11-to-1 amplification.
The victims (the spoofed source addresses) were seeing 5 gigabits of
responses.

See http://www.theregister.co.uk/2009/02/10/new_dns_amplification_attacks/
for the overview and the thread starting with
http://www.merit.edu/mail.archives/nanog/msg14429.html for the details of
one of the attacks.

There aren't 750,000 nameservers authoritative for ".", so why did they all
respond to it? They all either have recursion enabled for the world, or
they allow cache snooping. If your nameservers respond to requests from
anywhere for "dig . ns @your.ns.ip" with anything but Refused, they probably
were participating.

                                     -- Aaron

I.e. if recursion is _not_ performed for any "foreign" queries then nobody
outside of the networks "trusted" by the caching nameserver can succeed at
this attack any more than they could succeed at using _any_ and _every_
authoritative nameserver "normally".

Sorry, but you are wrong; for example, see the recent attack on ISPrime:

https://www.dns-oarc.net/oarc/articles/upward-referrals-considered-harmful

Ondrej

Greg A. Woods; Planix, Inc. wrote:

RFC 5358 describes an attack which effectively requires the nameserver
to perform a recursive lookup for the queries that are part of the
attack. To quote the RFC:

  "DNS authoritative servers that do not provide recursion to clients
   can also be used as amplifiers; however, the amplification potential
   is greatly reduced when authoritative servers are used."

  "This document's recommendations are
   concerned with recursive nameservers only."

I.e. if recursion is _not_ performed for any "foreign" queries then
nobody outside of the networks "trusted" by the caching nameserver can
succeed at this attack

wrong. if a recursive nameserver is open to cache snooping, it is an
amplification vector. if it drops or responds to foreign queries with
REFUSED, it is not an amplification vector.

any more than they could succeed at using _any_ and _every_
authoritative nameserver "normally".

wrong. if an authoritative nameserver responds to queries it is not
authoritative for with a referral, it is an amplification vector. if it
responds to queries it is not authoritative for with REFUSED, it is not
an amplification vector.

responding with REFUSED to unsolicited queries is not an amplification
vector because a REFUSED answer is exactly the same length as the query
being refused. it allows an attacker to further obfuscate the source of
his attack, but it does not amplify the amount of bandwidth at the
attacker's disposal. see:

https://www.dns-oarc.net/oarc/articles/upward-referrals-considered-harmful

I guess what I'm suggesting is something like this, which of course is
not quite possible yet with unbound:

IMO, unbound should not have convergently evolved towards BIND and its
separate allow-query-cache and allow-recursion ACLs. it should have
dropped all rd==0 queries from the beginning, because an rd==0 query
indicates a request for authoritative nameservice.

  # "trusted" networks can do recursive and non-recursive queries
  access-control: 127/8 allow_snoop
  access-control: 10/8 allow_snoop
  access-control: 172.16/12 allow_snoop
  access-control: 192.168/16 allow_snoop
  access-control: N.N.N.N/24 allow_snoop # site's public IP space

  # everyone else can only do non-recursive queries of "public" data
  access-control: 0/0 snoop_public

you can easily achieve this by having one recursive nameserver bound to
an RFC 1918 address which only serves your RFC 1918 clients and knows
about your fake DNS data, and another recursive nameserver bound to a
non-RFC 1918 address which only serves your non-RFC 1918 clients and
does not know about your fake DNS data. that way you avoid mixing fake
and real DNS data in the same cache.
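a rough sketch of the split, reusing the addresses from earlier in this
thread (two separate unbound.conf files):

  # internal instance: serves the RFC 1918 clients, knows the fake data
  server:
          interface: 192.168.0.2
          access-control: 192.168.0.0/16 allow
  stub-zone:
          name: "office.company.example"
          stub-addr: 192.168.0.254

  # external instance: serves the public clients, public data only
  server:
          interface: N.N.N.N
          access-control: N.N.N.N/24 allow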

I'm not convinced making some tiny form of this information available from
the local DNS cache is of any more value to an attacker than the myriad of
other ways they can learn the same information.

I am sure that there are plenty of people who can use information from the
cache to prime attacks, or use that information just to snoop into one's
private life.

Most importantly I will claim for the moment that these kinds of attacks
cannot be eliminated by simply preventing cache snooping. They are
indicative of flaws in other areas and while they may be mitigated slightly
in the near term by preventing cache snooping, they can only be prevented by
correcting other flaws.

So what? We open another privacy and security hole we are already trying to close?

It also complicates the end-user experience. If someone hardcodes my DNS
servers into their machine and moves off of my network, lookups of popular,
cached RRs will mostly work and other lookups will mysteriously fail,
perhaps a week in the future after they've forgotten what they've done. It
seems much more clear to just have nothing work until they fix their config.

I'm not really concerned at all about such issues. Perhaps it is sad for me
to say so, but they are inevitably someone else's problem, not mine.

Here's the problem. You are trying to enforce your view, since it's your current
problem. But I hope that's never going to happen in Unbound. We are supposed
to fix up the old wounds, not open them again and again.

The fact that it is in a cache or not and when it was retrieved is the
sensitive data, not the public data that was retrieved.

That information is not really any more sensitive than anything else done on
a _public_ network.

It is. Since anybody around the globe can query the cache, they don't have
to be a MITM or sitting at the end points.

If anyone can show me any real (i.e. no hand waving or ranting!) attacks
where cache snooping is a very important contributor that cannot be replaced
by other mechanisms then I'll certainly pay attention.

Ok, again: reasoning "there are plenty of holes, so leave this open as well"
is not going to make the internet safer.

And I think we are really going offtopic - this is more a general DNS
issue than an Unbound-specific one.

Ondrej

Ondřej Surý wrote:

Here's the problem. You are trying to enforce your view, since it's your current
problem. But I hope that's never going to happen in Unbound. We are supposed
to fix up the old wounds, not open them again and again.

And I think we are really going offtopic - this is more a general DNS
issue than an Unbound-specific one.

what is unbound specific is that unbound answers rd==0 queries which IMO
it should not and which made this entire pointless thread possible.
(dnscache seems to have not suffered for its decision to drop all rd==0
queries on the floor.)

other examples of user stupidity include publishing howtos like this:

    http://www.howtoforge.com/installing-using-unbound-nameserver-on-debian-etch

which recommends the following ACL:

    access-control: 0.0.0.0/0 allow

this one is a bit harder to prevent since really obstinate users can ask
for two /1's or four /2's...

what is unbound specific is that unbound answers rd==0 queries which IMO
it should not

From the man page:

The allow action does allow nonrecursive queries to access the
local-data that is configured. The reason is that this does not
involve the unbound server recursive lookup algorithm, and
static data is served in the reply. This supports normal
operations where nonrecursive queries are made for the
authoritative data. For nonrecursive queries any replies from
the dynamic cache are refused.

The action allow_snoop gives nonrecursive access too. This gives
both recursive and nonrecursive access. The name allow_snoop
refers to cache snooping, a technique to use nonrecursive
queries to examine the cache contents (for malicious acts).
However, nonrecursive queries can also be a valuable debugging
tool (when you want to examine the cache contents).

It is to support certain common deployment scenarios that involve
adding static or (LEA) override data, forwarding auth queries, etc.
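For example (a sketch; the zone name is made up):

  server:
          access-control: 10.0.0.0/8 allow
          # static override data, served to nonrecursive queries as well
          local-zone: "seized.example" static
          local-data: "seized.example. A 192.0.2.1"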

(dnscache seems to have not suffered for its decision to drop all rd==0
queries on the floor.)

If only djb always followed the RFCs :-)

Paul