BIND views in Unbound

Hi,

We run public cache servers for our customers and our internal servers.
We use BIND views (internal/external) to hide unroutable
resource records from the public in some zones.

I can get BIND-views functionality in Unbound with two unbound daemons:

- the first unbound daemon listens on all interfaces and has no
local-zone/local-data entries.

- the second unbound listens on localhost, on a different port:
    server:
        port: 54
        interface: 127.0.0.1
        local-zone: myzone.lv transparent
        include: /usr/local/etc/unbound/zone-myzone.lv

- redirect internal hosts to localhost (FreeBSD pf):
    table <int-dns> const { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, ... }
    rdr pass proto udp from <int-dns> to port 53 -> 127.0.0.1 port 54
    rdr pass proto tcp from <int-dns> to port 53 -> 127.0.0.1 port 54

If a query comes from our internal servers, it is redirected to the second
unbound instance, which checks its local-data and, if no entry is found,
resolves the query as usual.
If a query comes from a public host, it is not redirected, so public
clients don't see our RFC 1918 records.
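
The decision that the pf rules make can be sketched in Python. This only illustrates the logic and is not part of the setup: the function name select_instance and the labels "internal"/"public" are made up for the example, and the network list holds only the three prefixes spelled out in the table (whatever the trailing "..." stands for is left out).

```python
import ipaddress

# Prefixes copied from the pf table <int-dns>; the table's "..." is not expanded.
INTERNAL_NETS = [
    ipaddress.ip_network(prefix)
    for prefix in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def select_instance(client_ip):
    """Return which unbound instance would answer a query from client_ip.

    "internal" stands for the second daemon (port 54, with local-data),
    "public" for the first daemon (port 53, no local-data).
    """
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in INTERNAL_NETS):
        return "internal"
    return "public"


if __name__ == "__main__":
    for ip in ("192.168.195.39", "209.66.91.13"):
        print(ip, "->", select_instance(ip))
```

An address outside the listed prefixes, including any IPv6 address, falls through to the public instance, just as a packet that matches no rdr rule is left alone.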

Is this kind of setup okay? Maybe it can be done with one unbound daemon?

Actually, this doesn't work. Are the sockets conflicting?
Mar 27 11:21:02 cache unbound: [10703:3] notice: sendmsg failed: Can't assign requested address
Mar 27 11:21:02 cache unbound: [10703:3] notice: remote address is 192.168.195.39 port 43962

There are a lot of such entries for different IPs, and unbound sometimes
does not answer queries.

What do these entries mean?
Mar 27 11:30:24 cache unbound: [10784:3] notice: sendto failed: Invalid argument
Mar 27 11:30:24 cache unbound: [10784:3] notice: remote address is ::ffff:209.66.91.13 port 53

Hi Artis,

Neat trick!

Artis Caune wrote:

Actually, this doesn't work. Are the sockets conflicting?
Mar 27 11:21:02 cache unbound: [10703:3] notice: sendmsg failed: Can't assign requested address
Mar 27 11:21:02 cache unbound: [10703:3] notice: remote address is 192.168.195.39 port 43962

There are a lot of such entries for different IPs, and unbound sometimes
does not answer queries.

This is because you bound the second unbound only to 127.0.0.1, and from
there it cannot sendmsg back to the client.
Use interface: 0.0.0.0
or interface-automatic: yes

Don't forget to set up pf so that only the internal network can reach
port 54 directly, and give your second unbound access-control for your
internal network.

What do these entries mean?
Mar 27 11:30:24 cache unbound: [10784:3] notice: sendto failed: Invalid argument
Mar 27 11:30:24 cache unbound: [10784:3] notice: remote address is ::ffff:209.66.91.13 port 53

Unbound tries to disable IPv4-to-IPv6 address mapping, but it still
happened here. Unbound tries to send the reply, but the OS rejects it.
This should not happen with the default config. Is this from your first
unbound? What is its config?
interface-automatic: yes may solve this too (it actually
enables the mapping and uses it...). Other config changes may also help.
Or disable IPv4-to-IPv6 mapping by default with a FreeBSD sysctl; unbound
tries to set a socket option, but the kernel does not seem to honor it.

Best regards,
   Wouter

I was already using interface-automatic:
    port: 54
    interface: 127.0.0.1
    interface-automatic: yes

Now I changed interface to 0.0.0.0, ::0, disabled interface-automatic,
changed redirect from 127.0.0.1 to public ip and it works, thanks.

I have another strange problem, unbound is freezing and not answering
queries. It happened two times. I can not restart it.
It just prints
    info: service stopped (unbound 1.2.1)
and I have to send KILL signal to it.
It happens often when I restart unbound. top shows it's in umtxn state:

10784 59 4 47 0 539M 479M umtxn 0 2:20 0.00% unbound

I'll check ipv6 options.

I use interface-automatic; without it, unbound replies from a different IP address:

;; reply from unexpected source: 91.198.156.20#53, expected 91.198.156.8#53

Yes, this is my first unbound :)

Our setup is (average 1-2K qps):
interface bce0: 91.198.156.20, alias 91.198.156.8
interface bce1: only ipv6 address

unbound-1.2.1
libevent-1.4.9

unbound config is:

server:
    extended-statistics: no
    num-threads: 4
    interface: 0.0.0.0
    interface: ::0
    interface-automatic: yes
    outgoing-range: 8192
    outgoing-num-tcp: 64
    incoming-num-tcp: 64
    msg-cache-size: 512m
    msg-cache-slabs: 8
    num-queries-per-thread: 8192
    rrset-cache-size: 1g
    rrset-cache-slabs: 8
    cache-max-ttl: 86400
    infra-lame-ttl: 1800
    infra-cache-slabs: 8
    infra-cache-numhosts: 16384
    infra-cache-lame-size: 16k
    access-control: 0.0.0.0/0 allow
    access-control: ::0/0 allow
    chroot: ""
    use-syslog: yes
    pidfile: "/var/run/unbound.pid"
    hide-identity: yes
    hide-version: yes
    key-cache-slabs: 8
    neg-cache-size: 256m

remote-control:
    control-enable: yes
    control-interface: 127.0.0.1
    control-port: 953

I can reproduce this quite easily on FreeBSD 7.1-STABLE and 7.0-RELEASE.
If I change num-threads to something other than 1, it always gets stuck
in umtxn state.

If I build without libevent, it works great.

I run:
# /usr/local/etc/rc.d/unbound start
# /usr/local/etc/rc.d/unbound stop
...

Hi Artis,

Artis Caune wrote:

Now I changed interface to 0.0.0.0, ::0, disabled interface-automatic,
changed redirect from 127.0.0.1 to public ip and it works, thanks.

Oh great!

I have another strange problem, unbound is freezing and not answering
queries. It happened two times. I can not restart it.
It just prints
    info: service stopped (unbound 1.2.1)
and I have to send KILL signal to it.
It happens often when I restart unbound. top shows it's in umtxn state:

10784 59 4 47 0 539M 479M umtxn 0 2:20 0.00% unbound

This looks similar to a pthread_mutex_destroy() hang in FreeBSD 7 reported
last year, with the same umtxn state. A bit of searching revealed no
workaround. When did you last update your FreeBSD? It may be different in
7-STABLE versus 7-CURRENT; or cvsup... ?

If you cannot get it fixed (I assume this is a pthread problem), one
workaround is to compile unbound with configure --without-pthreads. It
uses 4x as much memory as before, but doesn't call
pthread_mutex_destroy() anymore...

I use interface-automatic; without it, unbound replies from a different IP address:

;; reply from unexpected source: 91.198.156.20#53, expected 91.198.156.8#53

You can solve this using interface-automatic: yes, but also with:
  interface: 91.198.156.20
  interface: 91.198.156.8
This sort of problem happens when you have aliases on the interface; the
problem is that it is hard to tell the kernel where to reply from (apart
from unusual socket options (interface-automatic) or separate sockets (the
above config)).

Yes, this is my first unbound :)

Our setup is (average 1-2K qps):
interface bce0: 91.198.156.20, alias 91.198.156.8
interface bce1: only ipv6 address

unbound-1.2.1
libevent-1.4.9

unbound config is:

nice.

Best regards,
   Wouter

Hi Artis,

Can you set verbosity to 4 or 5 (you can use unbound-control just before
you restart) and show me the last lines before it hangs?

Best regards,
   Wouter

Artis Caune wrote:

unbound-control just hangs and the logs show nothing.
I tried stop and also reload.

I run 7.1-STABLE #0 r186761: Mon Jan 5 11:46:44 EET 2009

[1238159656] unbound[31837:0] debug: module config: "validator iterator"
[1238159656] unbound[31837:0] notice: init module 0: validator
[1238159656] unbound[31837:0] debug: validator nsec3cfg keysz 1024 mxiter 150
[1238159656] unbound[31837:0] debug: validator nsec3cfg keysz 2048 mxiter 500
[1238159656] unbound[31837:0] debug: validator nsec3cfg keysz 4096 mxiter 2500
[1238159656] unbound[31837:0] notice: init module 1: iterator
[1238159656] unbound[31837:0] debug: target fetch policy for level 0 is 3
[1238159656] unbound[31837:0] debug: target fetch policy for level 1 is 2
[1238159656] unbound[31837:0] debug: target fetch policy for level 2 is 1
[1238159656] unbound[31837:0] debug: target fetch policy for level 3 is 0
[1238159656] unbound[31837:0] debug: target fetch policy for level 4 is 0
[1238159656] unbound[31837:0] debug: no config, using builtin root hints.
[1238159656] unbound[31837:0] debug: donotq: 127.0.0.0/8
[1238159656] unbound[31837:0] debug: donotq: ::1
[1238159656] unbound[31837:0] debug: total of 59751 outgoing ports available
[1238159656] unbound[31837:0] debug: start threads

Hi Artis,

Tried to reproduce on 7.1-STABLE machine, with unbound-1.2.1 with
libevent 1.4.9-stable. I can start it, query it, kill -HUP,
unbound-control reload, all I like, and it just works.

Once unbound hangs, so does unbound-control...

So what is really the sequence of actions here?

(FYI, it works for me on FreeBSD 6, 7 and 8, so there must be some
difference. At first I thought it was the libevent-1.4.9 version, but that
works on our FreeBSD 7 machine too.)

Best regards,
   Wouter

Artis Caune wrote:

This is really weird. I found what's wrong, sorry for the noise:

I installed a stock FreeBSD 7.1, added our pre-built packages for
unbound, and got the same thing: it hangs in umtxn.
I deleted all packages, portsnapped ports, installed unbound with
libevent, and it just works :)

And then I realized that while 'make install'-ing unbound, it did not
fetch the openssl dependency.
On our custom-built FreeBSD release we use openssl from ports; the bundled
openssl is only for geli and other base system stuff.

# ldd /usr/local/sbin/unbound (this one does not work)
/usr/local/sbin/unbound:
  libssl.so.5 => /usr/local/lib/libssl.so.5 (0x8006b6000)
  libcrypto.so.5 => /usr/local/lib/libcrypto.so.5 (0x800a20000)

# ldd /usr/local/sbin/unbound (this one works okay)
/usr/local/sbin/unbound:
  libssl.so.5 => /usr/lib/libssl.so.5 (0x8006b6000)
  libcrypto.so.5 => /lib/libcrypto.so.5 (0x800a1e000)

I diffed configure output and found this:

--- bad.configure
+++ good.configure

-checking for SSL... found in /usr/local
+checking for SSL... found in /usr

-checking whether pthreads work without any flags... yes
+checking whether pthreads work without any flags... no
+checking whether pthreads work with -Kthread... no
+checking whether pthreads work with -kthread... no
+checking for the pthreads library -llthread... no
+checking whether pthreads work with -pthread... yes

-configure: running /bin/sh ./configure '--prefix=/usr/local'
'--with-ssl=/usr/local' '--with-libevent=/usr/local'
'--mandir=/usr/local/man' '--infodir=/usr/local/info/'
'--build=amd64-portbld-freebsd7.1'
'build_alias=amd64-portbld-freebsd7.1' 'CC=cc' 'CFLAGS=-O2
-fno-strict-aliasing -pipe' 'LDFLAGS= -rpath=/usr/local/lib'
--cache-file=/dev/null --srcdir=.
+configure: running /bin/sh ./configure '--prefix=/usr/local'
'--with-ssl=/usr' '--with-libevent=/usr/local'
'--mandir=/usr/local/man' '--infodir=/usr/local/info/'
'--build=amd64-portbld-freebsd7.1'
'build_alias=amd64-portbld-freebsd7.1' 'CC=cc' 'CFLAGS=-O2
-fno-strict-aliasing -pipe' 'LDFLAGS= -rpath=/usr/lib:/usr/local/lib'
--cache-file=/dev/null --srcdir=.

-checking for SSL... found in /usr/local
+checking for SSL... found in /usr
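
A comparison like the one above can be reproduced with Python's difflib. This is a minimal sketch, assuming the two configure logs have already been read into lists of lines. The helper name changed_lines is hypothetical, and the sample lines are a small subset of the output shown above.

```python
import difflib


def changed_lines(bad, good):
    """Return only the removed ('-') and added ('+') lines of a unified diff."""
    diff = difflib.unified_diff(
        bad, good, fromfile="bad.configure", tofile="good.configure", lineterm=""
    )
    # The first two lines are the '---' / '+++' file headers; drop them.
    body = list(diff)[2:]
    return [line for line in body if line.startswith(("+", "-"))]


bad_log = [
    "checking for SSL... found in /usr/local",
    "checking whether pthreads work without any flags... yes",
]
good_log = [
    "checking for SSL... found in /usr",
    "checking whether pthreads work without any flags... no",
    "checking whether pthreads work with -pthread... yes",
]

if __name__ == "__main__":
    for line in changed_lines(bad_log, good_log):
        print(line)
```

Filtering out context lines and hunk markers leaves just the checks whose result changed between the two builds, which is where the pthread difference shows up.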

and in the make output the only differences were in the include flags:

--- bad.make
+++ good.make

- ... -I/usr/include -I/usr/local/include ...
+ ... -I/usr/local/include -I/usr/local/include ...

I think this is wrong: the include path should be -I/usr/local/include
-I/usr/include. But even if I change this in ./Makefile and
ldns-src/Makefile, it still freezes.

By the way, I have openssl-0.9.8j.
I have no idea why it freezes with openssl from ports.

I missed that the "-pthread" flag is also missing from the cc and ./libtool
command lines (due to the very long lines).

So that explains it all.

I fixed all the flags as before, and also added "-pthread" to CFLAGS, and
yes, my unbound now works with openssl from ports without freezing
:))))

Hi Artis,

I am glad that you found it.

It turns out that in the configure tests the -lssl was changing the
outcome of the later -pthread test. And without that flag it would end
up using thread-unsafe library calls. I've fixed this in svn trunk r1569
so that the pthread test is done before the other libraries (SSL, python).

(I guess your /usr/local/libssl and /usr/lib/libssl are different: one
has libthr linked in, which makes the pthread tests behave oddly, since
pthreads seem to be available without any flags.)

Thanks,

Best regards,
   Wouter

Artis Caune wrote:

Thanks, I'll try the r1569 patch and submit a FreeBSD PR.
We are now ready for 1 April ;)

I have another question, about memory usage.
I read in "Howto Optimise" that actual memory usage can grow to 2.5
times the total configured memory.

If I have:
    num-threads: 4
    msg-cache-size: 512m
    rrset-cache-size: 1g
    neg-cache-size: 256m
can it grow to 4+ gigs?

If I use 4 threads, is the memory shared?
With processes, is it x4?

Hi Artis,

Artis Caune wrote:

Thanks, I'll try the r1569 patch and submit a FreeBSD PR.
We are now ready for 1 April ;)

Happy to help.

I have another question, about memory usage.
I read in "Howto Optimise" that actual memory usage can grow to 2.5
times the total configured memory.

Basically, malloc in modern OSes can take about 2x the amount the
program uses, due to malloc overhead. I added another 0.5x as leeway
(since I don't know exactly how the malloc is implemented; the 2x is
based on the algorithms it uses).

If I have:
    num-threads: 4
    msg-cache-size: 512m
    rrset-cache-size: 1g
    neg-cache-size: 256m
it can grow to 4+ gigs?

Well, twice as much is about 3.5G. Add a bit of overhead.

If I use 4 threads, is the memory shared?
With processes, is it x4?

With threads, the memory is shared, indeed.
With processes, every thread basically has its own cache in non-shared
memory, so it's 4 x 3.5G = 14G.

Oh, by the way, you forgot the key-cache; it's 4M by default. It is only
really needed if you do a lot of validation and there are many signed
zones out there to validate. Also, the infrastructure cache takes up
memory. So perhaps 4G is a good estimate.

(Obligatory remark: unbound has more than just the cache. The cache
is the dominant factor for memory usage. The rest is a couple of MB.)
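
The arithmetic above can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the helper names parse_size and estimate_gib are made up, the suffixes are taken as binary units (1g = 1024m), and only the three cache sizes from the question are counted (not the key cache or infrastructure cache).

```python
GIB = 1024 ** 3
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '512m' or '1g' to bytes (binary units assumed)."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def estimate_gib(cache_sizes, overhead=2.0, copies=1):
    """Estimate memory in GiB: configured total x malloc overhead x copies.

    copies=1 models threads sharing one cache; copies=N models N processes
    that each keep their own cache.
    """
    total = sum(parse_size(size) for size in cache_sizes)
    return total * overhead * copies / GIB


caches = ["512m", "1g", "256m"]  # msg-cache, rrset-cache, neg-cache

if __name__ == "__main__":
    print(estimate_gib(caches, overhead=1.0))  # 1.75  (configured total)
    print(estimate_gib(caches))                # 3.5   (2x malloc overhead)
    print(estimate_gib(caches, overhead=2.5))  # 4.375 (with 0.5x leeway)
    print(estimate_gib(caches, copies=4))      # 14.0  (4 processes, no sharing)
```

The 2.5x figure lands a little above 4G, which matches the "4+ gigs" in the question and the 4G estimate above.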

Best regards,
   Wouter

OFFTOPIC

Thanks, I'll try the r1569 patch and submit a FreeBSD PR.
We are now ready for 1 April ;)

Our setup is (average 1-2K qps):
interface bce0: 91.198.156.20, alias 91.198.156.8

Hi Artis.

Hey, is it Latvia's public DNS cache server ns.nic.lv, provided by Latnet?
Are you guys going to switch it from BIND to unbound? =)

Good luck!

:)

OFFTOPIC

Hi Artis.

Hey, is it Latvia's public DNS cache server ns.nic.lv, provided by Latnet?

Hi Beastie,

Yes, it's the Latvian public cache server, and no, it's provided by NIC.

Are you guys going to switch it from BIND to unbound? =)

We already did!