Unbound mysteriously crashes

Hi All,

We're in the process of replacing all of our caching nameservers with
Unbound. During testing, it seems that Unbound is crashing and I'm unsure
why. I've recompiled it with debugging enabled (--enable-debug) and
"verbosity" set to 4. This is what the last few lines show:

Here's a different error:

[1229998910] unbound[11244:0] debug: worker svcd callback for qstate 0x91eba60
[1229998910] unbound[11244:1] debug: servselect ip4 87.118.120.30 port 53 (len 16)
[1229998910] unbound[11244:1] debug: rtt=1504
[1229998910] unbound[11244:1] debug: selrtt 1504
[1229998910] unbound[11244:0] debug: mesh_run: start
[1229998910] unbound[11244:1] info: sending query: <km32134.keymachine.de. AAAA IN>
[1229998910] unbound[11244:1] debug: sending to target: <km32134.keymachine.de.> 87.118.120.30#53
[1229998910] unbound[11244:0] debug: iterator[module 1] operate: extstate:module_wait_reply event:module_event_reply
[1229998910] unbound[11244:1] debug: serviced query UDP timeout=1504 msec
[1229998910] unbound[11244:0] info: iterator operate: query <cname.net-music-online.com.tpgi.com.au. A IN>
[1229998910] unbound[11244:1] debug: inserted new pending reply id=0d91
[1229998910] unbound[11244:0] debug: process_response: new external response event
[1229998910] unbound[11244:1] debug: opened UDP if=0 port=52739
[1229998910] unbound[11244:1] debug: comm point start listening 118
[1229998910] unbound[11244:0] info: scrub for <tpgi.com.au. NS IN>
[1229998910] unbound[11244:1] fatal error: util/netevent.c:228: comm_point_send_udp_msg: assertion ldns_buffer_remaining(packet) > 0 failed
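
For anyone reading the archive who is not familiar with what the failed
assertion checks, here is a rough Python sketch. It assumes that
ldns_buffer_remaining() reports the buffer's limit minus its position, as I
understand the ldns buffer API to do. PacketBuffer and send_udp are
hypothetical names made up for illustration; this is a toy model, not
Unbound or ldns code, and it says nothing about why the buffer was empty in
this crash.

```python
class PacketBuffer:
    """Toy position/limit/capacity buffer (illustrative, not ldns itself)."""

    def __init__(self, capacity: int) -> None:
        self.data = bytearray(capacity)
        self.capacity = capacity
        self.position = 0
        self.limit = capacity

    def write(self, payload: bytes) -> None:
        # Append bytes at the current position and advance it.
        end = self.position + len(payload)
        if end > self.limit:
            raise ValueError("payload does not fit in buffer")
        self.data[self.position:end] = payload
        self.position = end

    def flip(self) -> None:
        # Switch from writing to reading: the readable region becomes
        # [0, old position).
        self.limit = self.position
        self.position = 0

    def remaining(self) -> int:
        # Bytes between the current position and the limit.
        return self.limit - self.position


def send_udp(buf: PacketBuffer) -> int:
    # Same shape as the failed check: refuse to send a packet with no bytes.
    assert buf.remaining() > 0, "nothing to send"
    return buf.remaining()
```

In this model a buffer that was flipped without anything written to it has
remaining() == 0, so send_udp() trips the assertion.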

I forgot to add: the version of Unbound used is the latest pulled from SVN. I
was using version 1.1.1, which was also crashing, so I switched to the SVN build.

Regards,

Haw

Hello list,

I ran into an interesting situation while using the local-data feature
in Unbound.

Here is the situation:

There is a domain, let's say it is 'domain.nl', with a FQDN
'www.domain.nl', which is available from the entire Internet. It is
served from ns.example.com.

There is also an override on my local Unbound-resolver:
'intra.domain.nl'. This should only be locally served, obviously.

In unbound.conf I configured:

local-zone: "domain.nl." transparent
local-data: "intra.domain.nl A 192.168.1.1"

Now, this works fine, with one exception:

Many applications ask for AAAA-records nowadays. Indeed my application
asks for 'AAAA intra.domain.nl'. In this case, Unbound (or rather
ns.example.com, I guess) returns an NXDOMAIN. This is understandable,
since there is no A record for 'intra.domain.nl' under the 'domain.nl'
at ns.example.com (there is only a local override in Unbound). But it is
also an undesirable situation, since some resolvers run into problems
and won't resolve the A record anymore:
http://support.microsoft.com/kb/815768

Wouldn't it be better if Unbound would change the NXDOMAIN answer from
ns.example.com into a NOERROR when it has an A-record equivalent of the
AAAA-question available? Or maybe a similar solution to prevent the
problem described above?

I think I had found a workaround by adding this in unbound.conf:

local-data: "intra.domain.nl AAAA"

An empty AAAA record.

This results in the desired NOERROR answer, but instead of the ANSWER:
being 0, it is 1:

; <<>> DiG 9.5.0-P2 <<>> AAAA intra.domain.nl
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7651
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;intra.domain.nl. IN AAAA

(This worked for Unbound 1.0, but Unbound 1.1 fails to start when I try
this workaround)
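
To make the distinction between the three outcomes explicit (NXDOMAIN,
NOERROR with zero answers, NOERROR with answers), here is a small Python
sketch that classifies a dig header like the one above. The function name
classify_dig_header is made up for this example, and the "NODATA" label is
just shorthand for NOERROR with ANSWER: 0.

```python
import re


def classify_dig_header(output: str) -> str:
    """Classify dig output by its status and ANSWER count."""
    status = re.search(r"status: (\w+)", output)
    answers = re.search(r"ANSWER: (\d+)", output)
    if status is None or answers is None:
        raise ValueError("no dig header found")
    rcode, count = status.group(1), int(answers.group(1))
    if rcode == "NXDOMAIN":
        return "NXDOMAIN"  # the name does not exist at all
    if rcode == "NOERROR" and count == 0:
        return "NODATA"    # the name exists, but not with this type
    if rcode == "NOERROR":
        return "ANSWER"    # at least one record in the answer section
    return rcode           # any other rcode, e.g. SERVFAIL


sample = """\
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7651
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
"""
print(classify_dig_header(sample))
```

With the header shown above this falls into the "ANSWER" case rather than
"NODATA", which is the difference being pointed out.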

Regards,

Hi Marco,

Marco Davids wrote:

Many applications ask for AAAA-records nowadays. Indeed my application
asks for 'AAAA intra.domain.nl'. In this case, Unbound (or rather
ns.example.com, I guess) returns an NXDOMAIN. This is understandable,
since there is no A record for 'intra.domain.nl' under the 'domain.nl'
at ns.example.com (there is only a local override in Unbound). But it is
also an undesirable situation, since some resolvers run into problems
and won't resolve the A record anymore:
http://support.microsoft.com/kb/815768

More specifically, ns.example.com returns NXDOMAIN because it has no
resource record at all with the owner name intra.domain.nl.

Since the local-zone is set to transparent, Unbound looks up the answer
locally first, and if it is not there, it performs the query.
ns.example.com then returns NXDOMAIN.

Wouldn't it be better if Unbound would change the NXDOMAIN answer from
ns.example.com into a NOERROR when it has an A-record equivalent of the
AAAA-question available? Or maybe a similar solution to prevent the
problem described above?

I think indeed this might be useful in the transparent mode.
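
The lookup order described above, and the suggested change, can be sketched
roughly as follows in Python. This is only a toy model of the behaviour
discussed in this thread, not Unbound's implementation; resolve_transparent,
the nodata_if_name_exists switch and the fake_upstream stub are all
hypothetical names invented for the example.

```python
from typing import Callable, Dict, List, Tuple

LocalData = Dict[Tuple[str, str], str]
Answer = Tuple[str, List[str]]


def resolve_transparent(
    name: str,
    rrtype: str,
    local: LocalData,
    upstream: Callable[[str, str], Answer],
    nodata_if_name_exists: bool = False,
) -> Answer:
    """Answer from local data first, otherwise fall through to upstream."""
    key = (name.lower().rstrip("."), rrtype.upper())
    if key in local:
        return "NOERROR", [local[key]]
    # Suggested change: if the name exists locally with another type,
    # answer NOERROR with an empty answer section instead of asking upstream.
    if nodata_if_name_exists and any(n == key[0] for n, _ in local):
        return "NOERROR", []
    return upstream(name, rrtype)


def fake_upstream(name: str, rrtype: str) -> Answer:
    # Stand-in for ns.example.com, which has no records for this name.
    return "NXDOMAIN", []


local = {("intra.domain.nl", "A"): "192.168.1.1"}
print(resolve_transparent("intra.domain.nl", "A", local, fake_upstream))
print(resolve_transparent("intra.domain.nl", "AAAA", local, fake_upstream))
print(resolve_transparent("intra.domain.nl", "AAAA", local, fake_upstream,
                          nodata_if_name_exists=True))
```

The second call models the NXDOMAIN that Marco sees for the AAAA query; the
third models the NOERROR answer he is asking for.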

- Matthijs

Hi Haw,

I see that you build Unbound with libevent. Do these problems occur when
you build without libevent?

<quote>usually problem goes away if libevent is not used</quote>

Let us know.

Happy holidays,

Matthijs Mekking
NLnet Labs

Haw Loeung wrote:

Hi Matthijs,

I see that you build Unbound with libevent. Do these problems occur when
you build without libevent?

<quote>usually problem goes away if libevent is not used</quote>

First of all, thank you for replying.

It seems to be running fine without libevent. We tested this before putting
real load onto these servers. However, we need to build Unbound with
libevent because too many requests were being dropped
(total.requestlist.exceeded was through the roof).

Regards,

Haw

Hi,

Did you try libevent-1.4.8-stable? (or later?)
Other users had a good experience with that libevent version.

Best regards,
   Wouter

Haw Loeung wrote:

Hi Wouter,