Unbound from EPEL segfaults at start

I'm trying to set up an Unbound box, only for testing and evaluation at
this point. I've put RHEL 5.4 and unbound from EPEL on it and get this:

kernel: unbound[3493]: segfault at 00000016000004ae rip
0000003fd48750ed rsp 0000000041b3dee0 error 4
kernel: unbound[3495]: segfault at 00000016000004ae rip
0000003fd48750ed rsp 0000000042f3fee0 error 4
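
For anyone reading these kernel lines: the fields are the faulting address, the instruction pointer (rip), the stack pointer (rsp), and the x86 page-fault error code. The sketch below is not part of the original report. It assumes the log format shown above, and the helper names (`parse_segfault`, `describe_error`) are made up for illustration.

```python
import re

# Low bits of the x86 page-fault error code reported by the kernel.
PF_PROT = 0x1   # set: protection violation, clear: page not present
PF_WRITE = 0x2  # set: write access, clear: read access
PF_USER = 0x4   # set: fault happened in user mode, clear: kernel mode

# Matches lines such as
#   unbound[3493]: segfault at <addr> rip <rip> rsp <rsp> error <code>
# \s+ is used between fields so a line wrapped by a mail client still matches.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]:\s+segfault at\s+(?P<addr>[0-9a-f]+)\s+"
    r"rip\s+(?P<rip>[0-9a-f]+)\s+rsp\s+(?P<rsp>[0-9a-f]+)\s+"
    r"error\s+(?P<err>[0-9a-f]+)"
)


def parse_segfault(line):
    """Return the fields of a kernel segfault line as a dict, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "rip": int(m.group("rip"), 16),
        "rsp": int(m.group("rsp"), 16),
        "error": int(m.group("err"), 16),
    }


def describe_error(code):
    """Translate the three low error-code bits into words."""
    return [
        "protection violation" if code & PF_PROT else "page not present",
        "write" if code & PF_WRITE else "read",
        "user mode" if code & PF_USER else "kernel mode",
    ]


if __name__ == "__main__":
    sample = (
        "kernel: unbound[3493]: segfault at 00000016000004ae rip\n"
        "0000003fd48750ed rsp 0000000041b3dee0 error 4"
    )
    info = parse_segfault(sample)
    print(hex(info["rip"]), describe_error(info["error"]))
```

Read this way, "error 4" is a user-mode read of an address with no page mapped, which fits a bad pointer being dereferenced rather than, say, a write to read-only memory.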

Unbound from EPEL is a bit outdated, of course, so maybe a newer
version would be enough to fix it. Could be a configuration problem,
but I think it ought to die with an error instead of segfaulting in
that case.

Will a newer unbound be available for RHEL5 in the near future, or is
the best option building my own packages?

Maik Zumstrull wrote:

> Will a newer unbound be available for RHEL5 in the near future, or is
> the best option building my own packages?

Following up on that angle, the .spec file delivered with unbound
appears to be slightly broken (version number not updated).

I will update the RHEL version. I have not heard of segfault issues
before (will scan older email), please forward me all information you
have on this.

Paul (unbound EL-5 maintainer)

> I'm trying to set up an Unbound box, only for testing and evaluation
> at this point. I've put RHEL 5.4 and unbound from EPEL on it and get
> this:
>
> kernel: unbound[3493]: segfault at 00000016000004ae rip
> 0000003fd48750ed rsp 0000000041b3dee0 error 4
> kernel: unbound[3495]: segfault at 00000016000004ae rip
> 0000003fd48750ed rsp 0000000042f3fee0 error 4

On what architecture is this? Can you get a gdb trace on this?

> Unbound from EPEL is a bit outdated, of course, so maybe a newer
> version would be enough to fix it.

Sorry for being a few versions behind on the EL-5 version. I will try
and fix it over the next couple of days.

Paul

[Accidentally didn't send to the list, resending.]

Paul Wouters wrote:

>> I'm trying to set up an Unbound box, only for testing and evaluation
>> at this point. I've put RHEL 5.4 and unbound from EPEL on it and
>> get this:
>>
>> kernel: unbound[3493]: segfault at 00000016000004ae rip
>> 0000003fd48750ed rsp 0000000041b3dee0 error 4
>> kernel: unbound[3495]: segfault at 00000016000004ae rip
>> 0000003fd48750ed rsp 0000000042f3fee0 error 4

> On what architecture is this?

x86_64

> Can you get a gdb trace on this?

Should be possible, I'll get back to you.

>> Unbound from EPEL is a bit outdated, of course, so maybe a newer
>> version would be enough to fix it.

> Sorry for being a few versions behind on the EL-5 version. I will try
> and fix it over the next couple of days.

I managed to get 1.4.3 compiled using Fedora CVS, but I don't think the
package is particularly stable. I had zero experience building RPMs
before today; I'm more of a Debian person. It doesn't segfault on
startup, though, and it answers queries. The upper limit seems to be a
little under 15000 qps according to resperf, but I haven't fine-tuned
the configuration yet.

Paul Wouters wrote:

>> kernel: unbound[3493]: segfault at 00000016000004ae rip
>> 0000003fd48750ed rsp 0000000041b3dee0 error 4
>> kernel: unbound[3495]: segfault at 00000016000004ae rip
>> 0000003fd48750ed rsp 0000000042f3fee0 error 4

> On what architecture is this? Can you get a gdb trace on this?

Here's one, but it's not particularly useful:

Program received signal SIGSEGV, Segmentation fault.
0x0000003fd48750ed in realloc () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003fd48750ed in realloc () from /lib64/libc.so.6
#1 0x0000003cbd605db8 in poll_add () from /usr/lib64/libevent-1.1a.so.1
#2 0x0000003cbd6038da in event_add () from /usr/lib64/libevent-1.1a.so.1
#3 0x000000000043ee5f in ?? ()
#4 0x000000000040d528 in ?? ()
#5 0x000000000041079c in ?? ()
#6 0x0000000000409cff in ?? ()
#7 0x000000000040ec9e in ?? ()
#8 0x0000003fd481d994 in __libc_start_main () from /lib64/libc.so.6
#9 0x0000000000407219 in ?? ()
#10 0x00007fff10999a08 in ?? ()
#11 0x0000000000000000 in ?? ()
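
A small sketch, not from the thread, for summarizing a trace like this one: it lists the frames that gdb could not resolve to a symbol. The regular expression assumes the plain `bt` output format shown above, and `parse_frames` and `unresolved` are hypothetical helper names.

```python
import re

# One frame per line: "#<n> <address> in <function> (".
# Frames without symbols show "??" as the function name.
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-f]+) in (\S+) \(", re.MULTILINE)


def parse_frames(backtrace):
    """Return a list of (frame number, address, function name) tuples."""
    return [
        (int(num), int(addr, 16), func)
        for num, addr, func in FRAME_RE.findall(backtrace)
    ]


def unresolved(frames):
    """Return the numbers of the frames that have no symbol name."""
    return [num for num, _addr, func in frames if func == "??"]


if __name__ == "__main__":
    sample = """\
#0 0x0000003fd48750ed in realloc () from /lib64/libc.so.6
#1 0x0000003cbd605db8 in poll_add () from /usr/lib64/libevent-1.1a.so.1
#2 0x0000003cbd6038da in event_add () from /usr/lib64/libevent-1.1a.so.1
#3 0x000000000043ee5f in ?? ()
"""
    frames = parse_frames(sample)
    print(frames[0], unresolved(frames))
```

The address of frame #0 is the same value the kernel logged as rip (0x3fd48750ed), so the crash is inside realloc(), reached through libevent's poll_add(). The frames without symbols are the ones that belong to the unbound binary itself.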

The most interesting part doesn't have debug symbols. Is there an easy
way to rebuild the package with them?

Install the debug packages. You might need to enable the debug repository
for that. (On Fedora, gdb will tell you the yum command to use, but I don't
think it does that on RHEL yet.)

Paul

Paul Wouters wrote:

>> The most interesting part doesn't have debug symbols. Is there an
>> easy way to rebuild the package with them?

> Install the debug packages. You might need to enable the debug
> repository for that. (On Fedora, gdb will tell you the yum command to
> use, but I don't think it does that on RHEL yet.)

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x4304b940 (LWP 3427)]
0x0000003fd48750ed in realloc () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003fd48750ed in realloc () from /lib64/libc.so.6
#1 0x0000003cbd605db8 in poll_add () from /usr/lib64/libevent-1.1a.so.1
#2 0x0000003cbd6038da in event_add () from /usr/lib64/libevent-1.1a.so.1
#3 0x000000000043fc98 in comm_point_create_tcp (base=0x2aaab00409a0, fd=9, num=10, bufsize=65552,
    callback=0x4113b0 <worker_handle_request>, callback_arg=0xc4edc60) at util/netevent.c:1231
#4 0x000000000041e8fd in listen_create (base=0x2aaab00409a0, ports=0xc45e900, bufsize=65552, tcp_accept_count=10,
    cb=0x4113b0 <worker_handle_request>, cb_arg=0xc4edc60) at services/listen_dnsport.c:556
#5 0x0000000000410628 in worker_init (worker=0xc4edc60, cfg=0xc438da0, ports=0xc45e900, do_sigs=<value optimized out>)
    at daemon/worker.c:1025
#6 0x0000000000409965 in thread_start (arg=<value optimized out>) at daemon/daemon.c:344
#7 0x0000003fd5406617 in start_thread () from /lib64/libpthread.so.0
#8 0x0000003fd48d3c2d in clone () from /lib64/libc.so.6

Better.

Can you please test the latest build:

I've fired a resperf at it for 80k packets and it did not crash.

Paul

Paul Wouters wrote:

> Can you please test the latest build:
>
> http://koji.fedoraproject.org/koji/buildinfo?buildID=165892

I was just going to, but the machine is unavailable right now (network
maintenance). Should be back in 20 min or so, then I'll do it.

While I'm waiting: that particular machine has already been upgraded to
RHEL 5.5. Is the package installable on that, or does it still require
5.4? I can use another machine that is still on 5.4. I'll have to
verify that the old package crashes on that one, of course.

I tested it on 5.4, but the build was done against 5.5, so it should work
for you on RHEL 5.5.

Paul

Paul Wouters wrote:

>> I'm trying to set up an Unbound box, only for testing and
>> evaluation at this point. I've put RHEL 5.4 and unbound from EPEL
>> on it and get this:
>>
>> kernel: unbound[3493]: segfault at 00000016000004ae rip
>> 0000003fd48750ed rsp 0000000041b3dee0 error 4
>> kernel: unbound[3495]: segfault at 00000016000004ae rip
>> 0000003fd48750ed rsp 0000000042f3fee0 error 4

> Can you please test the latest build:

Segfault at startup is apparently gone. Thanks for taking care of this.