At 17:39 today we restarted the unbound service with "unbound-control reload". At 18:54 we got a fatal error:
Jun 27 18:54:00 isp-nscache-01 unbound: [1733:0] error: validator: bad event module_event_reply
Jun 27 18:54:00 isp-nscache-01 unbound: [1733:0] fatal error: services/outside_network.c:1504: serviced_callbacks: pointer whitelist fptr_whitelist_serviced_query(p->cb) failed
After that we had to start the service again:
Jun 27 19:12:42 isp-nscache-01 unbound-anchor: /var/lib/unbound/root.key has content
Jun 27 19:12:42 isp-nscache-01 unbound-anchor: success: the anchor is ok
Jun 27 19:12:42 isp-nscache-01 unbound: [12534:0] warning: did not exit gracefully last time (1733)
Jun 27 19:12:42 isp-nscache-01 unbound: [12535:0] notice: init module 0: validator
Jun 27 19:12:42 isp-nscache-01 unbound: [12535:0] notice: init module 1: iterator
Jun 27 19:12:42 isp-nscache-01 unbound: [12535:0] info: start of service (unbound 1.4.16).
What happened, and what can we do so that it does not happen again?
The unbound service is handling 2044 queries per second. We are running unbound version 1.4.16.
There are two failures here. From reading back through the code I have
not found the bug. The "bad event" failure is the validator complaining
that it received an answer from the network, which should be impossible,
so it flags this as an internal error. The second failure is that the
contents of callback->next seem to be garbage, and an assertion fails
because of that garbage. The second failure happens at the same time as
the first, because otherwise the assertion on line 1519 would have fired.
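For readers unfamiliar with this kind of check, here is a minimal sketch in C of a function-pointer whitelist, the general technique the fatal error message refers to. It is not the unbound source: every name in it (reply_cb_t, whitelist_reply_cb, checked_call and the three callbacks) is made up for illustration, and the garbage pointer is simulated with a function that is simply not on the list.

```c
#include <stdio.h>

/* Hypothetical callback type; the real unbound signature differs. */
typedef int (*reply_cb_t)(void *arg, int error);

/* Two callbacks the sketch treats as known, one it does not. */
static int iterator_reply_cb(void *arg, int error) { (void)arg; return error; }
static int validator_probe_cb(void *arg, int error) { (void)arg; return error; }
static int unregistered_cb(void *arg, int error) { (void)arg; return error; }

/* Return 1 only if fptr is one of the callbacks on the whitelist. */
static int whitelist_reply_cb(reply_cb_t fptr)
{
    if (fptr == &iterator_reply_cb) return 1;
    if (fptr == &validator_probe_cb) return 1;
    return 0;
}

/* Invoke the callback only after the whitelist check passes. */
static int checked_call(reply_cb_t fptr, void *arg, int error)
{
    if (!whitelist_reply_cb(fptr)) {
        fprintf(stderr, "fatal error: pointer whitelist check failed\n");
        return -1; /* a real daemon would stop here rather than return */
    }
    return fptr(arg, error);
}
```

With a check like this, a callback entry that has been overwritten with garbage is caught before it is called, which is why the log shows a whitelist failure rather than a crash at a random address.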
Did you compile with --enable-debug? If not, could you do so? The
additional assertions may provide more information should the bug
reappear, and it is not as heavy as enabling high verbosity.
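To illustrate why extra assertions are cheap compared to verbose logging, here is a small sketch in C of assertions that exist only in a debug build. The names SKETCH_DEBUG, sketch_assert, sketch_failures, cb_node and count_callbacks are hypothetical and not taken from unbound, and the sketch counts failures where a real debug build would normally abort.

```c
#include <stdio.h>
#include <stddef.h>

/* Normally set by the build system; defined here so the sketch is complete. */
#define SKETCH_DEBUG 1

static int sketch_failures = 0;

#ifdef SKETCH_DEBUG
/* Debug build: report the failed condition with file and line. */
#define sketch_assert(cond) \
    do { \
        if (!(cond)) { \
            fprintf(stderr, "%s:%d: assertion %s failed\n", \
                    __FILE__, __LINE__, #cond); \
            sketch_failures++; \
        } \
    } while (0)
#else
/* Normal build: the check is compiled out entirely. */
#define sketch_assert(cond) do { } while (0)
#endif

/* Hypothetical callback list node with a sanity flag. */
struct cb_node {
    int in_use;
    struct cb_node *next;
};

/* Walk the list, checking each node's flag along the way. */
static int count_callbacks(const struct cb_node *head)
{
    int n = 0;
    const struct cb_node *p;
    for (p = head; p != NULL; p = p->next) {
        sketch_assert(p->in_use == 1);
        n++;
    }
    return n;
}
```

Each check costs one comparison and writes nothing unless it fails, so it can stay enabled on a busy resolver, while high verbosity writes log lines for every query.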
The unbound service is handling 2044 queries per second. We are
running unbound version 1.4.16.
Nice throughput. Do you have any special modules configured (python
scripts, for example)?
Googling this issue doesn't give any results.
This is the first report of this bug. I would like to be able to
reproduce this problem. Do you perhaps have query logging enabled, and
can you tell me which queries were active in the last second(s)?
You seem to be Dutch, and so is NLnet Labs, in case writing in Dutch
makes bug reports easier.