This error means that a module was activated about 1000 times for a single query, which is unusual and should not happen. The query got a SERVFAIL, and this log message is printed.
This should not be happening; usually, lookups fail much earlier.
A quick test on my side says the name is an NXDOMAIN, and it works fine for me. So SERVFAIL instead of NXDOMAIN is no great loss for the users: the failsafe activated and nothing bad happened.
So, the question is: why is it looping (why is the state machine activated so often)?
I would like to see a high-verbosity trace of a resolution of this name when it reports the loop; what is that module doing?
Currently we are running BIND 9.5+ on Solaris 10 / Sun Netra 240s, and RHEL 5.5 on Dell 2970s.
I have been trying different flavours of Unbound, both compiled myself and using the precompiled packages from M n M.
I am also trying out Nominum, as well as BIND and Unbound.
I see a marked improvement in QPS over BIND on the Solaris 10 servers with both Unbound and Nominum.
On the RHEL servers I see an improvement over BIND with Nominum (I get 20,000+ QPS with BIND alone), but a degradation when trying Unbound. This is with different flavours of Unbound, both compiled myself and using the precompiled packages from M n M.
Does anyone have an unbound.conf with reasonable QPS on a RHEL 5.x server that I can see?
Much Thanks
Bruce
Bruce Hayward, MTS Allstream Inc., (p) 204-958-1983 (e)
bruce.hayward@mtsallstream.com
Is it really necessary to print this email?
MTS ALLSTREAM INC. CONFIDENTIALITY WARNING: This email message is confidential and intended only for the named recipient(s). If you are not the intended recipient, or an agent responsible for delivering it to the intended recipient, or if this message has been sent to you in error, you are hereby notified that any review, use, dissemination, distribution or copying of this message or its contents is strictly prohibited. If you have received this message in error, please notify the sender immediately and delete the original message. If there is an agreement attached with this message, such agreement will not be binding until it is signed by all parties named therein.
Note that the EPEL builds of unbound use --enable-debug, which might make them slower. I am not sure how big the impact is. The stock unbound.conf also does not aggressively use most of your memory for a dedicated nameserver, so you might want to tweak the stock config file. For some hints, see:
I would not use chroot on a dedicated nameserver. All your important stuff is already inside the chroot, not outside it. Also, with RHEL/CentOS you should use and trust the SELinux policies: they provide a much better security context without having to install or link various (sometimes outdated) binaries, special devices, or config files into the chroot. And no surprises when you send the daemon a signal and it can no longer read its config files or includes.
Is the default --enable-debug?
No, it is not the default, so you should be fine. It is still surprising that you're not outrunning BIND, though. Are you sure you are comparing similar configurations, e.g. with DNSSEC validation and the root key loaded, and perhaps with DLV?
What version of libevent are you using?
Why are you disabling threads?
Is it finding SSL (you did not add --with-ssl)? I've seen a lot of speed differences with different versions of OpenSSL.
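For comparison, a build sketch like the following makes those choices explicit rather than relying on autodetection. The install paths are placeholders; adjust them to wherever your OpenSSL and libevent actually live:

```shell
# Hypothetical build sketch: point configure at a specific OpenSSL and
# libevent, and enable pthreads explicitly. Paths are placeholders.
./configure --with-ssl=/usr --with-libevent=/usr --with-pthreads
make
make install
```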
> All your important stuff
> is already inside the chroot, not outside it.
Assuming there is a bug in Unbound (OpenBSD is thinking of adopting it, so it must be good), then where your important stuff is does matter. And then likely so do all the binaries etc. (if they have not been removed) that could be used for privilege escalation. It certainly can't hurt.
> (sometimes outdated)
> binaries or special devices or config files in the chroot.
Will you look after it, or leave it to get dusty?
> Is it finding ssl (you did not add --with-ssl). I've seen a lot of
> speed differences with different versions of openssl.
Can you remember which one was slow and which was fast?
> Assuming there is a bug in Unbound (OpenBSD is thinking of adopting it,
> so it must be good), then where your important stuff is does matter.
> And then likely so do all the binaries etc. (if they have not been
> removed) that could be used for privilege escalation. It certainly
> can't hurt.
What I meant was: the only valuable data on a dedicated nameserver resides within the chroot; there is no need to get outside it. It's the compromise of the nameserver data that matters, not the host (the host is really just a container).
> (sometimes outdated)
> binaries or special devices or config files in the chroot.
> Will you look after it or leave it to get dusty.
I don't use chroot, so I do not have duplicate/old binaries around.
> Is it finding ssl (you did not add --with-ssl). I've seen a lot of
> speed differences with different versions of openssl.
> Can you remember which one was slow and which was fast?
0.9.[678] was faster than 1.0.0beta, but I think 1.0.0 was fastest.
Not sure, but could this be the kernel version of RHEL 5 (which is very old)? It may have a 'thundering herd' bug, so that only your first thread gets to do all the work. (I have had private reports of such old kernels showing bad behaviour.)
Could it be that if you hard-code the interfaces (not 0.0.0.0 in unbound.conf but the exact IP address), the OS route selection does not take time to determine a source interface? (Unless you use interface-automatic, in which case this makes no difference.)
You are increasing num-threads, right? (Even if you compile without pthreads, num-threads will still use multiple CPUs.)
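Put together, the two suggestions above would look roughly like this in unbound.conf (the address is a placeholder, not from the thread):

```
server:
    # exact IP instead of 0.0.0.0, so route selection is done once
    interface: 192.0.2.53
    # one thread (or forked process, without pthreads) per CPU
    num-threads: 4
```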
I think it's pretty well documented in the mail you sent a link to... you set up two unbound instances and mangle the traffic from a set of IP addresses using the standard firewall/NAT features your operating system has.
Anyway, maybe if you can explain what you are trying to accomplish, we can propose an alternative without views.
On specific resolvers we use BIND views to direct clients that come from an IP in a specific CIDR to a specific zone. We have two cases of these views.
We also use views to isolate clients that should only use internal zones from those that should not use internal zones (external customers).
Clients that do not come from an IP in a specific CIDR use a global zone.
It should be fairly easy to accomplish both options using DNAT on Linux (or other translation mechanisms, either on the router or on the end box).
For example, on Linux, suppose:
- 10.10.10.1 is the normal address
- 10.10.10.2 is an extra address you use to serve internal clients (it can be localhost if NATed on the box)
- 192.168.1.1/32 is the specific CIDR
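With those addresses, the translation itself might look like this. This is only a sketch of the idea; the exact chain and rules depend on whether the NAT happens on the router or on the end box:

```shell
# Hypothetical sketch: rewrite DNS queries from the specific CIDR so they
# land on the internal-view instance at 10.10.10.2 instead of 10.10.10.1.
iptables -t nat -A PREROUTING -p udp -s 192.168.1.1/32 -d 10.10.10.1 \
    --dport 53 -j DNAT --to-destination 10.10.10.2
iptables -t nat -A PREROUTING -p tcp -s 192.168.1.1/32 -d 10.10.10.1 \
    --dport 53 -j DNAT --to-destination 10.10.10.2
```

Clients outside the CIDR keep hitting 10.10.10.1 and see the global zones; the instance on 10.10.10.2 serves the internal zones.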
If you do the NAT on the router beforehand, it has the added benefit of splitting the load (so you can provide a less loaded service to your customers, etc.).
Unbound doesn't have to know. You just configure multiple instances of unbound (e.g. running on 127.0.0.2, 127.0.0.3, etc.) and do all the logic at the routing level. Of course it's not suitable if you have a complicated setup with many views or overlapping views.
AFAIK the design decision for Unbound was to keep it simple, efficient, secure and fast, so it doesn't implement everything you'll find in other DNS software.