Unbound: slowness issues.

Hello,

I am running Unbound on FreeBSD, initially 10.3 and now 11. I first tried the version in the FreeBSD base system, and I am now using the port (unbound-1.5.10) compiled with libevent support.

The problem I am experiencing is that, from time to time, Unbound becomes extremely slow or resolves nothing, or almost nothing.

I made several changes to unbound.conf, and the problem now returns about once a day when I am the only user of Unbound as a resolver. If a second user begins using Unbound at the same time, it becomes slow as described until it is back to a single user.

I opened a thread on the FreeBSD forum, which has more information:

https://forums.freebsd.org/threads/57493/

I should add that I also tried disabling the PF firewall to rule out a firewall-related issue, without success. This is my current unbound.conf:

# This file was generated by local-unbound-setup.
# Modifications will be overwritten.
server:
         port: 53
         username: unbound
         directory: /usr/local/etc/unbound
         chroot: /usr/local/etc/unbound
         pidfile: /usr/local/etc/unbound/unbound.pid
         auto-trust-anchor-file: /usr/local/etc/unbound/root.key
         root-hints: "/usr/local/etc/unbound/root.hints"

         logfile: log/unbound.log
         log-time-ascii: yes
         val-log-level: 2

         do-ip6: no
         do-tcp: yes

         interface: 127.0.0.2
         interface: 192.168.0.220

         access-control: 127.0.0.2/16 allow
         access-control: 192.168.0.0/24 allow

         private-address: 192.168.0.0/24
         private-domain: mydomain.com

         qname-minimisation: yes
         minimal-responses: no
         hide-identity: yes
         hide-version: yes
         do-not-query-localhost: no
         val-clean-additional: yes

         harden-glue: yes
         harden-dnssec-stripped: yes

         unwanted-reply-threshold: 10000

         prefetch: yes
         prefetch-key: yes

         cache-min-ttl: 3600
         cache-max-ttl: 86400

         num-threads: 4
         msg-cache-slabs: 8
         rrset-cache-slabs: 8
         infra-cache-slabs: 8
         key-cache-slabs: 8
         rrset-cache-size: 100m
         msg-cache-size: 50m
         outgoing-range: 8192
         num-queries-per-thread: 4096
         so-rcvbuf: 1m
         so-sndbuf: 1m

         unblock-lan-zones: yes
         insecure-lan-zones: yes

include: /usr/local/etc/unbound/conf.d/*.conf

#forward-zone:
# name: .
# forward-addr: 189.38.95.95
# forward-addr: 189.38.95.96

remote-control:
         control-enable: yes
         control-interface: /usr/local/etc/unbound/unbound.ctl
         control-use-cert: no

Thank you!
Alex.

Let me add that I am using LibreSSL instead of OpenSSL.

Thank you.

For the record, I am also running the latest version of Unbound (1.5.10) on FreeBSD 10.3, built with the libevent option, and I have no problems whatsoever.

Recommended things to check:

- sysctl limits for network buffers, especially TCP buffers, since the spread of DNSSEC means that TCP-based DNS traffic is increasing.

- if you use a stateful firewall, check the limit on the maximum number of states, since you can run out quite easily. Stateless rules for DNS traffic are recommended. Also check the limits on fragmented packets.

- try to monitor your system resource usage, especially memory - do you have enough? does the system swap during peaks in traffic?

- check logs for messages concerning failures to send packets, limits for various resources reached, etc

Also, my servers are constantly bombarded by bogus queries about bogus domains featuring non-responsive authoritative nameservers (targets of some DDoS attack, if I understand it correctly), and such queries can exhaust your resources rapidly, since each unresolved TCP query consumes a portion of memory before it times out. Use the command "unbound-control dump_requestlist" to check what queries are being resolved during the time the server appears to be non-responsive/slow. I had to implement a countermeasure that recognizes these bogus queries and replies with NXDOMAIN RCODE immediately, saving the resolver's memory for legitimate traffic.

I am not saying that there cannot be a problem with the newest version of Unbound, just reporting everything is fine here and trying to provide some tips.

Following that advice, I ran “unbound-control dump_requestlist” and found what seems to be Unbound trying to resolve IPv6 addresses instead of IPv4.

I do not have IPv6 configured on the server, and I have “do-ip6: no” set explicitly in unbound.conf.

thread #0
type cl name seconds module status
0 A IN blade.4t2.com. - iterator wait for 217.11.57.53
1 AAAA IN www.edicron.com. 40.960788 iterator wait for 217.160.83.143
2 AAAA IN www.edicron.com.privacychain.ch. 10.932778 iterator wait for 185.148.76.30
3 AAAA IN www.tubetown.de. 6.024901 iterator wait for 88.198.65.232
4 AAAA IN www.eurotubes.com. 11.084678 iterator wait for 208.109.255.22
5 AAAA IN www.tubemonger.com. 10.982738 iterator wait for 69.49.191.246
6 AAAA IN www.diyhifisupply.com. 40.981773 iterator wait for 216.35.197.129
7 AAAA IN www.diyhifisupply.com.privacychain.ch. 10.954016 iterator wait for 185.148.76.30
8 AAAA IN www.hificollective.co.uk. 41.052734 iterator wait for 212.67.202.2
9 AAAA IN www.hificollective.co.uk.privacychain.ch. 11.024719 iterator wait for 46.16.200.135
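
A listing like the one above can be filtered for stalled entries with a short script. The following is a minimal sketch in Python, assuming each data row has the form index, type, class, name, seconds, then module status, as in the output shown here. The names parse_requestlist and stalled and the 10-second threshold are illustrative choices, not part of Unbound.

```python
from typing import List, NamedTuple, Optional

# Three rows copied from the listing above, used as sample input.
SAMPLE = """\
thread #0
type cl name seconds module status
0 A IN blade.4t2.com. - iterator wait for 217.11.57.53
1 AAAA IN www.edicron.com. 40.960788 iterator wait for 217.160.83.143
3 AAAA IN www.tubetown.de. 6.024901 iterator wait for 88.198.65.232
"""


class Request(NamedTuple):
    index: int
    qtype: str
    qclass: str
    name: str
    seconds: Optional[float]  # None when the listing shows "-"
    status: str


def parse_requestlist(text: str) -> List[Request]:
    """Parse data rows; header lines such as 'thread #0' are skipped."""
    requests = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric index and have at least five fields.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        index, qtype, qclass, name, seconds = fields[:5]
        age = None if seconds == "-" else float(seconds)
        requests.append(
            Request(int(index), qtype, qclass, name, age, " ".join(fields[5:]))
        )
    return requests


def stalled(requests: List[Request], threshold: float = 10.0) -> List[Request]:
    """Return requests that have been waiting at least `threshold` seconds."""
    return [r for r in requests if r.seconds is not None and r.seconds >= threshold]


for req in stalled(parse_requestlist(SAMPLE)):
    print(req.qtype, req.name, req.seconds, req.status)
```

Grouping the stalled entries by query type or by the address in the status column makes it easier to see whether one kind of query or one upstream server accounts for the delays.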

Thank you.

Hi Alex,

Your requestlist has AAAA queries in it, destined for IPv4 addresses.
The wait times are very long; they look stalled.

Unbound generates AAAA queries internally, but only when do-ip6 is
enabled. You have it disabled.

Your clients must therefore be the ones asking for AAAA records. Is
the firewall blocking query type AAAA? Blocking a query type causes
this kind of trouble. Unbound cannot tell the difference between such
'random filtering' and a 'down server', and therefore must cease
sending traffic, including your type A requests. This causes
resolution to stop.

If you want to filter out queries on some sort of 'random' topic,
return a reply with an error code set. Otherwise Unbound can only
conclude that the server is unreachable.
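
The reasoning above can be illustrated with a toy model in Python. This is only a sketch of the general idea, not Unbound's actual server-tracking logic: the class ToyServerState, the function simulate, and the threshold of three timeouts are all hypothetical and chosen purely for illustration.

```python
class ToyServerState:
    """Toy reachability tracker for one upstream address (hypothetical)."""

    def __init__(self, max_timeouts: int = 3) -> None:
        # Assumed threshold for this sketch; not an Unbound default.
        self.max_timeouts = max_timeouts
        self.timeouts = 0

    @property
    def considered_down(self) -> bool:
        return self.timeouts >= self.max_timeouts

    def record(self, answered: bool) -> None:
        # Any reply, even one carrying an error code, shows the server is alive.
        if answered:
            self.timeouts = 0
        else:
            self.timeouts += 1


def simulate(drop_aaaa_silently: bool) -> bool:
    """Return True if an A query would still be sent after three AAAA queries."""
    state = ToyServerState()
    for _ in range(3):
        # A silent drop looks like a timeout; an error reply counts as an answer.
        state.record(answered=not drop_aaaa_silently)
    return not state.considered_down


print("silent drop, A queries still sent:", simulate(drop_aaaa_silently=True))
print("error reply, A queries still sent:", simulate(drop_aaaa_silently=False))
```

In this model the tracker only sees whether a reply arrived, so silently dropped AAAA queries count against the server as a whole and type A queries are affected too, while an error reply keeps the server marked as reachable.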

Best regards, Wouter