I have been running unbound as a resolver on routers for some years now,
with some local domain overrides. On the router, bind912 is installed as
a secondary authoritative server for the local zones, serving on port
5053, so the unbound config has "do-not-query-localhost: no" and the
appropriate forward zones to 127.0.0.1@5053.
This setup worked like a charm up to unbound 1.7.3. After upgrading to
1.8.1/1.8.2, the unbound process stops resolving local domains from the
override after some minutes. Older requests are served correctly from
cache, but newer ones are sent upstream, which of course fails with an
unknown TLD. After flushing the local domain, all following requests go
upstream. Nothing is logged.
Restarting the unbound process heals the situation for some minutes,
but then the problem arises again. Replacing the unbound binary with the
1.7.3 version fixes the problem.
The routers are OPNsense AMD64 (18.1 had unbound 1.7.3; 18.7 has
unbound 1.8.1, and 1.8.2 in its latest version).
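For readers who want to picture the setup described above, here is a minimal sketch of the relevant unbound.conf parts. The zone name "home.lan" is a hypothetical placeholder, and the config actually generated by OPNsense may be laid out differently.

```
server:
    # Permit sending queries to 127.0.0.1; Unbound refuses to query
    # localhost unless this is set to "no".
    do-not-query-localhost: no

forward-zone:
    # "home.lan" is an assumed example name, not taken from the report.
    name: "home.lan"
    # The local BIND secondary listening on port 5053.
    forward-addr: 127.0.0.1@5053
```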
I don't see what change in 1.8.0 would cause the problem. The new
defaults in 1.8.0 enable harden-below-nxdomain and minimal-responses.
Does turning those off change the behaviour?
Can you run with high verbosity (4 or 5)? Then you would have logs,
which should explain why it is going upstream.
Also, about your config: the forward-zone to the local BIND server is
not called a local-zone (local-zone is the term for static data or
special processing that is performed before the cache is checked or
resolution is performed by Unbound). Disable forward-first if it is
enabled. If it is really a stub-zone, disable stub-prime: priming would
have picked up the NS records, likely via qname minimisation(?), and
Unbound would then go to those upstream servers instead of localhost.
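The diagnostic suggestions above might look roughly like this in unbound.conf. This is only a sketch: "home.lan" is a hypothetical zone name, and forward-first already defaults to "no", so the line matters only if the generated config turned it on. For a stub-zone, the corresponding option would be stub-prime inside the stub-zone clause.

```
server:
    # Detailed query-level logging; expect much higher CPU and log volume.
    verbosity: 4

forward-zone:
    # Assumed example name, not taken from the report.
    name: "home.lan"
    forward-addr: 127.0.0.1@5053
    # Do not fall back to normal recursion if the forwarder fails.
    forward-first: no
```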
I don't see what change in 1.8.0 would cause the problem. The new
defaults in 1.8.0 enable harden-below-nxdomain and minimal-responses.
Does turning those off change the behaviour?
See below...
Can you run with high verbosity (4 or 5)? Then you would have logs,
which should explain why it is going upstream.
I did so, and it brought the firewall to its knees. The CPU went to
nearly 800% (8-core C3758) and stayed there. With verbosity down to 3,
unbound uses a low single-digit percentage.
Also, about your config: the forward-zone to the local BIND server is
not called a local-zone (local-zone is the term for static data or
special processing that is performed before the cache is checked or
resolution is performed by Unbound). Disable forward-first if it is
enabled. If it is really a stub-zone, disable stub-prime: priming would
have picked up the NS records, likely via qname minimisation(?), and
Unbound would then go to those upstream servers instead of localhost.
The config is mostly generated by OPNsense; I didn't touch it.
After failing so miserably to enable proper logging, I tried a few
things (ip6), and found that just setting harden-below-nxdomain to no
prevents the failure.
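For anyone hitting the same symptom, the workaround amounts to a single server option. This is a sketch of the setting only; where to place custom options in an OPNsense-generated config is not covered here, and the thread has not established why the hardening interferes with the forward zone.

```
server:
    # Workaround reported in this thread: revert the hardening that
    # became the default in 1.8.0.
    harden-below-nxdomain: no
```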