Stub-prime unexpected behavior

I internally override an externally visible domain to be able to give
different answers with a config like:

stub-zone:
     name: "example.com"
     stub-addr: 10.1.2.3
     stub-addr: 10.1.2.4
     stub-prime: yes

I recently upgraded from Unbound 1.4.4 to 1.4.19, and after it had been
running for a few hours I noticed that queries for foo.bar.example.com (an
internal-only name) started returning NXDOMAIN. When this happens, "dig -t ns
example.com" shows the external NS records.

It turned out that I had poorly configured a subdomain of example.com with a
lame delegation to itself, and Unbound would eventually stop talking to
10.1.2.3 and 10.1.2.4 because of this, claiming "debug: No more query
targets, attempting last resort". It then does what the documentation
for "stub-first" describes, even though I don't have it enabled, and goes and
looks up the nameservers for "example.com" starting with the roots.
Unfortunately, this means it starts answering queries using the external
nameservers instead of the internal ones.

Is this the expected behavior of stub-prime? It seems to be a change from
how it was behaving in Unbound 1.4.4.

Disabling stub-prime seems to fix this.
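
For reference, the workaround amounts to the same stub-zone with priming
turned off. This is only a sketch, using the placeholder name and addresses
from the config above:

```
stub-zone:
     name: "example.com"
     stub-addr: 10.1.2.3
     stub-addr: 10.1.2.4
     # priming disabled; with this I no longer see the fallback to the roots
     stub-prime: no
```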

See the sanitized relevant snippet of unbound-host output below. I can send
a larger unsanitized chunk privately if this isn't enough.

Thanks!

                                     -- Aaron

Hi Aaron,

> I internally override an externally visible domain to be able to
> give different answers with a config like:
>
> stub-zone:
>      name: "example.com"
>      stub-addr: 10.1.2.3
>      stub-addr: 10.1.2.4
>      stub-prime: yes
>
> I recently upgraded from Unbound 1.4.4 to 1.4.19, and after it had
> been running for a few hours I noticed that queries for
> foo.bar.example.com (an internal-only name) started returning
> NXDOMAIN. When this happens, "dig -t ns example.com" shows the
> external NS records.
>
> It turned out that I had poorly configured a subdomain of
> example.com with a lame delegation to itself, and Unbound would
> eventually stop talking to 10.1.2.3 and 10.1.2.4 because of this,
> claiming "debug: No more query targets, attempting last resort".
> It then does what the documentation for "stub-first" describes,
> even though I don't have it enabled, and goes and looks up the
> nameservers for "example.com" starting with the roots.
> Unfortunately, this means it starts answering queries using the
> external nameservers instead of the internal ones.
>
> Is this the expected behavior of stub-prime? It seems to be a
> change from how it was behaving in Unbound 1.4.4.

Not for stub-prime as such. The newly introduced behaviour for 'normal
referrals' is to check at the parent as a last resort to get
information. When you add a stub-zone with stub-prime: yes, this
also activates.

> Disabling stub-prime seems to fix this.

Because it does not fail over to the parent as a last resort.

> See the sanitized relevant snippet of unbound-host output below. I
> can send a larger unsanitized chunk privately if this isn't
> enough.

Not sure if I should fix this, or not. Is it merely unexpected, or
undesirable?

Best regards,
   Wouter

> Not for stub-prime as such. The newly introduced behaviour for 'normal
> referrals' is to check at the parent as a last resort to get
> information. When you add a stub-zone with stub-prime: yes, this
> also activates.

Based on the current documentation, I would expect this to only activate
with stub-first.

Stub-zone's stated purpose is "to configure authoritative data to be used by
the resolver that cannot be accessed using the public internet servers.
This is useful for company-local data or private zones." Thus I expect it to
never try to start at the roots, otherwise there's no point to using
stub-zone for private data.

Stub-prime claims that it "performs NS set priming, which is similar to root
hints, where it starts using the list of nameservers currently published by
the zone. Thus, if the hint list is slightly outdated, the resolver picks
up a correct list online." I had assumed the correct list would only come
from the listed stub-host or stub-addr, not that this implies the fallback
behavior described under stub-first.

> Because it does not fail over to the parent as a last resort.

How does "stub-prime: yes, stub-first: no" differ from "stub-prime: yes,
stub-first: yes", if the former falls back to using the roots? Doesn't
"stub-first: no" tell it to not do that?

> Not sure if I should fix this, or not. Is it merely unexpected, or
> undesirable?

I would like "stub-prime: yes, stub-first: no" to still mean "I'm giving you
a private zone that can't be reached from the roots, please don't try and
ask them." If that's not what it means, I think updating the stub-prime
documentation to include the fallback behavior is necessary.
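
Spelled out as a config, the combination I have in mind is the following.
This is a sketch with the same placeholder addresses as before, and the
comments describe the behaviour I would expect, not necessarily what 1.4.19
does:

```
stub-zone:
     name: "example.com"
     stub-addr: 10.1.2.3
     stub-addr: 10.1.2.4
     # expected: refresh the NS set, but only via the servers listed here
     stub-prime: yes
     # expected: never fall back to resolving from the roots
     stub-first: no
```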

Thanks for looking at this,

                                     -- Aaron

Hi Aaron,

>> Not for stub-prime as such. The newly introduced behaviour for 'normal
>> referrals' is to check at the parent as a last resort to get
>> information. When you add a stub-zone with stub-prime: yes, this
>> also activates.

>> Because it does not fail over to the parent as a last resort.

> How does "stub-prime: yes, stub-first: no" differ from
> "stub-prime: yes, stub-first: yes", if the former falls back to
> using the roots? Doesn't "stub-first: no" tell it to not do that?

>> Not sure if I should fix this, or not. Is it merely unexpected,
>> or undesirable?

The issue is resolved in svn. Unbound now does not go to the servers
above the configured zone if the configured servers fail. This
applies to both stub-zone and forward-zone.

The actual bug was that it forgot the setting for stub-first when it
did the stub-prime.
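
As an illustration of the two cases covered by the fix, a config along
these lines should now stay with the configured servers when they fail.
This is a sketch only: the forward-zone name and address are made-up
placeholders, and stub-first and forward-first are written out explicitly
even though "no" is the default.

```
stub-zone:
     name: "example.com"
     stub-addr: 10.1.2.3
     stub-addr: 10.1.2.4
     stub-prime: yes
     stub-first: no

forward-zone:
     # hypothetical zone name and address, for illustration only
     name: "internal.example"
     forward-addr: 10.1.2.5
     forward-first: no
```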

Best regards,
   Wouter