NSD 3: slave too aggressive

I am testing NSD on a machine that is secondary for about 500 zones,
many of them lame (for instance, the master no longer replies).

When that happens, nsd keeps asking again and again and logs every
attempt:

[1166301627] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301627] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301628] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301638] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301638] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301639] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301649] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301649] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301650] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301662] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301662] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3
[1166301663] nsd[21603]: error: xfrd: zone 202.16.192.in-addr.arpa received error code REFUSED from 137.39.1.3

Three requests every ten seconds, and it does not stop even after
several hours! Is this normal?

Unfortunately yes. NSD handles missing secondaries/primaries pretty badly. I was
hoping nsd3, which I haven't tested yet, would handle this better.

Paul

Hi,

I've fixed it: for secondary zones without any data, NSD now does
exponential backoff, from about 10 seconds up to a maximum of about
4 hours. The intervals are randomized, so those numbers are approximate.
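
To make the schedule concrete, here is a minimal Python sketch of a randomized exponential backoff with those bounds. It is not the NSD code: the doubling factor, the +/-10% jitter and the names next_backoff, MIN_BACKOFF and MAX_BACKOFF are assumptions chosen for illustration; only the "about 10 seconds" and "about 4 hours" limits come from the description above.

```python
import random

MIN_BACKOFF = 10        # seconds, the "about 10 seconds" starting point
MAX_BACKOFF = 4 * 3600  # seconds, the "about 4 hours" ceiling


def next_backoff(failures, jitter=0.1, rng=random):
    """Delay in seconds before the next attempt, after `failures` failed attempts.

    Doubling per failure and +/-10% jitter are illustrative assumptions.
    """
    # Grow exponentially, but never beyond the ceiling.
    base = min(MIN_BACKOFF * 2 ** failures, MAX_BACKOFF)
    # Randomize so that hundreds of lame zones do not retry in lockstep.
    return base * rng.uniform(1 - jitter, 1 + jitter)


if __name__ == "__main__":
    for n in range(13):
        print(n, round(next_backoff(n)))
```

With these assumed parameters the delay reaches the ceiling after 11 failures, so a permanently lame zone ends up being tried roughly once every 4 hours instead of every 10 seconds.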

For zones that have content (and thus an SOA), or have ever had
content, the retry value from the SOA is used, without exponential
backoff, as per the specs in RFC 1034.
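
The rule for picking between the two intervals can be sketched as follows. Again this is only an illustration in Python, not the NSD implementation: the Zone class, its fields and retry_delay are hypothetical names, and the backoff delay is passed in by the caller rather than computed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Zone:
    name: str
    # RETRY field (seconds) of the last SOA seen for this zone,
    # or None if the zone has never had any content.
    soa_retry: Optional[int] = None


def retry_delay(zone: Zone, backoff_delay: float) -> float:
    """Seconds to wait before asking the master again.

    `backoff_delay` is whatever the exponential backoff currently yields;
    it is only used for zones that never had data.
    """
    if zone.soa_retry is not None:
        # Zone has (or once had) an SOA: fixed SOA retry, no backoff.
        return float(zone.soa_retry)
    return backoff_delay


if __name__ == "__main__":
    print(retry_delay(Zone("example.org.", soa_retry=3600), 640.0))
    print(retry_delay(Zone("202.16.192.in-addr.arpa."), 640.0))
```

The design point is that a zone which was transferred at least once keeps the timing its own SOA asks for, while a zone that never loaded has no SOA to consult and falls back to the backoff.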

The fix is slated for 3.0.4.

Best regards,
   Wouter

Paul Wouters wrote: