"warning: diff: RR <madrid., RRSIG> rdata element 0 differs from RR num 0 rdata (rdata data)

I get strange error reports in my nsd log:

[2016-02-22 06:01:38.256] nsd[1466]: info: xfrd: zone . committed "received update to serial 2016
022200 at 2016-02-22T06:01:38 from 2001:559:8000::7"
[2016-02-22 06:01:38.257] nsd[1467]: warning: diff: RR <madrid., RRSIG> rdata element 0 differs f
rom RR num 0 rdata (rdata data)
[2016-02-22 06:01:38.257] nsd[1467]: warning: diff: RR <madrid., RRSIG> rdata element 4 differs f
rom RR num 1 rdata (rdata data)
[2016-02-22 06:01:38.257] nsd[1467]: warning: diff: RR <madrid., RRSIG> does not exist
[2016-02-22 06:01:38.257] nsd[1467]: warning: diff: RR <edu., RRSIG> rdata element 0 differs from
RR num 0 rdata (rdata data)
[2016-02-22 06:01:38.257] nsd[1467]: warning: diff: RR <edu., RRSIG> rdata element 4 differs from
RR num 1 rdata (rdata data)
[2016-02-22 06:01:38.257] nsd[1467]: warning: diff: RR <edu., RRSIG> does not exist
[Tons of similar messages]

[2016-02-22 06:01:38.306] nsd[1467]: warning: diff: RR <., RRSIG> does not exist
[2016-02-22 06:01:38.306] nsd[1467]: warning: diff: RR <edeka., RRSIG> rdata element 4 differs from RR num 0 rdata (rdata data)
[2016-02-22 06:01:38.306] nsd[1467]: warning: diff: RR <edeka., RRSIG> rdata element 0 differs from RR num 1 rdata (rdata data)
[2016-02-22 06:01:38.306] nsd[1467]: warning: diff: RR <edeka., RRSIG> does not exist
[2016-02-22 06:01:38.318] nsd[1467]: error: Failed to apply IXFR cleanly (deletes nonexistent RRs, adds existing RRs). Zone . contents is different from master, starting AXFR. Transfer received update to serial 2016022200 at 2016-02-22T06:01:38 from 2001:559:8000::7
[2016-02-22 06:01:38.318] nsd[1467]: info: zone . received update to serial 2016022200 at 2016-02-22T06:01:38 from 2001:559:8000::7 of 754764 bytes in 1.15702 seconds
[2016-02-22 06:01:38.320] nsd[1466]: error: xfrd: zone .: soa serial 2016022200 update failed, restarting transfer (notified zone)

This machine is a root name server (for the DNS Yeti project). The
problem appeared "recently" (for some value of "recent").

The main consequence is the filling of the logs but it may create
other problems. (The later zone transfer worked:

[2016-02-22 06:01:47.389] nsd[1466]: info: xfrd: zone . committed "received update to serial 2016022200 at 2016-02-22T06:01:47 from 240c:f:1:22::7"
[2016-02-22 06:01:47.466] nsd[1467]: info: zone . received update to serial 2016022200 at 2016-02-22T06:01:47 from 240c:f:1:22::7 of 824909 bytes in 7.15648 seconds
[2016-02-22 06:01:47.469] nsd[1466]: info: zone . serial 2016022101 is updated to 2016022200.

NSD version 4.1.7

Hi Stephane,

I get strange error reports in my nsd log:

It is exactly what it says, the IXFR is bad, and seems to apply to a
different zone data. It is like you have different masters that are
signing in a dual-sign (failover) situation, but they have diverged.
This IXFR was received from the other master, who uses a different
baseline version (especially the signatures) and thus the diff does
not apply cleanly.

NSD has a config option in nsd.conf (for request-xfr) 'AXFR' that
disables IXFR requests towards that master. This is a short term fix.

Likely the actual flaw is the master servers that are out-of-sync.
Perhaps this is even visible straight away - AXFR from different
masters gives different RRSIG data. Or perhaps it is only for the
affected RRs, NSD tries to print exactly what went wrong.

The reason for the verbose output is we get this when a master serves
bad IXFRs (couldn't keep their idea of serial xyz the same), and we
then prove what happened. I.e. software on the other side, or perhaps
operator loading a different file with the same serial number. We get
this often enough that it falls back to AXFR for these cases (and thus
recovers nicely until the operator makes a serial number mistake
again). But I think in your case, more likely some sort of multiple
signers that get out of synchrony?

Best regards, Wouter

a message of 107 lines which said:

It is like you have different masters that are signing in a
dual-sign (failover) situation, but they have diverged. This IXFR
was received from the other master, who uses a different baseline
version (especially the signatures) and thus the diff does not apply
cleanly.

Exactly: we are testing a multi-master setup so you are certainly
right.

a message of 107 lines which said:

NSD has a config option in nsd.conf (for request-xfr) 'AXFR' that
disables IXFR requests towards that master. This is a short term
fix.

Temporary workaround applied, while the discuss the multi-master setup
of Yeti.