[PATCH] Round-robin doesn't work with some Windows clients

Hi,
We have found a problem in Unbound Round-Robin algorithm.
One of our clients has a DNS record with two IP addresses.
When Windows clients try to resolve DNS name, they receive these two IPs
always in the same order. OpenBSD clients have no problem (= the order
changes).
tcpdump shows that Windows sends incrementing query ID, and OpenBSD uses
random ID for each query.
Unbound uses the following construct to generate "random" number to reorder
RRs in the answer:
util/data/msgencode.c:
rr_offset = RRSET_ROUNDROBIN?id:0;

The problem here is that "id" is read from wire without converting it to Host
byte order. So that only high bits of ID change. When there are only two RRs
in the reply, this code in msgencode.c will always generate the same
sequence:
j = (i + rr_offset) % data->count;

The problem can be easily solved by converting query ID to Host byte order:

/* roundrobin offset. using query id for random number */
-rr_offset = RRSET_ROUNDROBIN?id:0;
+rr_offset = RRSET_ROUNDROBIN ? ntohs(id) : 0;

We have tested this with incrementing query IDs and it works fine.

However it may be better to change the way "id" is read from wire (reading +
converting to Host byte order, and converting back when writing back to
wire).

Hi Ilya,

Hi, We have found a problem in Unbound Round-Robin algorithm. One
of our clients has a DNS record with two IP addresses. When Windows
clients try to resolve DNS name, they receive these two IPs always
in the same order. OpenBSD clients have no problem (= the order
changes). tcpdump shows that Windows sends incrementing query ID,
and OpenBSD uses random ID for each query. Unbound uses the
following construct to generate "random" number to reorder RRs in
the answer: util/data/msgencode.c: rr_offset =
RRSET_ROUNDROBIN?id:0;

The problem here is that "id" is read from wire without converting
it to Host byte order. So that only high bits of ID change. When
there are only two RRs in the reply, this code in msgencode.c will
always generate the same sequence: j = (i + rr_offset) %
data->count;

The problem can be easily solved by converting query ID to Host
byte order:

/* roundrobin offset. using query id for random number */
-rr_offset = RRSET_ROUNDROBIN?id:0; +rr_offset = RRSET_ROUNDROBIN ?
ntohs(id) : 0;

We have tested this with incrementing query IDs and it works fine.

Thank you for this patch, I have committed it to the source code.

Best regards,
   Wouter