Parent Child Disagreement Redux

I believe I'm still having problems related to the
parent-child-disagreement problem.

Running the latest unbound from svn (revision 2200). I've tried using
fork mode and threaded+libevent, same outcome, getting SERVFAILs:

$ dig @127.0.0.1 www.us.hsbc.com. in any +trace

; <<>> DiG 9.6.2-P2 <<>> @127.0.0.1 www.us.hsbc.com. in any +trace
; (1 server found)
;; global options: +cmd
. 518400 IN NS b.root-servers.net.
. 518400 IN NS h.root-servers.net.
. 518400 IN NS k.root-servers.net.
. 518400 IN NS m.root-servers.net.
. 518400 IN NS l.root-servers.net.
. 518400 IN NS c.root-servers.net.
. 518400 IN NS e.root-servers.net.
. 518400 IN NS g.root-servers.net.
. 518400 IN NS d.root-servers.net.
. 518400 IN NS j.root-servers.net.
. 518400 IN NS f.root-servers.net.
. 518400 IN NS a.root-servers.net.
. 518400 IN NS i.root-servers.net.
;; Received 500 bytes from 127.0.0.1#53(127.0.0.1) in 56 ms

com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
;; Received 493 bytes from 192.5.5.241#53(f.root-servers.net) in 80 ms

hsbc.com. 172800 IN NS ns3.hsbc.com.
hsbc.com. 172800 IN NS ns4.hsbc.com.
;; Received 101 bytes from 192.55.83.30#53(m.gtld-servers.net) in 161 ms

us.hsbc.com. 7200 IN NS auth61.ns.uu.net.
us.hsbc.com. 7200 IN NS auth00.ns.uu.net.
;; Received 84 bytes from 193.108.73.36#53(ns3.hsbc.com) in 2358 ms

www.us.hsbc.com. 600 IN NS vhprdgss01.hsbc.com.
www.us.hsbc.com. 600 IN NS phprdgss01.hsbc.com.
;; Received 83 bytes from 198.6.1.182#53(auth61.ns.uu.net) in 44 ms

www.us.hsbc.com. 20 IN A 161.113.4.6
;; Received 49 bytes from 161.113.7.248#53(vhprdgss01.hsbc.com) in 39 ms

$ dig @127.0.0.1 www.us.hsbc.com. in any

; <<>> DiG 9.6.2-P2 <<>> @127.0.0.1 www.us.hsbc.com. in any
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 22499
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.us.hsbc.com. IN ANY

;; Query time: 4059 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jul 20 13:41:58 2010
;; MSG SIZE rcvd: 33

Would debug output help? If so, what is the recommended verbosity setting?

Thanks!
-Dustin

Florian Weimer wrote:

* Dustin Marquess:

> I believe I'm still having problems related to the
> parent-child-disagreement problem.
>
> Running the latest unbound from svn (revision 2200). I've tried using
> fork mode and threaded+libevent, same outcome, getting SERVFAILs:
>
> $ dig @127.0.0.1 www.us.hsbc.com. in any +trace

Can you reproduce this without QTYPE=ANY?

Yes. A & CNAME both SERVFAIL:

$ dig @127.0.0.1 www.us.hsbc.com. in a

; <<>> DiG 9.6.2-P2 <<>> @127.0.0.1 www.us.hsbc.com. in a
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48625
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.us.hsbc.com. IN A

;; Query time: 2979 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jul 20 15:02:42 2010
;; MSG SIZE rcvd: 33

$ dig @127.0.0.1 www.us.hsbc.com. in cname

; <<>> DiG 9.6.2-P2 <<>> @127.0.0.1 www.us.hsbc.com. in cname
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 47317
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.us.hsbc.com. IN CNAME

;; Query time: 1061 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jul 20 15:02:51 2010
;; MSG SIZE rcvd: 33

-Dustin

Florian Weimer wrote:

* Dustin Marquess:

> I believe I'm still having problems related to the
> parent-child-disagreement problem.
>
> Running the latest unbound from svn (revision 2200). I've tried using
> fork mode and threaded+libevent, same outcome, getting SERVFAILs:
>
> $ dig @127.0.0.1 www.us.hsbc.com. in any +trace

Can you reproduce this without QTYPE=ANY?

what am i missing? the only problem i see is that the nameservers for
www.us.hsbc.com (broken loadbalancers?) don't respond to qtype=NS and
some other qtypes, e.g. MX and TXT. but they do send NOERROR for
qtype=AAAA.

What's missing is the A records :).

Eg:

$ host www.us.hsbc.com. 127.0.0.1
;; connection timed out; no servers could be reached

Works fine using dnscache & MaraDNS.

-Dustin

Dustin Marquess wrote:

> What's missing is the A records :).
>
> Eg:
>
> $ host www.us.hsbc.com. 127.0.0.1
> ;; connection timed out; no servers could be reached

no, the nameservers respond fine to qtype=A, unless you are suggesting
they intermittently fail?

    www.us.hsbc.com. 600 IN NS phprdgss01.hsbc.com.
    www.us.hsbc.com. 600 IN NS vhprdgss01.hsbc.com.
    ;; Received 83 bytes from 198.6.1.182#53(auth61.ns.uu.net) in 67 ms

    $ dig +short +norec @vhprdgss01.hsbc.com. a www.us.hsbc.com
    161.113.4.6
    $ dig +short +norec @phprdgss01.hsbc.com. a www.us.hsbc.com
    161.113.4.6

fwiw unbound 1.4.5 resolves www.us.hsbc.com/A for me, and i've never
noticed a problem looking up that domain.

> Dustin Marquess wrote:
>
>> What's missing is the A records :).
>>
>> Eg:
>>
>> $ host www.us.hsbc.com. 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
> no, the nameservers respond fine to qtype=A, unless you are suggesting
> they intermittently fail?
>
>     www.us.hsbc.com. 600 IN NS phprdgss01.hsbc.com.
>     www.us.hsbc.com. 600 IN NS vhprdgss01.hsbc.com.
>     ;; Received 83 bytes from 198.6.1.182#53(auth61.ns.uu.net) in 67 ms
>
>     $ dig +short +norec @vhprdgss01.hsbc.com. a www.us.hsbc.com
>     161.113.4.6
>     $ dig +short +norec @phprdgss01.hsbc.com. a www.us.hsbc.com
>     161.113.4.6

Yes, the NS servers for the domain return the A records. However a
client querying against the unbound cache never receives them:

$ dig @127.0.0.1 in a www.us.hsbc.com.

; <<>> DiG 9.6.2-P2 <<>> @127.0.0.1 in a www.us.hsbc.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 26467
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.us.hsbc.com. IN A

;; Query time: 3566 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jul 20 16:09:33 2010
;; MSG SIZE rcvd: 33

> fwiw unbound 1.4.5 resolves www.us.hsbc.com/A for me, and i've never
> noticed a problem looking up that domain.

us.hsbc.com works fine in unbound, www.us.hsbc.com does not.

Both work fine using dnscache or MaraDNS as the recursive DNS server.
Using Unbound under both FreeBSD & NetBSD, however, causes the above
behavior.

-Dustin

Not responding to NS queries would be bad enough. To make it worse, the
nameservers only respond to all-lowercase queries.

$ dig +short www.us.hsbc.com @161.113.3.248
161.113.4.6

$ dig +short wwW.Us.hsBc.cOm @161.113.3.248
;; connection timed out; no servers could be reached

That breaks 0x20 randomization ("use-caps-for-id: yes").
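
For anyone unfamiliar with 0x20 randomization, here is a rough Python sketch of the idea. It is not Unbound's actual code, and the function names are made up for illustration. The resolver randomizes the case of each letter in the query name and only trusts a reply that echoes the exact same case pattern. With these load balancers the mixed-case query apparently gets no reply at all, so the resolver times out and ends up at SERVFAIL.

```python
import random
import string


def randomize_case(name: str, rng: random.Random) -> str:
    """Randomly pick upper or lower case for every ASCII letter.

    Upper and lower case ASCII letters differ only in the 0x20 bit,
    which is where the technique gets its name.
    """
    out = []
    for ch in name:
        if ch in string.ascii_letters:
            out.append(ch.upper() if rng.getrandbits(1) else ch.lower())
        else:
            out.append(ch)  # dots, digits and hyphens are left alone
    return "".join(out)


def response_acceptable(sent_name: str, echoed_name: str) -> bool:
    """Accept a reply only if the question name is echoed byte for byte.

    A plain DNS name comparison would ignore case; the stricter check
    is what makes the case pattern usable as extra anti-spoofing entropy.
    """
    return sent_name == echoed_name


if __name__ == "__main__":
    rng = random.Random()
    qname = randomize_case("www.us.hsbc.com.", rng)
    print("query sent as:", qname)
    # A server that echoes the name unchanged passes the check.
    print("exact echo accepted:", response_acceptable(qname, qname))
    # A server that lowercases the name fails it, unless the random
    # pattern happened to come out all-lowercase.
    print("lowercased echo accepted:", response_acceptable(qname, qname.lower()))
```

The case pattern adds unpredictability on top of the query ID and source port, which is why a resolver might prefer to fail rather than accept a reply with the wrong case.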

Hauke.

AH-HA!

I had "use-caps-for-id: yes" set.

Setting it to no seems to have 'fixed' it.

Thanks for that :).

-Dustin