DNS lookup failing

I am having a problem with a particular DNS lookup and I am not even sure how to formulate the question, so please bear with me.

My setup is Internet – IPFire with Unbound 1.19.0 – ClearOS7. ClearOS runs a system called Gateway Management which is a branding of AdamNetworks’ Adam:one, a DNS filtering tool.

IPFire is currently running as a recursive resolver but the same problem exists when running as a Caching DNS server. All other boxes are empty on the DNS setup screen in IPFire. SSL and TLS are not being used. I should be able to dig out the configs, if needed.

With Gateway Management running on ClearOS, I can resolve 1024-bit and 2048-bit domainkeys (1024._domainkey.howitts.co.uk and 2048._domainkey.howitts.co.uk) with nslookup. I can resolve the 4096-bit domainkey with dig ("dig txt 202403._domainkey.howitts.co.uk"), but with nslookup I get:

    [root@server ~]# nslookup -q=txt 202403._domainkey.howitts.co.uk
    Server: 127.0.0.1
    Address: 127.0.0.1#53

    Non-authoritative answer:
    *** Can't find 202403._domainkey.howitts.co.uk: No answer

    Authoritative answers can be found from:
    howitts.co.uk
      origin = achiel.ns.cloudflare.com
      mail addr = dns.cloudflare.com
      serial = 2336336559
      refresh = 10000
      retry = 2400
      expire = 604800
      minimum = 1800

Without Gateway Management on ClearOS 7, it all works. This may lead you to think the problem is Gateway Management, but if I change ClearOS’s upstream resolver from IPFire/Unbound to Cloudflare, all lookups work. This leads me to believe Unbound is either doing an invalid lookup or giving an invalid response to a particular query as formatted by Gateway Management.

I have pcap files of the working and non-working lookups between ClearOS and IPFire but I don’t know how to interpret them.

Can anyone please help me?

I get the exact same answer from unbound and cloudflare. Note that there is no authoritative answer. You might also have something like systemd-resolved in the way. Systemd-resolved is known to give bogus answers in certain cases, so you might want to disable it while testing.

isildur$ nslookup -q=txt 202403._domainkey.howitts.co.uk
Server: 127.0.0.1
Address: 127.0.0.1#53

Non-authoritative answer:
202403._domainkey.howitts.co.uk text = "v=DKIM1; p=MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAzvkHMnL2cPPUzm6gXBIsaiRMAj7wpajI1cQ3VPsIzIYfBTYgU7xX50tDZnTT4SiE/2+z87gMFSRcFiM9gaejAgV+YFse2AEId2t0+xYXuNwG35dqS6WWlwZY3Rr5IIebcPSeXouuYR3nCdzgK/FCT8Y2vvKTkIDXYsJMQJulxdDAewb9/V7pNZ7J8wky6RRIKnbAEdqO" "zJ9nDEe6wUGXhrMxB2ZjM6sQLJzAgz7VE0Z52eBk/TZgdzJwLxHzeclsWVES3Mw0tdDoUKT2QLd0SB9MsOwFcR6ph/h9VERhMAtjAmUG5YlQQ1bC8nznAwHdY2IP3RUdFZOYcUlv5yPzrRvBAjfi/CmR2zHVQs7gA7b67DaMy67dURWHDhMwqXgWVNrZ4iTInWr1vLEPoNBjppn1GOkXrb+FdNoWnFM5laAEmcFK2Sie5wpzCItFjWs3f3IQZxB" "lzJHIpkvR2ZTMJ5g3DWUU3ZK1rW1kNvGLjZkox7EZH3lFfkyS6lPnfIX5XS5YYeP0RmSAWNaKinCdQq8m8SdjWDIsRJ1aohq/Qx/O1sfQMDdrwetOn6KJqOFg7dcFtvKlRrHQYyujH3dapJ10Err/xAv3iyh9B7x8C6N+qjTMjRoIfPTyLeFnAtUrFQigpj70mbZPaw9AKglDafXvnXJwn8r5/Oq3mjVKKWkCAwEAAQ=="

Authoritative answers can be found from:

isildur$ nslookup -q=txt 202403._domainkey.howitts.co.uk 1.1.1.1
;; Truncated, retrying in TCP mode.
Server: 1.1.1.1
Address: 1.1.1.1#53

Non-authoritative answer:
202403._domainkey.howitts.co.uk text = "v=DKIM1; p=MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAzvkHMnL2cPPUzm6gXBIsaiRMAj7wpajI1cQ3VPsIzIYfBTYgU7xX50tDZnTT4SiE/2+z87gMFSRcFiM9gaejAgV+YFse2AEId2t0+xYXuNwG35dqS6WWlwZY3Rr5IIebcPSeXouuYR3nCdzgK/FCT8Y2vvKTkIDXYsJMQJulxdDAewb9/V7pNZ7J8wky6RRIKnbAEdqO" "zJ9nDEe6wUGXhrMxB2ZjM6sQLJzAgz7VE0Z52eBk/TZgdzJwLxHzeclsWVES3Mw0tdDoUKT2QLd0SB9MsOwFcR6ph/h9VERhMAtjAmUG5YlQQ1bC8nznAwHdY2IP3RUdFZOYcUlv5yPzrRvBAjfi/CmR2zHVQs7gA7b67DaMy67dURWHDhMwqXgWVNrZ4iTInWr1vLEPoNBjppn1GOkXrb+FdNoWnFM5laAEmcFK2Sie5wpzCItFjWs3f3IQZxB" "lzJHIpkvR2ZTMJ5g3DWUU3ZK1rW1kNvGLjZkox7EZH3lFfkyS6lPnfIX5XS5YYeP0RmSAWNaKinCdQq8m8SdjWDIsRJ1aohq/Qx/O1sfQMDdrwetOn6KJqOFg7dcFtvKlRrHQYyujH3dapJ10Err/xAv3iyh9B7x8C6N+qjTMjRoIfPTyLeFnAtUrFQigpj70mbZPaw9AKglDafXvnXJwn8r5/Oq3mjVKKWkCAwEAAQ=="

Authoritative answers can be found from:

Hi,

This looks like an "answer too long, packet truncated" problem.
Try using dig with TCP or EDNS.

IPFire doesn't use systemd. It is still using init.d (SysVInit?). I suspect the supplied Unbound configs are missing something.

Dig works fine, it is just nslookup with Gateway Management on the client for direct lookups.
Unfortunately it also causes "amavisd testkeys" to fail because of an invalid response and I also get a startup warning with amavisd, so I assume it is doing a similar style lookup as nslookup. I've no idea which other programs may be failing.

I've done a bit more digging. With tcpdump, I can see the request coming from ClearOS into Unbound, going out onto the internet and returning with a valid answer to Unbound, but this answer does not then get back from Unbound to ClearOS.

I don't have the knowledge to look at the packet capture and diagnose what is going wrong, either in the original request from ClearOS or in the reply from Unbound to ClearOS.

Your domainkey is too big to fit into a UDP response. That means
the server sends back an empty UDP response with the TC bit set
(asking the client to switch to TCP), and the client should then repeat
the request over TCP.

Make sure TCP DNS traffic is allowed for this to work. TCP DNS is
really required nowadays. So if you have tooling which doesn't work
with TCP DNS, that just means you need to upgrade it.
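
The UDP-then-TCP behaviour described above can be sketched in Python using only the standard library. This is an illustration of the mechanism, not what nslookup, dig or amavisd actually do internally; the function names are hypothetical, and the query carries no EDNS OPT record, so the classic 512-byte UDP limit applies.

```python
import socket
import struct

TYPE_TXT = 16
CLASS_IN = 1
FLAG_RD = 0x0100  # recursion desired
FLAG_TC = 0x0200  # truncated: the full answer did not fit


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = TYPE_TXT, query_id: int = 0x1234) -> bytes:
    """Build a plain DNS query: 12-byte header plus one question, no EDNS."""
    header = struct.pack("!HHHHHH", query_id, FLAG_RD, 1, 0, 0, 0)
    return header + encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)


def is_truncated(message: bytes) -> bool:
    """Return True if the TC bit is set in the header flags."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & FLAG_TC)


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return data


def query_with_tcp_fallback(name: str, server: str, timeout: float = 3.0) -> bytes:
    """Ask over UDP first; if the reply has TC set, repeat the query over TCP."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(query, (server, 53))
        response, _ = udp.recvfrom(4096)
    if not is_truncated(response):
        return response
    # DNS over TCP prefixes each message with a 2-byte length.
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(query)) + query)
        (length,) = struct.unpack("!H", recv_exact(tcp, 2))
        return recv_exact(tcp, length)
```

Calling query_with_tcp_fallback("202403._domainkey.howitts.co.uk", "127.0.0.1") needs a reachable resolver on port 53 for both UDP and TCP. If TCP port 53 is blocked anywhere on the path, the second step is the one that fails.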

In general, a DKIM key that large will not work reliably. A 2048-bit
RSA key (rsa-sha256) still fits in a UDP response.
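
As a rough back-of-the-envelope check of the size argument, the sketch below estimates the wire size of a minimal TXT response for each key size. It makes several assumptions: the record contains only "v=DKIM1; p=" plus the base64 key, the DER sizes are the typical ones for RSA public keys with exponent 65537, the answer name is a 2-byte compression pointer, and no EDNS or DNSSEC records are included. The function names are hypothetical.

```python
import math

# Assumption: typical SubjectPublicKeyInfo DER sizes for RSA keys with e=65537.
SPKI_DER_BYTES = {1024: 162, 2048: 294, 4096: 550}


def wire_name_length(name: str) -> int:
    """Length of a domain name on the wire: one length byte per label plus the root byte."""
    labels = name.rstrip(".").split(".")
    return sum(1 + len(label) for label in labels) + 1


def estimate_txt_response_size(name: str, key_bits: int, prefix: str = "v=DKIM1; p=") -> int:
    """Estimate the size in bytes of a response carrying one minimal DKIM TXT record."""
    base64_len = 4 * math.ceil(SPKI_DER_BYTES[key_bits] / 3)
    text_len = len(prefix) + base64_len
    # TXT data is split into strings of at most 255 bytes, each with a length byte.
    chunks = math.ceil(text_len / 255)
    rdata_len = text_len + chunks
    header = 12
    question = wire_name_length(name) + 4   # name + type + class
    answer = 2 + 10 + rdata_len             # pointer + type/class/ttl/rdlength + rdata
    return header + question + answer


for selector, bits in (("1024", 1024), ("2048", 2048), ("202403", 4096)):
    name = f"{selector}._domainkey.howitts.co.uk"
    print(bits, estimate_txt_response_size(name, bits))
```

Under these assumptions the estimates are 287, 464 and 811 bytes. Only the 4096-bit key exceeds the 512-byte limit that applies to UDP responses when the client does not advertise a larger buffer via EDNS.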

I strongly advise against using nslookup as a diagnostic tool; please
use dig instead.

I am using nslookup as a diagnostic tool, but I also have "amavisd testkey" failing in a similar way which is what really kicked off this investigation:

    [root@server ~]# amavisd testkey
    TESTING#1 howitts.co.uk: 202403._domainkey.howitts.co.uk => invalid (public key: DNS error: FORMERR)

Yet dig works. Ultimately the goal is to get "amavisd testkey" working so the start up of amavisd is clean. Nslookup is just a way of reproducing the issue.
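
For reference, FORMERR is the name of DNS response code 1 (format error), carried in the low four bits of the header flags. Whether amavisd is reporting a code it received or an error raised locally while parsing the reply is not established here. Below is a minimal Python sketch for decoding the 12-byte DNS header of a captured message; decode_header is a hypothetical helper, and it assumes you have the raw DNS payload bytes (without the IP/UDP headers).

```python
import struct

RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def decode_header(message: bytes) -> dict:
    """Decode the fixed 12-byte DNS header into its main fields."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    rcode = flags & 0x000F
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),   # QR bit
        "truncated": bool(flags & 0x0200),     # TC bit
        "rcode": RCODE_NAMES.get(rcode, str(rcode)),
        "questions": qd,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }
```

A reply with "rcode" FORMERR and zero answers would match the amavisd message, while a reply with "truncated" set would point at the UDP size limit instead.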

If I bypass Unbound and go straight to an upstream resolver, the query works from ClearOS. So it looks as if ClearOS can handle long replies/TCP in general, and a reply of marginal length is fine when going direct to the internet but not when going via Unbound. It also works from ClearOS via Unbound without Gateway Management, so it looks as if Gateway Management is inserting something into the query which Unbound does not like, but which an upstream resolver accepts when Unbound is bypassed.

Nslookup can resolve your key just fine through unbound 1.19.3.

I even verified with nslookup from c7 machine that querying works.

My guess is that gateway management does something ugly.

Possibly, but Cloudflare DNS can cope with it.

Also 2048._domainkey.howitts.co.uk works with Gateway Management which suggests it may be something to do with long response handling. This is where I think packet sniffing would help but I don't know how to interpret the responses. I have pcap files with and without Gateway Management active and I can see in tcpdump there is something showing with [1au] suggesting there is extra data in the DNS request with Gateway Management. At that point my knowledge fails.
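
On the [1au] marker: as far as I know, tcpdump prints it when the message has one record in its additional section, and in a query that is usually the EDNS OPT pseudo-record (type 41). The Python sketch below shows one way to pull that record out of a raw DNS payload and list what it carries. It assumes you can export the DNS payload bytes from the pcap; find_opt_record and build_query_with_opt are hypothetical helpers, and the latter only exists to produce sample input.

```python
import struct

TYPE_OPT = 41  # EDNS OPT pseudo-record


def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:   # compression pointer, 2 bytes
            return offset + 2
        offset += 1 + length


def find_opt_record(msg: bytes):
    """Return the decoded OPT record of a DNS message, or None if absent."""
    _id, _flags, qd, an, ns, ar = struct.unpack("!HHHHHH", msg[:12])
    offset = 12
    for _ in range(qd):
        offset = skip_name(msg, offset) + 4          # skip type and class
    for _ in range(an + ns + ar):
        offset = skip_name(msg, offset)
        rtype, rclass, ttl, rdlen = struct.unpack("!HHIH", msg[offset:offset + 10])
        offset += 10
        rdata = msg[offset:offset + rdlen]
        offset += rdlen
        if rtype != TYPE_OPT:
            continue
        options, pos = [], 0
        while pos + 4 <= len(rdata):
            code, olen = struct.unpack("!HH", rdata[pos:pos + 4])
            options.append((code, rdata[pos + 4:pos + 4 + olen]))
            pos += 4 + olen
        return {
            "udp_payload_size": rclass,      # OPT reuses the class field
            "version": (ttl >> 16) & 0xFF,   # OPT reuses the TTL field
            "do_bit": bool(ttl & 0x8000),
            "options": options,
        }
    return None


def build_query_with_opt(name: str, udp_size: int = 1232, options: bytes = b"") -> bytes:
    """Build a sample TXT query with one OPT record (for demonstration only)."""
    header = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 1)
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in name.split(".")) + b"\x00"
    question = qname + struct.pack("!HH", 16, 1)
    opt = b"\x00" + struct.pack("!HHIH", TYPE_OPT, udp_size, 0, len(options)) + options
    return header + question + opt
```

Running find_opt_record on the query payload from each capture and comparing the advertised UDP size and the option codes would show exactly what extra data is present in the Gateway Management case.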