TLS upstream connections get closed despite keepalive

Hello,

I'm running unbound 1.16.1 on Linux 5.15.55, configured to forward everything over TCP TLS connections.
Despite keepalive being enabled, I can see that the connections get closed early.
Note that the server is not busy at all.

Here are the relevant bits of the configuration:

forward-zone:
    name: "."
    forward-tls-upstream: yes
    forward-addr: 1.1.1.1@853#cloudflare-dns.com
    forward-addr: 1.0.0.1@853#cloudflare-dns.com
    forward-addr: 8.8.8.8@853#dns.google
    forward-addr: 8.8.4.4@853#dns.google

server:
    max-reuse-tcp-queries: 2000
    tcp-idle-timeout: 9000000
    edns-tcp-keepalive: yes
    edns-tcp-keepalive-timeout: 9000000
    num-threads: 1

Packet captures show that after as little as ~30 seconds, unbound sends a FIN+ACK. Sometimes it also sends a couple of RST packets, which doesn't seem right either.

Unbound also seems to want to open connections to all upstream servers instead of reusing existing ones. Combined with the early closes, this means it constantly opens new connections. To make matters worse, it does not seem to do that in the background but synchronously with incoming queries, blocking them. As a result, many queries are needlessly delayed by about 80 to 180 ms.

Did I do something wrong? How can I fix this?

Hi,

You are missing "tcp-reuse-timeout:" from your configuration.
That is the value that will keep the persistent connections open (at least from Unbound's side).
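
As a minimal sketch of where that option goes, assuming the rest of the configuration stays as posted: the value is in milliseconds (the default is 60000), and the 300000 below is only a placeholder, not a recommendation.

```
server:
    # Keep idle persistent upstream TCP/TLS connections open for up to
    # 5 minutes. Value is in milliseconds; placeholder, tune as needed.
    tcp-reuse-timeout: 300000
```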

"max-reuse-tcp-queries:" can cause a persistent connection to close before the timeout is reached, once that many queries have been sent over it.

Timeouts while waiting for an answer, and responses to queries we did not ask, can also cause the connection to close prematurely.
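
Assuming the answer timeout meant here is the one set by "tcp-auth-query-timeout:" (in milliseconds, default 3000), it could be raised as sketched below. Both that mapping and the value are assumptions for illustration only.

```
server:
    # Wait up to 5 seconds for an answer on an upstream TCP/TLS
    # connection. Value is in milliseconds; placeholder only.
    tcp-auth-query-timeout: 5000
```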

The connection can also be closed by the other side.

I hope this is helpful.

Best regards,
-- George