The amount of reduction depends on the zone contents.
We observed a reduction across our smaller zones and nameservers; however, we also observed the opposite: the nameservers serving our largest zones now consume more memory on NSD 4.14.0. Is this to be expected in some cases?
The absolute figures here are trivial and are of no practical concern to us. We are not a registry, and so even our largest zones are tiny in comparison to some of your other users. I nevertheless wondered whether this behaviour was expected. If unexpected, and if the adverse outcome is liable to scale on some very large zones, maybe this report would be of interest to you.
Unfortunately, I am unable to share our zone data, as it includes confidential information that is not served on public networks, though I am happy to share what information I can. I will focus on one zone that was observed to consume more memory on NSD 4.14.0.
nsd_size_db_in_mem_bytes (size.db.mem) went from ~6.49 MiB on NSD 4.13.0 to ~8.80 MiB on NSD 4.14.0. There are about 9K RRs in the zone. Our zones are specifically designed to serve RFC 6763 DNS-SD data (and nothing else). We never load more than one zone into each NSD server. We build with --enable-packed and --disable-radix-tree; our complete set of configure flags can be found at the bottom of this message. The positive delta is reproducible on Linux and macOS.
Here is the aggregate composition of that zone by RR type:
Thanks for looking into this and reporting your findings!
I have been testing the new RDATA storage approach with these zones: .lol, .nl, .se, .net, .org and .com. Compiled with your configure options (--enable-packed and --disable-radix-tree), and compared against NSD 4.13.0 compiled with the same options, they showed the following reductions:
.lol: 29.3%
.nl: 26.6%
.se: 23.7%
.net: 6.6%
.org: 6.5%
.com: 6.1%
But none of these zones have PTR records, so I am afraid I missed this! I will look into this now and report back.
Thanks again for reporting. I was about to publish a blog post about my testing results, but I will postpone that until we've figured out what is going on and, hopefully, how to remedy it. You saved me from posting an "all good news" story, which clearly needs to be nuanced.
Just a quick update on the issue. I can reproduce.
The issue is in our region allocator. Memory recycling (which happens when RRsets are resized as new RRs are added to them) does not perform well when there are relatively few, but uniquely large, RRsets, i.e. RRsets with many RRs. NSD before 4.14.0 already did not handle these circumstances well, but the new release can do worse. So the trigger is an RRset with a uniquely large number of RRs: uniquely large among all the zones served in an NSD instance.
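For intuition, here is a toy model of an allocator with exact-size recycle bins, in the spirit of the region allocator described above (an illustrative sketch with made-up names and sizes, not NSD's actual code). Growing a single RRset one RR at a time frees each intermediate array at a size that no other allocation ever requests again, so almost nothing is recycled; many same-sized RRsets, by contrast, reuse each other's freed chunks.

```python
from collections import defaultdict

class ToyRegion:
    """Toy allocator: freed chunks go into per-size recycle bins and are
    reused only for requests of exactly the same size (a simplified model,
    not NSD's real region allocator)."""
    def __init__(self):
        self.recycle = defaultdict(list)  # size -> freed chunks
        self.total = 0                    # bytes ever taken from the region

    def alloc(self, size):
        if self.recycle[size]:
            return self.recycle[size].pop()  # exact-size reuse
        self.total += size                   # no fit: grow the region
        return bytearray(size)

    def free(self, chunk):
        self.recycle[len(chunk)].append(chunk)

ITEM = 16  # pretend one RR slot in an RRset array costs 16 bytes

def grow_rrset(region, n_rrs):
    """Resize one RRset from 1 to n_rrs RRs, one RR at a time."""
    buf = region.alloc(ITEM)
    for n in range(2, n_rrs + 1):
        new = region.alloc(n * ITEM)
        region.free(buf)  # old array is recycled at its old size
        buf = new
    return buf

# One uniquely large RRset: every intermediate size is freed exactly once
# and never requested again, so the region keeps growing.
big = ToyRegion()
grow_rrset(big, 1000)
print(big.total)   # 16 * (1 + 2 + ... + 1000) = 8008000 bytes

# Ten same-sized RRsets: from the second one on, every intermediate
# allocation is served from the recycle bins.
many = ToyRegion()
for _ in range(10):
    grow_rrset(many, 100)
print(many.total)  # 16 * (1 + ... + 100) + 9 * 16 * 100 = 95200 bytes
```

In this model the uniquely large RRset makes the region grow quadratically in the RRset's final size, which matches the shape of the problem: a few RRsets that are large relative to everything else in the same NSD instance.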
I do think I can remedy this, but there is no quick and easy fix.
Your test suites are surely much better than ours, so I have yet to test for nameserver function. Let me know if you need me to put a Linux build into a traffic path. We are in a change freeze for the upcoming holiday period, but I might be able to arrange a test in a pre-production environment.