We have an installation where NSD (version 4.1.19) acts as a hidden master for the public DNS servers. NSD has only one large zone configured. The zone is re-signed every 20 minutes, and after each re-signing we run "nsd-control reload <zone>" so that the NSD process reloads the new zone from the zonefile and notifies the slaves. However, we noticed that the database memory usage increases after every reload, eventually resulting in a memory allocation failure. We track the memory usage by running "nsd-control stats_noreset" every minute, and in the graph [1] one can see size.db.mem increase after each reload. We don't see similar behaviour in, for example, the xfrd process memory usage.
We have worked around the issue by restarting the NSD process periodically, but do you have any ideas about the possible root cause and a longer-term solution?
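The monitoring described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual tooling: it assumes "nsd-control stats_noreset" prints one key=value pair per line, that nsd-control is on the PATH with remote control enabled, and the helper names (parse_stats, db_mem_growth, read_db_mem) are hypothetical.

```python
import subprocess


def parse_stats(text):
    """Parse 'key=value' lines (assumed stats_noreset output format) into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # skip lines without '='
            stats[key] = value
    return stats


def db_mem_growth(samples):
    """Return the differences between consecutive size.db.mem samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def read_db_mem():
    """Take one size.db.mem sample from a running NSD.

    Assumption: nsd-control is on PATH and remote control is configured.
    """
    result = subprocess.run(
        ["nsd-control", "stats_noreset"],
        capture_output=True, text=True, check=True,
    )
    return int(parse_stats(result.stdout)["size.db.mem"])
```

Calling read_db_mem() once before and once after each reload and feeding the values to db_mem_growth() would show whether the counter keeps climbing, as it does in the graph.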
This is certainly a problem, and I'm sure the developers will be happy
to investigate it with you.
However, I'd like to suggest that you don't use the database mode. If
you set:
database: ""
in your nsd.conf, then nsd will load the zone from the zonefile into
RAM, and won't bother compiling the nsd.db file. You don't really gain
anything with the database file, and I've been advocating for the
database mode to be dropped completely in future versions of nsd.
By the way, if you or the developers find the problem, please do let us
know here, because I'm also curious about it.
Actually, I forgot to mention it in my first message, but we have already set the database to an empty value in the configuration.
For us restarting NSD every now and then is not a very big problem, as this instance is only a hidden master, but naturally a more elegant solution would be very welcome.
It is a memory leak, thank you for the report! Fixed it, code is copied
below and in the code repository. It happens when RRs in the unknown RR
format are read from the zonefile.
Hi Wouter, thanks for the quick fix!
So Antti, which unusual RRs do you have in your zone?
I wonder about that too, as the zone in question contains only very ordinary RRs, mainly delegation NS records. There are quite a lot of IDN names, but that should be business as usual.
I will look through the zone contents to see if I can find anything. Wouter, is there anything in particular that I should be searching for?
RRs printed in the \#length hexadecimals format cause the problem that I
found. Regardless of whether the type was known to NSD, it was the text
formatting in the parser that leaked.
If the problem is still there (after applying the patch?) then it may be
possible to reproduce it with (a smaller subset of) the zone?
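A quick way to search a zonefile for such records is sketched below. This is a rough illustration, not part of NSD: it assumes the generic RDATA notation of RFC 3597 (the token \# followed by a decimal length and hexadecimal data), the function name find_generic_rdata is hypothetical, and the comment stripping is naive (it does not handle ';' inside quoted strings).

```python
import re

# RFC 3597 generic RDATA: the token \# followed by a decimal length.
GENERIC_RDATA = re.compile(r"(?:^|\s)\\#\s+\d+")


def find_generic_rdata(zone_text):
    """Return (line_number, line) pairs for lines that use the \\# notation."""
    hits = []
    for number, line in enumerate(zone_text.splitlines(), start=1):
        # Naive comment stripping: drop everything after the first ';'.
        content = line.split(";", 1)[0]
        if GENERIC_RDATA.search(content):
            hits.append((number, line.strip()))
    return hits
```

Running it over the text of the zonefile (for example open(path).read(), with path pointing at the signed zone) lists candidate lines. An empty result would suggest the leak is triggered elsewhere.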
We rebuilt the package with the patch and unfortunately it seems that the memory usage still increases after each reload like it did before the patch. We'll try to debug this further.
Found the issue and fixed it in the code repository (after off-list
debugging). The leak was in NSEC3 processing; my previous fix had
addressed an NSEC-signed zone.