Memory usage during reload

Hello,

I've noticed a strange memory usage peak during "nsdc reload". We have 65000 signed zones, resulting in a 250MB nsd.db file. NSD 3.2.4 runs on 64-bit Gentoo Linux 2.6.27 with 4GB RAM. The in-memory footprint of the namedb database is about 500MB. Normally, nsd consumes about 1GB of memory: one half is used by the xfrd daemon and the other half is shared among the main and child processes, according to /proc/pid/smaps.

During reload, I would expect the maximum memory usage not to exceed 1.5GB regardless of the number of children (500MB for xfrd, 500MB for the old database shared by the old processes, and 500MB for the reloaded database shared by the new processes). But in practice, there is a very short peak when memory usage goes far beyond 1.5GB. Even worse, the size of the peak depends on the number of children. For server-count=1, the maximum usage is around 2GB. With server-count=4, the usage goes over 3GB. Also note that the usage immediately drops back to 1GB when the reload is done.
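To make the arithmetic explicit, here is a rough back-of-the-envelope model in C. It is only a sketch under a stated assumption: every exiting child ends up with a full private copy of the old database. The function name reload_peak_mb and its parameters are made up for illustration and are not part of NSD.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not NSD code. Estimates the worst-case peak (in MB)
 * during a reload: one database copy held by xfrd, one old copy shared by
 * the old processes, one new copy shared by the new processes, plus
 * (assumption) one private copy per old child if that child frees the
 * database before it exits. */
unsigned long reload_peak_mb(unsigned long db_mb, unsigned long children,
                             int free_on_exit)
{
    /* Guard against overflow in db_mb * (children + 3). */
    assert(children <= ULONG_MAX - 3 && db_mb <= ULONG_MAX / (children + 3));

    unsigned long peak = 3 * db_mb;   /* xfrd + old shared + new shared */
    if (free_on_exit)
        peak += children * db_mb;     /* private copies made by old children */
    return peak;
}
```

With db_mb = 500 this gives 2000 for one child and 3500 for four children when the database is freed on exit, and 1500 in both cases when it is not. That is in line with the "around 2GB" and "over 3GB" peaks described above, although the real numbers depend on how many pages each child actually dirties.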

After a short investigation, I've realized that it is probably caused by memory deallocation during nsd process shutdown. Looking into the code, every server_shutdown(nsd) call is preceded by namedb_close(nsd->db), which deallocates the entire namedb database. I guess that the region_destroy code writes to thousands of shared memory pages, so the kernel's copy-on-write mechanism duplicates them into private pages for every child process. To test my idea, I removed the namedb_close(nsd->db) lines occurring before the server_shutdown(nsd) calls. And voila... peak memory usage during reload reached only the expected 1.5GB.
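The copy-on-write effect can be reproduced outside NSD with a small POSIX test. The sketch below is not NSD code: child_faults is a made-up helper, the malloc'ed buffer merely stands in for the database, and writing one byte per page is an assumption about what a full deallocation pass does to the pages. It forks a child and reports how many minor page faults the child takes before exiting.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not NSD code. Returns the number of minor page
 * faults taken by a forked child before it exits, or -1 on error. If
 * touch is nonzero, the child writes one byte to every inherited page,
 * standing in for the writes a full deallocation pass might do. */
long child_faults(size_t pages, int touch)
{
    assert(pages > 0);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = pages * (size_t)page;
    char *db = malloc(len);
    if (db == NULL)
        return -1;
    memset(db, 1, len);               /* parent populates the "database" */

    int fds[2];
    if (pipe(fds) != 0) {
        free(db);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        free(db);
        return -1;
    }
    if (pid == 0) {                   /* child: pages are shared with parent */
        struct rusage before, after;
        volatile char *p = db;        /* volatile keeps the stores in place */
        close(fds[0]);
        getrusage(RUSAGE_SELF, &before);
        if (touch)
            for (size_t i = 0; i < len; i += (size_t)page)
                p[i] = 0;             /* each write may force a private copy */
        getrusage(RUSAGE_SELF, &after);
        long delta = after.ru_minflt - before.ru_minflt;
        if (write(fds[1], &delta, sizeof delta) != (ssize_t)sizeof delta)
            _exit(1);
        _exit(0);                     /* exit without freeing anything */
    }
    close(fds[1]);
    long delta = -1;
    if (read(fds[0], &delta, sizeof delta) != (ssize_t)sizeof delta)
        delta = -1;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    free(db);
    return delta;
}
```

Comparing child_faults(n, 1) with child_faults(n, 0) for a large n should show whether the touching child takes roughly one extra fault per page. The exact counts depend on the kernel, the allocator and settings such as transparent huge pages, so treat them as indicative only.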

I know that memory cleanup before process termination is generally a good coding practice, but in this particular case it should be omitted, at least for non-debug builds.

Or am I missing something?

Best regards
Martin Svec

Hello Martin,

I think you are right. I also believe that not cleaning up the db does
not cause harm in this situation. I have (c)omitted the namedb_close
calls before server_shutdown, like you suggested.

Thanks for the report,

Matthijs Mekking
NLnet Labs

Martin Švec wrote: