High memory consumption for small AXFR

Hello!

I use a self-compiled NSD 4.7.0:

Configure line: --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-option-checking --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --runstatedir=/run --disable-maintainer-mode --disable-dependency-tracking --with-configdir=/etc/nsd --with-nsd_conf_file=/etc/nsd/nsd.conf --with-pidfile=/run/nsd/nsd.pid --with-dbfile=/var/lib/nsd/nsd.db --with-zonesdir=/etc/nsd --with-xfrdfile=/var/lib/nsd/xfrd.state --disable-largefile --disable-recvmmsg --enable-root-server --enable-mmap --enable-ratelimit --enable-zone-stats --enable-systemd --enable-checking --enable-dnstap --disable-radix-tree --enable-packed

Event loop: libevent 2.1.12-stable (uses epoll)

Linked with OpenSSL 3.0.2 15 Mar 2022

I tested XFR with a big “test.” zone, with server-count=1.

Zone test. is unsigned.

The server had plenty of other zones plus the test. zone. Every zone has a dedicated NSD process. The server has 40GB RAM. Without the test. zone, the server uses ~20GB RAM.

Testing:

  1. AXFR of test. zone with 5 RRs → memory consumption stable at 20GB

  2. AXFR-style IXFR of test. zone with 50 million RRs (only NS records) → memory consumption increased by ~14GB RAM to 34GB RAM

15:05:46 nsd-trial[635021]: xfrd: zone test committed “received update to serial 1690380825 at 2023-07-26T15:05:46 from xxx TSIG verified with key yyy”

15:13:53 nsd-trial[635022]: zone test. received update to serial 1690380825 at 2023-07-26T15:05:46 from xxx TSIG verified with key yyy of 1604285929 bytes in 837.778 seconds

15:14:03 nsd-trial[635021]: zone test serial 1690380104 is updated to 1690380825

  3. The test. zone gained 1k RRs, hence an IXFR with 1k RRs. The IXFR was applied very quickly, with no memory increase.

23:25:38 nsd-trial[635021]: xfrd: zone test committed “received update to serial 1690380826 at 2023-07-26T23:25:38 from xxx TSIG verified with key yyy”

23:25:41 nsd-trial[635022]: zone test. received update to serial 1690380826 at 2023-07-26T23:25:38 from xxx TSIG verified with key yyy of 33289 bytes in 0.016273 seconds

23:25:43 nsd-trial[635021]: zone test serial 1690380825 is updated to 1690380826

  4. test. was reduced to 5 RRs → AXFR-style IXFR. Memory consumption increased heavily until the OOM killer kicked in:

23:31:48 nsd-trial[635021]: xfrd: zone test committed “received update to serial 1690380827 at 2023-07-26T23:31:48 from xxx TSIG verified with key yyy”

23:32:32 kernel: nsd: server 1 invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0

23:32:33 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/system-nsd.slice/nsd@trial.service,task=nsd: server 1,pid=709906,uid=111

23:32:33 kernel: Out of memory: Killed process 709906 (nsd: server 1) total-vm:14673408kB, anon-rss:13054016kB, file-rss:0kB, shmem-rss:384kB, UID:111 pgtables:28720kB oom_score_adj:0

23:32:40 kernel: oom_reaper: reaped process 709906 (nsd: server 1), now anon-rss:0kB, file-rss:0kB, shmem-rss:512kB

23:32:40 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/system-nsd.slice/nsd@trial.service,task=nsd: main,pid=635022,uid=111

23:32:40 kernel: Out of memory: Killed process 635022 (nsd: main) total-vm:14657592kB, anon-rss:14612092kB, file-rss:0kB, shmem-rss:588kB, UID:111 pgtables:28724kB oom_score_adj:0

23:32:47 kernel: oom_reaper: reaped process 635022 (nsd: main), now anon-rss:0kB, file-rss:0kB, shmem-rss:588kB

So, even though ~6GB RAM was available, NSD could not replace the currently served zone (50 million RRs) with a small zone of 5 RRs.

I wonder, why does NSD need so much memory to apply the “AXFR-style IXFR”? Is this by design, or a bug?

(On servers with more RAM headroom, step 4 succeeded, but it still took about 1 minute to serve the new zone, and memory peaked at 44GB RAM or more, so at least 10GB of extra RAM was needed to switch to the small new zone version):

23:31:48 nsd-trial[756415]: xfrd: zone test committed “received update to serial 1690380827 at 2023-07-26T23:31:48 from xxx TSIG verified with key yyy”

23:32:58 nsd-trial[756416]: zone test. received update to serial 1690380827 at 2023-07-26T23:31:48 from xxx TSIG verified with key yyy of 182 bytes in 8.9e-05 seconds

23:32:58 nsd-trial[756415]: zone test serial 1690380826 is updated to 1690380827

Thanks

Klaus

Hi Klaus,

> So, even that there were ~6GB RAM available, NSD could not replace
> the currently serving zone (50mio RRs) with a small zone with 5RRs.
>
> I wonder, why does NSD needs so much memory to apply the "AXFR-style
> IXFR"? Is this by design, or a bug?

This is a design bug/feature of NSD. When it receives updates, it forks a child process. This child process consumes the XFR, and updates its in-memory view of the zone. Once it has done that, the previous server process is killed, and this new child process takes over to answer queries.
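
The fork-based update described above can be illustrated with a small sketch. This is not NSD's code; the function name and the dict standing in for a zone are hypothetical, and the sketch only shows the isolation property: the child changes its own copy of the data while the parent's view stays as it was. It assumes a Unix-like system where os.fork() is available.

```python
import os


def update_in_forked_child(zone, update):
    """Fork, then let the child replace `zone` with `update` in its own copy.

    Returns (records_seen_by_child, records_seen_by_parent).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy-on-write view of the parent's memory.
        try:
            os.close(read_fd)
            zone.clear()          # touches only the child's pages
            zone.update(update)
            os.write(write_fd, str(len(zone)).encode())
        finally:
            os._exit(0)           # never fall through into the parent's code
    # Parent: its zone data is untouched while the child works.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        child_count = int(pipe.read())
    os.waitpid(pid, 0)
    return child_count, len(zone)
```

In NSD the old server process would then be stopped and the child would take over answering queries; the sketch stops at the point where both views exist at the same time, which is where the extra memory is needed.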

This design makes NSD consume more memory when processing updates. We have run into this problem as well.
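
As a rough back-of-the-envelope model (an assumption for illustration, not NSD's actual memory accounting): if the forked child modifies some fraction of the pages holding the old zone, the kernel has to copy those pages, so the peak is roughly the memory in use before the update plus the modified part of the zone. The function names and the "dirtied fraction" parameter are hypothetical; the numbers in the usage note are the ones reported in this thread.

```python
def peak_during_update_gb(resident_before_gb, zone_gb, dirtied_fraction):
    """Estimate peak memory while old and new views of a zone coexist.

    dirtied_fraction is the share of the old zone's pages that the child
    modifies (and that therefore get copied) during the update.
    """
    return resident_before_gb + zone_gb * dirtied_fraction


def implied_dirtied_fraction(resident_before_gb, zone_gb, observed_peak_gb):
    """Invert the model: which fraction would explain an observed peak?"""
    return (observed_peak_gb - resident_before_gb) / zone_gb
```

With ~34GB in use and a ~14GB zone, peak_during_update_gb(34, 14, 1.0) gives 48GB, more than the 40GB server has. The observed peak of at least 44GB on the larger servers corresponds to a dirtied fraction of at least about 0.71 under this model.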

There are 2 ways to work around this:

1. Add more RAM to the server; or
2. add swap space to the server.

The swap space allows the kernel to allocate more memory to NSD when it forks. The swap space will not actually be used, because the old NSD process and the new one will be identical, and share the memory space. When the new process updates one or some of its zones, only that memory will be modified, and a little extra RAM will be consumed.
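
To see how much headroom a Linux server has before and after adding swap, one can look at /proc/meminfo. The following is a minimal sketch; the function names are hypothetical, and only the field names (SwapTotal, CommitLimit, Committed_AS) come from the kernel. Note that CommitLimit is only enforced when strict overcommit accounting is enabled (vm.overcommit_memory=2); in the other modes it is informational.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most values are in kB; a few fields (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(fields):
    """Memory the kernel could still commit under strict accounting."""
    return fields["CommitLimit"] - fields["Committed_AS"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live values (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

For example, commit_headroom_kb(read_meminfo()) can be compared before and after enabling swap; SwapTotal and CommitLimit should both grow once the swap space is active.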

We have NSD servers with limited RAM, but the swap space trick helps them function without being killed by the out-of-memory killer.

I'm not an expert on Linux kernel terminology, so please excuse my rather simple explanation.

Regards,
Anand Buddhdev
RIPE NCC