Problems loading zone with nsd4

Hi!

I just tried to switch from nsd3 to nsd4 but nsd4 fails to load the
zone. The zone is approx 170MB (Bind text format).

NSD4 is configured as a slave. The zone is transferred, but loading fails
with a memory allocation error:

10:17:16 nsd[3849]: zonefile at.zone does not exist
10:17:16 nsd[3849]: nsd started (NSD 4.0.3), pid 3847
10:17:26 nsd[3847]: xfrd: zone at committed "received update to serial
1402473601 at 2014-06-11T10:17:26 from 83.136.34.4 TSIG verified with
key rcode0-distribution"
10:17:26 nsd[4059]: rehash of zone at. with parameters 1 0 5
b81fd4d081abe7a4
10:17:59 nsd[4059]: mremap(/var/lib/nsd/nsd.db, size 1743910912) error
Cannot allocate memory
10:17:59 nsd[4059]: could not add RR to nsd.db, disk-space?
10:17:59 nsd[4059]: bad ixfr packet part 2007 in diff file for at.
10:17:59 nsd[3849]: handle_reload_cmd: reload closed cmd channel
10:17:59 nsd[3849]: Reload process 4059 failed with status 256,
continuing with old database
10:17:59 nsd[3847]: xfrd: zone at: soa serial 1402473601 update failed,
restarting transfer (notified zone)

The server has 8GB RAM, 512KB swap and 9GB of free disk space, and there
is plenty of disk and ram left when nsd logs the memory error.

Thus, I suspect something else is going wrong. Any hints?

Thanks
Klaus

Hi Klaus,

I think it is the memory somehow. Perhaps the memory overcommit kernel
settings in Linux are disallowing the allocation, even though there
seems to be enough memory (at current usage). The most recent code
from the repository (not yet released, but it passed regression tests)
has the option to set database: "" in nsd.conf. With that setting
nsd.db is not created and not mmapped, which frees a lot of disk and
memory space. That would likely make your system work.

Best regards,
   Wouter

Could the problem be related to my test server using a 32bit kernel?

regards
Klaus

Strange, on another server (64bit linux) I can load the zone although
the server only has 1GB Ram and 2GB swap.

regards
Klaus

Hi Klaus,

Do you have a ulimit (heap size) set that constrains memory usage
per-process?
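
For reference, the per-process limits in question can also be read
programmatically. This is only a minimal sketch using Python's standard
resource module. The helper name get_memory_limits is made up for
illustration, and it reports the limits of the process running the script,
which are not necessarily those of the nsd daemon:

```python
import resource

def get_memory_limits():
    """Return soft/hard values for the memory-related rlimits of this process."""
    wanted = ("RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_MEMLOCK", "RLIMIT_STACK")
    limits = {}
    for name in wanted:
        const = getattr(resource, name, None)  # not every platform defines all of them
        if const is None:
            continue
        soft, hard = resource.getrlimit(const)
        # RLIM_INFINITY is what "ulimit -a" prints as "unlimited"
        limits[name] = tuple(
            "unlimited" if v == resource.RLIM_INFINITY else v for v in (soft, hard)
        )
    return limits

for name, (soft, hard) in get_memory_limits().items():
    print(f"{name}: soft={soft} hard={hard}")
```

RLIMIT_AS corresponds to "virtual memory (-v)" and RLIMIT_DATA to
"data seg size (-d)" in the ulimit -a output.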

Best regards, Wouter

Seems like the "working" server has even fewer resources:

Debian Wheezy 32bit, 8GB Ram, 8GB Swap, not working:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 64637
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 64637
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

Ubuntu 12.04 64bit, 1GB Ram, 2GB Swap, working:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7758
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 7758
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

regards
Klaus

Quoting Klaus Darilion <klaus.mailinglists@pernau.at>:

Hi!

I just tried to switch from nsd3 to nsd4 but nsd4 fails to load the
zone. The zone is approx 170MB (Bind text format).

10:17:59 nsd[4059]: mremap(/var/lib/nsd/nsd.db, size 1743910912) error
Cannot allocate memory

[...]

The server has 8GB RAM, 512KB swap and 9GB of free disk space, and there
is plenty of disk and ram left when nsd logs the memory error.

How confident are you in the "there is plenty of disk and ram left"
statement? In particular, you've got 8GB of memory available,
virtually no swap, and you're trying to mmap a file that is 1.7GB.
I don't know how many processes are trying to mmap that file, but
you've not got a lot of headroom there.

With respect to the other server working with fewer resources, do you
by chance have overcommit disabled on the broken box and enabled on the
working box?

Also, compare the sizes of /var/lib/nsd/nsd.db on the two machines.
Perhaps the working box is able to mmap because the db file is smaller on
that one.
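
A rough way to collect both values on each box for comparison is sketched
below. The helper names are made up for illustration, and the nsd.db path
is the one from the log above; adjust it if your database lives elsewhere:

```python
import os

def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the kernel overcommit mode (0, 1 or 2), or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

def file_size(path):
    """Return the size of a file in bytes, or None if it cannot be stat'ed."""
    try:
        return os.path.getsize(path)
    except OSError:
        return None

print("overcommit_memory:", read_overcommit_mode())
print("nsd.db size:", file_size("/var/lib/nsd/nsd.db"))
```

Running this on both machines gives two pairs of numbers that can be
compared directly.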

Is allocating more swap on the broken box an option? Other than using
up disk space, having lots of swap isn't a downside as long as your
paging rate is low. That is probably true even if your swap is on
SSD/flash; with a low paging rate, you shouldn't run into flash wear
issues due to using swap. Maybe add a swap file to test?

Devin

Hi Devin!

Thanks for your comments.

Meanwhile the server has 8GB Ram, 8GB Swap, 9GB free disk space and it still fails:
Jun 26 15:49:36 bulgari nsd[30968]: mremap(/var/lib/nsd/nsd.db, size
1743911936) error Cannot allocate memory

The smaller (working) server has the same overcommit settings:

# cat /proc/sys/vm/overcommit_memory
0

Maybe the memory mapping hits some limits on 32bit OS.
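
As a back-of-the-envelope check of that suspicion, the sketch below compares
the mapping size from the log with the address space a 32bit process can
use. The 3 GiB figure is an assumption (the usual 3G/1G user/kernel split on
32bit x86 Linux), and the helper names are made up for illustration:

```python
import struct

GIB = 1024 ** 3

def pointer_bits():
    """Width of a pointer in the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8

def mapping_share(map_size, user_space_bytes):
    """Fraction of the user address space one contiguous mapping would occupy."""
    return map_size / user_space_bytes

NSD_DB_MAP_SIZE = 1743910912   # size reported in the mremap error above
USER_SPACE_32BIT = 3 * GIB     # assumption: default 3G/1G split on 32bit x86 Linux

share = mapping_share(NSD_DB_MAP_SIZE, USER_SPACE_32BIT)
print(f"this interpreter uses {pointer_bits()}-bit pointers")
print(f"nsd.db mapping: {NSD_DB_MAP_SIZE / GIB:.2f} GiB, "
      f"{share:.0%} of a 3 GiB address space")
```

A single mapping that takes more than half of the address space has to fit
in one contiguous free range, next to the heap, stack and shared libraries.
If mremap has to move the mapping in order to grow it, the old and the new
range may both need to exist for a moment, which would not fit at all. That
would explain the error regardless of how much RAM and swap are installed.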

regards
Klaus

Maybe the memory mapping hits some limits on 32bit OS.

That will be it.

Any modern x86 hardware should be run with a 64 bit kernel, and the vast
majority with 64 bit userland.

The 64 bit code should be more efficient, too, given the benefits of the
amd64 instruction set over the ia32 set.

-JimC