[unbound] memory usage

Hi,
we have been using unbound for a long time, and we are very happy with it.

But I would like to know a little about memory usage.
Lately we are seeing that the unbound process grows until it uses all available memory and starts swapping, causing a big loss of performance (latency, dropped packets, etc.).

The odd thing is that the stats metrics (mem.*) are stable: they grow rapidly after startup, reach a plausible maximum, and don't keep growing.

But the process size does.

For example, two servers, same config, same hardware:
version 1.9.1, on Red Hat 8.7

Server A, uptime 2 hours:
unbound-control stats_noreset | grep mem
mem.cache.rrset=285212642
mem.cache.message=142606338
mem.mod.iterator=16748
mem.mod.validator=25689380
mem.mod.respip=0
mem.mod.subnet=61555940
mem.streamwait=0
mem.http.query_buffer=0
mem.http.response_buffer=0

Unbound process RES size 1.6 GB, VIRT 1.8 GB

Server B, uptime 6 days:
mem.cache.rrset=285212302
mem.cache.message=142606461
mem.mod.iterator=16748
mem.mod.validator=25689867
mem.mod.respip=0
mem.mod.subnet=142614402
mem.streamwait=0
mem.http.query_buffer=0
mem.http.response_buffer=0

Unbound process RES size 5.5 GB, VIRT 6.2 GB

As you can see, the only difference in the reported memory is mem.mod.subnet, 60 MB vs. 140 MB, but that limit is reached after 4 or 5 hours of running and stays there.
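
For reference, this is roughly how I compare unbound's own accounting with what the kernel sees (just a quick sketch using standard tools):

  # sum unbound's own mem.* counters (reported in bytes)
  unbound-control stats_noreset | awk -F= '/^mem\./ {sum+=$2} END {printf "unbound mem.* total: %.1f MB\n", sum/1024/1024}'
  # compare with the kernel's view (ps reports RSS in KiB)
  ps -o rss= -C unbound | awk '{printf "process RSS: %.1f MB\n", $1/1024}'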

Why is it using almost 4 GB more after a couple of days while the caches are stable?
Is there some way to control this?

We are restarting unbound every two days now (while we wait for a bit more RAM).

Thanks!!

Hi,
We had some memory issues with unbound here after a Debian update brought a new kernel.
It was a (well-known) transparent hugepage (THP) issue. Changing the setting from always to madvise fixed the problem.

You might (or might not) be hitting the same issue?
Check https://access.redhat.com/solutions/46111
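
You can also see what the kernel is currently set to; the active value is the one shown in brackets:

  cat /sys/kernel/mm/transparent_hugepage/enabled
  # e.g. "[always] madvise never" means THP is fully enabled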

And maybe just try it for yourself right now at runtime:
echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
echo madvise > /sys/kernel/mm/transparent_hugepage/defrag

then restart unbound…
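
If you want to confirm THP is actually involved, one thing you could compare before and after the change (assuming a single unbound process and a kernel new enough to have smaps_rollup) is how much of unbound's anonymous memory is huge-page backed:

  # anonymous memory of the unbound process backed by transparent hugepages
  grep AnonHugePages /proc/$(pidof unbound)/smaps_rollup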

hey,

> The odd thing is that the stats metrics (mem.*) are stable: they grow rapidly after startup, reach a plausible maximum, and don't keep growing.
>
> But the process size does.

We saw the same on Debian, due to THP, as Oliver already mentioned. See https://github.com/NLnetLabs/unbound/issues/724

We "solved" this by disabling THP on our unbound machines.

I have to disable THP so often that I ended up writing a script for it. Running it as a service is, in my opinion, much more convenient than any other approach.

https://github.com/yvoinov/memory-tools/blob/main/disable_thp.sh
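
If you would rather avoid an external script, a minimal systemd oneshot unit along these lines does the same job (the unit name and contents are an illustrative sketch, not the linked script):

  # /etc/systemd/system/disable-thp.service (illustrative name)
  [Unit]
  Description=Set transparent hugepages to madvise
  Before=unbound.service

  [Service]
  Type=oneshot
  RemainAfterExit=yes
  ExecStart=/bin/sh -c 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled'
  ExecStart=/bin/sh -c 'echo madvise > /sys/kernel/mm/transparent_hugepage/defrag'

  [Install]
  WantedBy=multi-user.target

  # then: systemctl daemon-reload && systemctl enable --now disable-thp.service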

Actually, on Debian we use the standard sysfsutils package, which I recommend for this (it is available on Red Hat, too); it is to /sys what sysctl is to /proc:

  1. install sysfsutils either with apt install or dnf install

  2. configure the settings:
    echo "kernel/mm/transparent_hugepage/enabled = madvise" > /etc/sysfs.d/transparent_hugepage.conf
    echo "kernel/mm/transparent_hugepage/defrag = madvise" >> /etc/sysfs.d/transparent_hugepage.conf
    systemctl enable sysfsutils.service

  3. either reboot, or just restart the services, including unbound:
    systemctl restart sysfsutils.service
    systemctl restart unbound.service

That's it; the config will survive reboots and updates.
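
After the restart you can double-check that the values stuck (the active setting on each line is the one in brackets):

  grep . /sys/kernel/mm/transparent_hugepage/{enabled,defrag}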

Thanks!!!

I disabled THP on one server to compare, and it looks promising.
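
In case anyone wants to follow along, I'm tracking it with a trivial loop like this (just a sketch; the log path is arbitrary):

  # append a timestamped RSS sample once an hour
  while sleep 3600; do
      echo "$(date -Is) unbound RSS: $(ps -o rss= -C unbound | awk '{print $1/1024 " MB"}')"
  done >> /var/tmp/unbound_rss.log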