Set the `*-slabs` options to a power of 2 close to the `num-threads` value. Do this for `msg-cache-slabs`, `rrset-cache-slabs`, `infra-cache-slabs`, and `key-cache-slabs`. This reduces lock contention.
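A minimal `unbound.conf` sketch of that advice; the thread count of 12 is taken from the setup described below, and 16 is the nearest power of 2 at or above it:

```
server:
    num-threads: 12
    # Slab counts rounded up to the nearest power of 2 >= num-threads,
    # so cache locks are spread across enough slabs to avoid contention.
    msg-cache-slabs: 16
    rrset-cache-slabs: 16
    infra-cache-slabs: 16
    key-cache-slabs: 16
```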
I serve several hundred thousand simultaneous clients, with tens of thousands of queries per second, on only 12 threads. Cache response time is less than 1 ms; average response time is less than 10 ms. My hosts (I have 3 of them) have 16 threads/cores each, and I leave 4 threads for server busy work like stats and log collection. More threads doesn’t always mean better performance, and in your case, since your slab count is low, you’re going to have a lot of lock contention.
Just wanted to find out if there is a way to measure the cache hit resolution time in a dashboard?
Unbound has no facility to measure cache hit resolution time.
One method to measure cache-hit time is to query names that always resolve from cache, e.g. `dig -t NS .`. If the resolver is too busy and queries remain stuck in its receive queue, its responses (even for cache hits) will be delayed by the queue dwell time, and queries may even be dropped.
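A rough sketch of that measurement in Python, building a minimal DNS query for `NS .` by hand and timing one round trip over UDP. The server address, query ID, and timeout are illustrative assumptions, not anything from the thread:

```python
import socket
import struct
import time

def build_query(name: str, qtype: int, qid: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (RD=1, one question, class IN)."""
    # Header: id, flags (RD set), QDCOUNT=1, AN/NS/AR counts = 0.
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        if label:  # the root name "." yields no labels, only the null byte
            qname += bytes([len(label)]) + label.encode()
    qname += b"\x00"
    return header + qname + struct.pack(">HH", qtype, 1)

def time_query(server: str, name: str = ".", qtype: int = 2,
               timeout: float = 2.0):
    """Send one UDP query; return elapsed milliseconds, or None on timeout."""
    pkt = build_query(name, qtype)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        start = time.monotonic()
        s.sendto(pkt, (server, 53))
        try:
            s.recvfrom(4096)
        except socket.timeout:
            return None
        return (time.monotonic() - start) * 1000.0

if __name__ == "__main__":
    # Assumes a resolver listening on localhost.
    print(time_query("127.0.0.1"))
```

As noted below, this measures the whole path through the resolver's receive queue, so under load it captures dwell time as well as cache lookup time.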
Yes, as Daisuke said, it’s a very unscientific thing to measure. We use data from our load-test rig plus some baseline network latency to arrive at estimates.
Our average also includes timeouts from some exotic domains, and from records that do not exist, which probably originate from malware and all sorts of crap on our clients’ devices. It’s amazing the junk that people try to access.
How did your test go with the tuning already suggested? Did you see any improvements?