Stephane Bortzmeyer wrote:
a message of 36 lines which said:
responses are generated on-the-fly instead of being precompiled in the database,
What are the consequences on performance? I thought that one reason
why nsd was so much faster (and does not load the CPU as much as BIND)
is the precompiled responses.
NSD 1.4.0 is slower than NSD 1.2.2 and previous versions. The actual drop in performance very much depends on:
1. Zone size.
2. CPU performance of the server.
3. Network I/O performance of the server.
Remember that on our test platform (Athlon XP 2400+, cheap and probably slow network card) NSD 1.2.x and the OS spend most of their time doing network I/O. This is still true for NSD 1.4.0.
With a small zone (for example, the root zone) the rest of the time is divided between name lookup and generating the response. NSD 1.4.0 is about 30% slower CPU-wise than NSD 1.2.2. In one test this resulted in NSD 1.2.2 being able to respond to ~18770 packets/sec, with NSD 1.4.0 getting ~15964.
With a large zone (900,000+ domains) most of the non-network I/O CPU time is spent performing name lookups. Generating packets takes a relatively small percentage of CPU time, so the difference between 1.2.2 and 1.4.0 is much smaller: 1.2.2: 15649 p/s vs 1.4.0: 15082 p/s.
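To put the two measurements side by side, here is a small Python sketch that turns the packet rates quoted above into a relative drop. The function name `throughput_drop` is only an illustration, and the inputs are the same preliminary numbers, so treat the result as a rough indication. It comes out at roughly 15% for the small zone and 4% for the large zone. The packet-rate drop is a different measure from the CPU-wise figure, and it is presumably smaller because much of the time goes to network I/O.

```python
def throughput_drop(old_pps, new_pps):
    """Fractional loss in answered packets/sec when going from old to new."""
    return (old_pps - new_pps) / old_pps

# Packet rates quoted above (preliminary numbers from one test run each).
small_zone = throughput_drop(18770, 15964)   # root zone: NSD 1.2.2 vs 1.4.0
large_zone = throughput_drop(15649, 15082)   # 900,000+ domains: 1.2.2 vs 1.4.0

print(f"small zone: {small_zone:.1%} fewer packets/sec")
print(f"large zone: {large_zone:.1%} fewer packets/sec")
```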
Another difference is memory usage. NSD 1.4.0 uses less memory than 1.2.2 on a 32-bit platform. Memory usage is probably similar or slightly more on a 64-bit platform due to the heavy use of pointers in NSD 1.4.0.
Note that NSD 1.4.0 is still much faster than BIND 8.3.6 (about 4 times less CPU usage; NSD 1.2.2 uses about 4.5 times less CPU than BIND in the tests that I did using a large zone). In that test run NSD 1.2.2 managed to answer about 4.2 million packets, 1.4.0 about 4.1 million, and BIND 8.3.6 about 1.2 million. I'm not sure how many packets actually got sent and I don't have the notes here right now, so these numbers are what I can remember.
Note that all these numbers are preliminary. I've spent enough time on performance testing to see that performance was not greatly affected by the changes, but not enough to get precise, high-quality numbers.
If we start generating responses on-the-fly, what about round-robin?
I don't think we want to go there. I also do not know what effects round-robin has on DNSSEC etc. Right now our priority is to get DNSSEC working correctly. We knew DNSSEC would have some performance cost and so far the cost seems acceptable to us. After DNSSEC is working correctly we may look at CPU and memory usage to see what we can improve.
Hopefully people will test performance on their own machines and let us know if the performance drop is not acceptable. I'm especially interested in numbers from servers with (relatively) slow CPUs and fast network I/O.
A final note on why we made these changes. Adding DNSSEC in the "obvious" way to the precompiled answer database would probably have resulted in a database that was 2-3 times larger just because of duplication of answer data (non-DNSSEC answer, DNSSEC answer, DNSSEC NXDOMAIN answer). Combined with the fact that a signed zone is about 4-6 (?) times larger than the non-signed zone, this would result in a database that was 8-18 times larger. Even in the best case that would result in a 1.3 GB NL zone database (170 MB currently). This gets pretty close to the limits of 32-bit machines and architectures.
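The size estimate above is just the two ranges multiplied together. A minimal Python sketch of that arithmetic follows. The helper name `estimate_range` is made up for illustration, and the factors are the rough guesses given above rather than measurements.

```python
CURRENT_DB_MB = 170  # current (unsigned) NL zone database size, as quoted above

def estimate_range(dup_factor, signed_factor, base_mb):
    """Multiply the low ends and the high ends of the two growth ranges."""
    lo = dup_factor[0] * signed_factor[0]
    hi = dup_factor[1] * signed_factor[1]
    return lo, hi, lo * base_mb, hi * base_mb

# 2-3x from duplicated answers, 4-6x from signing the zone (both rough guesses).
lo, hi, lo_mb, hi_mb = estimate_range((2, 3), (4, 6), CURRENT_DB_MB)

print(f"growth factor: {lo}-{hi}x")
print(f"database size: {lo_mb}-{hi_mb} MB "
      f"({lo_mb / 1024:.1f}-{hi_mb / 1024:.1f} GiB)")
```

The low end lands at about 1.3 gig, which is the best case mentioned above. The high end is around 3 gig, which is where a 32-bit address space really starts to hurt.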
The alternative would be to cut these packets up (non-DNSSEC part, DNSSEC parts) and combine them depending on the query. This would require less memory but more CPU time. We figured this approach was too complicated and would not gain us much in CPU performance over the approach taken with NSD 1.4.0. The signed NL zone 1.4.0 database is just 264 MB, with NSD using about 450 MB of memory when running on a 32-bit machine. The signed zone file itself is 325 MB, larger than the compiled database.
I hope this clarifies things more than it obfuscates.
Erik