I read the old thread regarding a dynamic update backend, but I am interested in being able to do incremental updates (appending/inserting new data into the existing DB), so our numerous zone files don't all have to be reparsed again. Is it feasible with the current setup, or will significant kludging be in order to facilitate this? I would prefer to just go with a DNS server with a DB backend, but others are naturally resistant and like their zone files, just not the slow speed they bring with them (generally speaking).
Any hints would be appreciated.
Timothy Der
Timothy Der wrote:
I read the old thread regarding a dynamic update backend, but I am interested in being able to do incremental updates (appending/inserting new data into the existing DB), so our numerous zone files don't all have to be reparsed again. Is it feasible with the current setup, or will significant kludging be in order to facilitate this? I would prefer to just go with a DNS server with a DB backend, but others are naturally resistant and like their zone files, just not the slow speed they bring with them (generally speaking).
Any hints would be appreciated.
How much time does zonec currently spend parsing and compiling the database? A quick test run shows that parsing and compiling the database is about 2.5 times slower than loading it. So a factor of about 2.5 is the most that can be gained with some kind of incremental updates.
Erik
Well, as an example, parsing and building the database for over 300,000 zones can take an hour, so a change to just one zone can't take effect immediately; it has to wait for the next build. What would be ideal is to be able to take an existing built database, parse only the zone files that have changed, and update the database. I have looked at the code, and the problem I have is that there is no way to free an individual pointer within a region, and I also don't see any delete routines for the red-black tree code. That leaves stuffing the new records into the existing entry (for a zone that was already in the database), and I haven't figured out a safe way to do that yet. Once the "diff" is applied, the database file can be written out again and reloaded by nsd.
Peter
Erik Rozendaal wrote:
Timothy Der wrote:
I read the old thread regarding a dynamic update backend, but I am interested in being able to do incremental updates (appending/inserting new data into the existing DB), so our numerous zone files don't all have to be reparsed again. Is it feasible with the current setup, or will significant kludging be in order to facilitate this? I would prefer to just go with a DNS server with a DB backend, but others are naturally resistant and like their zone files, just not the slow speed they bring with them (generally speaking).
Any hints would be appreciated.
How much time does zonec currently spend parsing and compiling the database? A quick test run shows that parsing and compiling the database is about 2.5 times slower than loading it. So a factor of about 2.5 is the most that can be gained with some kind of incremental updates.
Erik
Thanks for the quick reply. It takes just over an hour to rebuild the DB. So you figure that the shortest possible compile time (with incremental updates) would be at least 2.5 times the load time? Thanks for your help.
Tim
Peter A. Friend wrote:
Well, as an example, parsing and building the database for over 300,000 zones can take an hour, so a change to just one zone can't take effect immediately; it has to wait for the next build.
Is this with NSD 2.3.0? Kazunori Fujiwara provided us with a patch that speeds up zonec a lot when many zones need to be parsed. This patch is part of 2.3.0.
What would be ideal is to be able to take an existing built database, parse only the zone files that have changed, and update the database. I have looked at the code, and the problem I have is that there is no way to free an individual pointer within a region, and I also don't see any delete routines for the red-black tree code. That leaves stuffing the new records into the existing entry (for a zone that was already in the database), and I haven't figured out a safe way to do that yet. Once the "diff" is applied, the database file can be written out again and reloaded by nsd.
Regions were implemented so that everything that needs to be freed together is allocated from a single region. This helps performance and makes the logic for freeing large amounts of data simpler. So if parts of the data can be freed earlier, you should allocate them from a separate region and free that.
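To illustrate the idea, here is a stand-alone sketch of how such a region allocator behaves (the types and function names below are invented for illustration and are not NSD's actual region-allocator API): individual pointers handed out by a region cannot be freed on their own; the whole region is released in one go. Giving each zone its own region would therefore let one zone's data be dropped without touching anything else.

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

/* Hypothetical, simplified region allocator for illustration only. */
struct chunk {
    struct chunk *next;
    /* payload follows the header */
};

struct region {
    struct chunk *chunks;   /* list of chunks owned by this region */
};

struct region *region_create(void)
{
    struct region *r = malloc(sizeof(*r));
    if (r) r->chunks = NULL;
    return r;
}

/* Allocate from the region.  There is deliberately no matching
 * "free one pointer" call: memory is only reclaimed when the whole
 * region is destroyed. */
void *region_alloc(struct region *r, size_t size)
{
    struct chunk *c = malloc(sizeof(struct chunk) + size);
    if (!c) return NULL;
    c->next = r->chunks;
    r->chunks = c;
    return c + 1;           /* memory right after the header */
}

/* Free everything allocated from the region in one go. */
void region_destroy(struct region *r)
{
    struct chunk *c = r->chunks;
    while (c) {
        struct chunk *next = c->next;
        free(c);
        c = next;
    }
    free(r);
}

int main(void)
{
    /* One region per zone: the whole zone can be dropped later
     * without affecting data owned by other zones. */
    struct region *zone_region = region_create();
    char *owner = region_alloc(zone_region, 32);
    strcpy(owner, "www.example.com.");
    printf("allocated in zone region: %s\n", owner);
    region_destroy(zone_region);  /* frees 'owner' and everything else */
    return 0;
}

Roughly speaking, that is what the suggestion above amounts to: allocate each zone's records from its own region so that a single zone can be freed and rebuilt independently.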
Another approach you can consider (if the 2.3.0 zonec is still too slow) is to modify NSD so it can load multiple databases on startup, and to compile every zone (or some grouping of zones) into its own .db file.
Of course, with this modification NSD will still have to read in all the .db files. But hopefully that will be fast enough.
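A rough sketch of the startup side of that idea (everything here is hypothetical; the stub below only opens each file, where the real code would run NSD's database loader): the server simply iterates over a list of .db files and loads each one. Only the .db files whose zones changed would then need to be regenerated by zonec.

#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the real database loader; here it only checks that
 * the file can be opened and reports its size. */
static int load_one_database(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size;
    if (!f) {
        perror(path);
        return -1;
    }
    fseek(f, 0, SEEK_END);
    size = ftell(f);
    fclose(f);
    printf("loaded %s (%ld bytes)\n", path, size);
    return 0;
}

int main(int argc, char **argv)
{
    int i;
    /* e.g. ./multidb group1.db group2.db group3.db */
    for (i = 1; i < argc; i++) {
        if (load_one_database(argv[i]) != 0)
            return 1;   /* refuse to start with a missing/broken db */
    }
    return 0;
}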
Erik
Timothy Der wrote:
Thanks for the quick reply. It takes just over an hour to rebuild the DB. So you figure that the shortest possible compile time (with incremental updates) would be at least 2.5 times the load time? Thanks for your help.
Well, if zonec takes an hour to generate the database, then it would take NSD about 25 minutes to load the database into memory (assuming the factor of 2.5 is correct).
If NSD loads much quicker there may be some other bottleneck in zonec.
Erik
Erik Rozendaal wrote:
Timothy Der wrote:
Thanks for the quick reply. It takes just over an hour to rebuild the DB. So you figure that the shortest possible compile time (with incremental updates) would be at least 2.5 times the load time? Thanks for your help.
Well, if zonec takes an hour to generate the database, then it would take NSD about 25 minutes to load the database into memory (assuming the factor of 2.5 is correct).
If NSD loads much quicker there may be some other bottleneck in zonec.
Erik
Thanks. I'm trying to get the actual numbers now. In your opinion, is it doable to add incremental update functionality without being too intrusive on the existing code base? (I'm guessing modifications to zonec would be the only changes needed.) Can new records simply be appended to the DB, or would insertion have to be done? I'm still kind of fuzzy on the DB file setup.
Tim
Timothy Der wrote:
Thanks. I'm trying to get the actual numbers now. In your opinion, is it doable to add incremental update functionality without being too intrusive on the existing code base? (I'm guessing modifications to zonec would be the only changes needed.) Can new records simply be appended to the DB, or would insertion have to be done? I'm still kind of fuzzy on the DB file setup.
The database format really isn't designed for incremental changes. The multiple database approach I outlined in the e-mail to Peter Friend is probably easier.
Another approach might be to have zonec load an existing database, then parse the updated zone from the zone file and write a completely new DB file.
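However the new database is produced, one detail worth getting right is writing it to a temporary file and renaming it over the old one, so nsd can never pick up a half-written database when it reloads. A minimal sketch of that pattern (the file names are just examples):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *final_path = "nsd.db";       /* example path */
    const char *tmp_path   = "nsd.db.new";   /* must be on the same filesystem */
    FILE *out = fopen(tmp_path, "wb");

    if (!out) {
        perror(tmp_path);
        return 1;
    }
    /* ... write the complete new database here ... */
    fputs("placeholder database contents\n", out);

    if (fclose(out) != 0) {
        perror("fclose");
        return 1;
    }
    /* rename() is atomic on POSIX filesystems, so a reader sees either
     * the old database or the complete new one, never a partial file. */
    if (rename(tmp_path, final_path) != 0) {
        perror("rename");
        return 1;
    }
    /* nsd can now be told to reload the database. */
    return 0;
}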
Erik
The database format really isn't designed for incremental changes. The multiple database approach I outlined in the e-mail to Peter Friend is probably easier.
This is an interesting approach. We are trying to avoid the I/O overhead of loading such a large number of files. In another test we copied the zone files to a tmpfs backed by RAM; the combination of copying from disk to tmpfs and parsing from there was substantially cheaper than parsing directly from disk. I don't have detailed numbers yet. I am using the latest version.
Another approach might be to have zonec load an existing database, then parse the updated zone from the zone file and write a completely new DB file.
Yes, that's exactly what I was proposing; I just didn't word it very well. And that's where I got stuck: with the existing DB loaded into memory, I need to either delete the existing entry or just overwrite it. If it can be overwritten safely, then dumping the DB back to disk for eventual reload by nsd looks easy.
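For what it's worth, here is a toy sketch of the overwrite idea combined with the per-region advice above (the structures and names are made up for illustration, not NSD's): build the replacement zone data completely, swap the pointer held by the existing zone entry, and only then free the old data. Because the entry in the tree is reused, no red-black tree delete is needed.

#include <stdio.h>
#include <stdlib.h>

/* Toy stand-ins for the real structures. */
struct zone_data {
    char soa_serial[16];
    /* ... RRs would live here, all owned by one per-zone region ... */
};

struct zone_entry {
    char name[64];
    struct zone_data *data;   /* owned by a per-zone allocation */
};

/* Build replacement data from the (re)parsed zone file. */
static struct zone_data *build_zone_data(const char *serial)
{
    struct zone_data *d = malloc(sizeof(*d));  /* per-zone region in NSD terms */
    if (d) snprintf(d->soa_serial, sizeof(d->soa_serial), "%s", serial);
    return d;
}

int main(void)
{
    struct zone_entry entry = { "example.com.", NULL };
    struct zone_data *fresh;
    struct zone_data *old;

    entry.data = build_zone_data("2005010101");
    if (!entry.data) return 1;

    /* Incremental update: build the new data completely first ... */
    fresh = build_zone_data("2005010102");
    if (!fresh) return 1;

    /* ... then swap it into the existing entry and free the old data.
     * The entry (and the tree node that points at it) is reused, so no
     * red-black tree delete is needed. */
    old = entry.data;
    entry.data = fresh;
    free(old);     /* with a per-zone region this would be a region destroy */

    printf("%s now at serial %s\n", entry.name, entry.data->soa_serial);
    free(entry.data);
    return 0;
}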
Peter
Erik Rozendaal wrote:
Timothy Der wrote:
Thanks for the quick reply. It takes just over an hour to rebuild the DB. So you figure that the shortest possible compile time (with incremental updates) would be at least 2.5 times the load time? Thanks for your help.
Well, if zonec takes an hour to generate the database, then it would take NSD about 25 minutes to load the database into memory (assuming the factor of 2.5 is correct).
If NSD loads much quicker there may be some other bottleneck in zonec.
Erik
Attached is the answer I received.
Peter A. Friend wrote:
This is an interesting approach. We are trying to avoid the I/O overhead of loading such a large number of files. In another test we copied the zone files to a tmpfs backed by RAM; the combination of copying from disk to tmpfs and parsing from there was substantially cheaper than parsing directly from disk. I don't have detailed numbers yet. I am using the latest version.
That's rather strange. Zonec only performs a single fopen/fclose per zone file. A copy should do something similar.
How are the zone files organized? Subdivided into many directories? In that case it might be worth it to make sure the .zones file is organized so that all zones from a single directory are parsed first before moving on to the zones from the next directory. Basically sort the .zones file on the dirname/filename field for each zone.
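Assuming the .zones file is a plain text file with one zone per line, the zone name followed by the path of its zone file (that layout is an assumption on my part), a small helper like the sketch below could sort it on the path field so that all zones in one directory are compiled before moving on to the next:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Return a pointer to the second whitespace-separated field of the
 * line (assumed to be the zone file path); if there is no second
 * field, fall back to the whole line. */
static const char *path_field(const char *line)
{
    const char *p = line;
    while (*p && !isspace((unsigned char)*p)) p++;  /* skip zone name */
    while (*p && isspace((unsigned char)*p)) p++;   /* skip separator */
    return *p ? p : line;
}

static int by_path(const void *a, const void *b)
{
    const char *const *la = a;
    const char *const *lb = b;
    return strcmp(path_field(*la), path_field(*lb));
}

int main(void)
{
    char buf[1024];
    char **lines = NULL;
    size_t n = 0, cap = 0, i;

    /* Read the zone list from stdin, write the sorted list to stdout. */
    while (fgets(buf, sizeof(buf), stdin)) {
        if (n == cap) {
            cap = cap ? cap * 2 : 1024;
            lines = realloc(lines, cap * sizeof(*lines));
            if (!lines) return 1;
        }
        lines[n] = strdup(buf);
        if (!lines[n]) return 1;
        n++;
    }

    qsort(lines, n, sizeof(*lines), by_path);

    for (i = 0; i < n; i++) {
        fputs(lines[i], stdout);
        free(lines[i]);
    }
    free(lines);
    return 0;
}

Run it as, say, sort_zones < nsd.zones > nsd.zones.sorted (the program name is made up) and point zonec at the sorted list.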
Erik