DBMS Design Decisions
As with any complex software, designing and implementing a database management solution involves setting goals, then making decisions and tradeoffs to meet those goals as well as you can. When we designed Objectivity/DB we set down ten major goals. The first four are relevant to this series of posts: reliability, performance, scalability and distributability.
Transaction management, particularly controlling the in-memory state of data and the persistent data on disk, is fundamental to achieving reliability. However, writing every operation or database change to a separate log file can degrade performance, and the cost matters most in high-throughput systems running in constrained environments. Imagine a one-terabyte database in a satellite. If I start a transaction that scans through every object, updates a single field, then creates new objects and links them to the original objects, I may have to create a journal that is a terabyte or more in size. What if I want to do even more in the transaction, such as using the old and new objects to generate even more objects with an iterative algorithm that may cycle through the objects many times? I may not have enough disk space in my constrained hardware environment to complete the transaction.
So, we decided to use a form of distributed, hybrid shadow paging instead of a conventional log. The technique has been used in quite a few DBMSs because you can always predict the maximum amount of scratch space that a transaction will require. Updated database pages are written to a different physical location from the one that holds the original version of the logical page. The transaction commit operation safeguards the database by writing the old page map and the new page map to a separate log file. If all of the data remaining in memory is successfully flushed to disk and made persistent by calling fsync, or something similar, the new page map can also be made persistent and the log file can then be deleted. Once that has happened, the old physical pages can be returned to a free file space pool. If the transaction is interrupted at any point, right up to the final removal of the log file, it can be rolled back by the owner process or by another one that has sufficient permissions. In the scenario I described above, we can write and rewrite the terabyte of data many times and we'll never need more than two terabytes plus a small log file.
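To make the space bound concrete, here is a minimal in-memory sketch of shadow paging in Python. It is a toy model, not Objectivity/DB's code: the class name ShadowPagedStore, its methods, and the use of a dictionary in place of disk pages and a log file are all assumptions made for illustration.

```python
class ShadowPagedStore:
    """Toy in-memory model of shadow paging (illustrative names only)."""

    def __init__(self, pages):
        # Physical slots hold page contents; the page map sends
        # each logical page number to a physical slot number.
        self.slots = dict(enumerate(pages))
        self.page_map = {i: i for i in range(len(pages))}
        self.free = []              # pool of reusable physical slots
        self.next_slot = len(pages)
        self.new_map = None         # uncommitted page map, None when idle
        self.log = None             # stands in for the separate log file

    def begin(self):
        self.new_map = dict(self.page_map)

    def _allocate(self):
        if self.free:
            return self.free.pop()
        slot = self.next_slot
        self.next_slot += 1
        return slot

    def write(self, logical, data):
        current = self.new_map[logical]
        if current == self.page_map[logical]:
            # First update in this transaction: leave the committed page
            # untouched and redirect the logical page to a shadow slot.
            current = self._allocate()
            self.new_map[logical] = current
        # Later updates of the same page reuse its shadow slot.
        self.slots[current] = data

    def read(self, logical):
        active = self.new_map if self.new_map is not None else self.page_map
        return self.slots[active[logical]]

    def _release(self, dropped, kept):
        # Return every slot that only the dropped map refers to.
        for logical, slot in dropped.items():
            if kept[logical] != slot:
                del self.slots[slot]
                self.free.append(slot)

    def commit(self):
        # Record both maps first so an interrupted commit can be undone.
        self.log = (dict(self.page_map), dict(self.new_map))
        # A real system would flush the data pages to disk at this point.
        old_map, self.page_map = self.page_map, self.new_map
        self.new_map = None
        self.log = None             # removing the log completes the commit
        self._release(old_map, self.page_map)

    def rollback(self):
        # Discard the shadow pages; the committed map was never changed.
        self._release(self.new_map, self.page_map)
        self.new_map = None
        self.log = None
```

However many times a transaction rewrites a page, the sketch holds at most one committed copy and one shadow copy of it, which is where the "never more than twice the data" bound comes from. Rewriting all four pages of a four-page store five times in one transaction, for example, never uses more than eight slots.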
Dealing with distributed databases slightly complicates the mechanism, as you may successfully update several database files on various machines and then fail to reach another one, in which case the whole transaction must be rolled back. We decided to use lock servers to help control transactions. If a client process dies while it is holding locks, you can set a lock server to automatically roll back the transactions that the failed process owned.
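The two ideas in that paragraph can be sketched in a few lines of Python. Everything here is hypothetical: DatabaseFile, update_all and LockServer are names invented for the example, the "remote" files are plain objects, and the behaviour when automatic rollback is switched off is an assumption rather than a description of the product.

```python
class DatabaseFile:
    """Hypothetical stand-in for one database file on a remote machine."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.committed = "old"
        self.pending = None

    def update(self, value):
        if not self.reachable:
            raise ConnectionError(f"cannot reach {self.name}")
        self.pending = value

    def commit(self):
        self.committed, self.pending = self.pending, None

    def rollback(self):
        self.pending = None


def update_all(files, value):
    """Apply the update to every file, or roll back the ones already touched."""
    touched = []
    try:
        for f in files:
            f.update(value)
            touched.append(f)
    except ConnectionError:
        for f in touched:
            f.rollback()
        return False
    for f in files:
        f.commit()
    return True


class LockServer:
    """Hypothetical lock server that tracks locks and transactions per client."""

    def __init__(self, auto_rollback=True):
        self.auto_rollback = auto_rollback
        self.locks = {}        # resource -> id of the client holding the lock
        self.rollbacks = {}    # client id -> rollback callbacks it registered

    def acquire(self, client, resource):
        # Grant the lock if it is free or already held by this client.
        return self.locks.setdefault(resource, client) == client

    def register_transaction(self, client, rollback):
        self.rollbacks.setdefault(client, []).append(rollback)

    def client_died(self, client):
        if not self.auto_rollback:
            return             # assumption: locks stay held until manual recovery
        for rollback in self.rollbacks.pop(client, []):
            rollback()
        for resource in [r for r, c in self.locks.items() if c == client]:
            del self.locks[resource]
```

With three files of which the last is unreachable, update_all returns False and leaves every file at its committed value. If a client that registered a transaction is then reported dead, the lock server runs the rollback and frees the client's locks so other processes can proceed.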
Labels: DBMS, Log files, performance, Shadow paging
