<b><u>Method with Double Buffer Pages</u></b>

<u>Double Buffering</u>

Consider what happens in the external sorting algorithm when all the tuples in an input block have been consumed: an I/O request is issued for the next block of tuples in the corresponding input run, and execution is forced to suspend until the I/O is complete. That is, for the duration of the time taken to read in one block, the CPU remains idle (assuming that no other jobs are running). The overall time taken by an algorithm can be increased considerably because the CPU is repeatedly forced to wait for an I/O operation to complete. This effect becomes more and more important as CPU speeds increase relative to I/O speeds, which is a long-standing trend in relative speeds. It is therefore desirable to keep the CPU busy while an I/O request is being carried out, that is, to overlap CPU and I/O processing. Current hardware supports such overlapped computation, and it is therefore desirable to design algorithms to take advantage of this capability.

In the context of external sorting, we can achieve this overlap by allocating extra pages to each buffer. Suppose that a block size of b = 32 is chosen. The idea is to allocate an additional 32-page block to every input (and the output) buffer. Now, when all the tuples in a 32-page block have been consumed, the CPU can process the next 32 pages of the run by switching to the second, "double," block for this run. Meanwhile, an I/O request is issued to fill the empty block. Thus, assuming that the time to consume a block is greater than the time to read in a block, the CPU is never idle! On the other hand, the number of pages allocated to a buffer is doubled (for a given block size, which means the total I/O cost stays the same). This technique is called <b>double buffering</b>, and it can considerably reduce the total time taken to sort a file.
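The buffer-switching scheme described above can be sketched in code. The following is a minimal illustration, not the textbook's algorithm: it simulates one input run as a list of pages, uses a small block size of 4 pages in place of b = 32, and models the asynchronous block read with a background thread. While the CPU consumes the current block, the thread fills the second ("double") block; the consumer only waits at the switch if the read has not yet finished. All names (`read_block`, `consume_run`, `BLOCK_SIZE`) are invented for this sketch.

```python
import threading

BLOCK_SIZE = 4  # pages per block; the text uses b = 32


def read_block(run, index):
    """Simulated I/O: fetch one block (a slice of pages) from a run."""
    start = index * BLOCK_SIZE
    return run[start:start + BLOCK_SIZE]


def consume_run(run):
    """Consume every page of a run using two alternating buffers.

    While the CPU works on the current buffer, a background thread
    fills the other buffer, overlapping computation with I/O.
    """
    consumed = []
    num_blocks = (len(run) + BLOCK_SIZE - 1) // BLOCK_SIZE
    current = read_block(run, 0)  # prime the first buffer

    for i in range(num_blocks):
        loader, pending = None, {}
        if i + 1 < num_blocks:
            # Issue the I/O request for the next block *before*
            # consuming the current one, so the two overlap.
            def prefetch(idx=i + 1, out=pending):
                out["block"] = read_block(run, idx)
            loader = threading.Thread(target=prefetch)
            loader.start()

        consumed.extend(current)  # CPU work on the current buffer

        if loader is not None:
            loader.join()               # wait only if I/O is still pending
            current = pending["block"]  # switch to the "double" buffer
    return consumed


run = list(range(10))  # ten pages, so three blocks of up to four pages
print(consume_run(run))  # pages come out in order: [0, 1, ..., 9]
```

If consuming a block takes longer than reading one, `join()` returns immediately and the CPU never stalls, which is exactly the condition stated in the text.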

Note that although double buffering can considerably reduce the response time for a given query, it may not have a significant impact on throughput, because the CPU can be kept busy working on other queries while waiting for one query's I/O operation to complete.