File handling for merge sort. More...
#include <tpie/serialization_sorter.h>
File handling for merge sort.
This class abstracts away the details of numbering run files; tracking the number of runs in each merge level; informing the TPIE stats framework of the temporary size; deleting run files after use.
The important part of the state is the tuple consisting of (a, b, c) := (fileOffset, nextLevelFileOffset, nextFileOffset). a
is the first file in the level currently being merged; b
is the first file in the level being merged into; c
is the next file to write output to.
We let remainingRuns := b - a, and nextLevelRuns := c - b.
The tuple (remainingRuns, nextLevelRuns) has the following transitions: On open_new_writer(): (x, y) -> (x, 1+y), On open_readers(fanout): (fanout+x, y) -> (fanout+x, y), On open_readers(fanout): (0, fanout+y) -> (fanout+y, 0), On close_readers_and_delete(): (fanout+x, y) -> (x, y).
During run formation (the first phase of merge sort), we repeatedly call open_new_writer() and close_writer() to write out runs to the disk.
After run formation, we call open_readers(fanout) to advance into the first level of the merge heap (so one can think of run formation as a "zeroth level" in the merge heap).
As a slight optimization, when remaining_runs() == 1, one may call move_last_reader_to_next_level() to move the remaining run into the next merge level without scanning through and copying the single remaining run.
See serialization_sorter::merge_runs() for the logic involving next_level_runs() and remaining_runs() in a loop.
Definition at line 265 of file serialization_sorter.h.