Polyphase merge sort

A polyphase merge sort is an algorithm which decreases the number of runs at every iteration of the main loop by merging runs into larger runs. It is used for external sorting.

Ordinary merge sort

Typically, a merge sort splits items into sorted runs and then recursively merges each run into larger runs. When there's only one run left, that is the sorted result.

An ordinary merge sort could use four working files organized as a pair of input files and a pair of output files. At each iteration, two input files are read. The odd numbered runs of the two input files are merged to the first output file, and the even numbered runs are merged to the second output file. When the input is exhausted, the new output files are used as the input for the next iteration. The number of runs decreases by a factor of 2 at each iteration. At each iteration, the same level/phase of merge occurs—a file is either completely read or completely written during an iteration.

If the four files were on four separate tape drives, watching an ordinary merge sort would show some interesting details. On the first iteration, only one input drive is used—the other input file is empty. On subsequent iterations, each input drive runs at half speed,^[1] while one output drive runs at full speed and the second output drive stands idle waiting for the next run. The situation is even worse when six tape drives are used—at least two will stand idle. Someone watching the tapes spin would wonder if the idle drives could be more useful.

The polyphase merge found a way to use the idle drives. It can sort using just three sequential files rather than the four required by merge sort.

Polyphase merge

The polyphase merge changes the game. There might be $N$ files, but the polyphase merge will read from $N-1$ files and write only one output file at a time. The writing to that output file continues until an input file is exhausted, and then that input file becomes the new output file. The total number of runs in the set of files is related to Fibonacci numbers and Fibonacci numbers of higher order.^[2]^[3]

Perfect 3 file polyphase merge sort

It is easiest to look at the polyphase merge starting from its ending conditions and working backwards. At the start of each iteration, there will be two input files and one output file. At the end of the iteration, one input file will have been completely consumed and will become the output file for the next iteration. The current output file will become an input file for the next iteration. The remaining files (just one in the 3 file case) have only been partially consumed and their remaining runs will be input for the next iteration.

File 1 just emptied and became the new output file. One run is left on each input tape, and merging those runs together will make the sorted file.

File 1 (out):                                           <1 run> *        (the sorted file)
File 2 (in ): ... | <1 run> *               -->     ... <1 run> | *          (consumed)
File 3 (in ):     | <1 run> *                           <1 run> | *          (consumed)

...  possible runs that have already been read
|    marks the read pointer of the file
*    marks end of file

Stepping back to the previous iteration, we were reading from 1 and 2. One run is merged from 1 and 2 before file 1 goes empty. Notice that file 2 is not completely consumed—it has one run left to match the final merge (above).

File 1 (in ): ... | <1 run> *                      ... <1 run> | *
File 2 (in ):     | <2 run> *           -->            <1 run> | <1 run> *
File 3 (out):                                          <1 run> *

Stepping back another iteration, 2 runs are merged from 1 and 3 before file 3 goes empty.

File 1 (in ):     | <3 run>                        ... <2 run> | <1 run> *
File 2 (out):                               -->        <2 run> *
File 3 (in ): ... | <2 run> *                          <2 run> | *

Stepping back another iteration, 3 runs are merged from 2 and 3 before file 2 goes empty.

File 1 (out):                                          <3 run> *
File 2 (in ): ... | <3 run> *               -->    ... <3 run> | *
File 3 (in ):     | <5 run> *                          <3 run> | <2 run> *

Stepping back another iteration, 5 runs are merged from 1 and 2 before file 1 goes empty.

File 1 (in ): ... | <5 run> *                      ... <5 run> | *
File 2 (in ):     | <8 run> *               -->        <5 run> | <3 run> *
File 3 (out):                                          <5 run> *

Distribution for polyphase merge sort

Looking at the the perfect 3 file case, the number of runs for merged working backwards: 1, 1, 2, 3, 5, ... reveals a Fibonacci sequence. The sequence for more than 3 files is a bit more complicated; for 4 files, starting at the final state and working backwards, the run count pattern is {1,0,0,0}, {0,1,1,1}, {1,0,2,2}, {3,2,0,4}, {7,6,4,0}, ... .

For everything to work out right, the initial file to be sorted must be distributed to the proper input files and each input file must have the correct number of runs on it. In the example, that would mean an input file with 13 runs would write 5 runs to file 1 and 8 runs to file 2.

In practice, the input file won't happen to have a ideal number of runs (and the number of runs won't be known until after the file has been read). The fix for this is to logically pad the input files as needed with dummy runs of size 0 to create the ideal initial pattern. Since the dummy runs have size 0, some method to keep of track dummy runs is used. Merging one or more dummy runs with one or more real runs just merges the real runs. If merging one or more dummy runs with just one real run, the real run is just copied.

Comparison versus ordinary merge sort

An ordinary merge sort using 4 files will combine 16 runs in 4 passes (complete reads of all input files). The polyphase merge using 3 files will combine 13 runs in 5 phases (complete read of one input file). Alternatively, a polyphase merge using 4 files will combine 17 runs in 4 phases: starting runs are {7,6,4,0}, at the end of phase 1 the runs are {3,2,0,4}, at the end of phase 2 the runs are {1,0,2,2}, at the end of phase 3 runs are {0,1,1,1}, and at the end of phase 4 the runs are {1,0,0,0} (i.e., one sorted run).

An iteration (or pass) in ordinary merge sort involves reading and writing the entire file. An iteration in a polyphase sort does not read or write the entire file,^[4] so a typical polyphase iteration will take less time than a merge sort iteration. Additionally, on tapes that can be read backward (even if they can only be written forward) there will be no intermediate rewinds: after the distribution phase (where the input tape contents are distributed among the other tapes) all tapes are read only backward. This means "straight runs" and "reversed runs" have to be set up correctly so that the last run on each tape is a reversed run which, read backward, produces one sorted forward run on the final output tape.

For an ordinary merge sort, the run size doubles on each merge iteration, while for polyphase merge sort, the run size increases by less than double, following the pattern for the number of files involved. In general, polyphase merge sort is better than ordinary merge sort when there are less than 8 files.^[5], while ordinary merge sort starts to become better at around 8 or more files. For an external disk sort with a large amount of memory to allow large sequential read / write operations to reduce random access overhead, an ordinary merge sort using a 9 (18 files) to 16 (32 files) way merge is faster (see external sort).

References

↑ The two input drives are throttled by the output drive's speed. They cannot provide data faster than the output drive can write it.
↑ Donald Knuth, The Art of Computer Programming, Volume 3, Addison Wesley, 1973, Algorithm 5.4.2D.
↑ http://oopweb.com/Algorithms/Documents/Sman/Volume/ExternalSorting.html
↑ The first and last iterations do read and write the entire file.
↑ http://bluehawk.monmouth.edu/rclayton/web-pages/s06-503/esort.html

External links

[1] The two input drives are throttled by the output drive's speed. They cannot provide data faster than the output drive can write it.

[Knuth1973-2] Donald Knuth, The Art of Computer Programming, Volume 3, Addison Wesley, 1973, Algorithm 5.4.2D.

[3] ttp://oopweb.com/Algorithms/Documents/Sman/Volume/ExternalSorting.html

[4] The first and last iterations do read and write the entire file.

[5] ttp://bluehawk.monmouth.edu/rclayton/web-pages/s06-503/esort.html

[1]

[2]

[3]

[4]

[5]

v t e Sorting algorithms
Theory	Computational complexity theory Big O notation Total order Lists Inplacement Stability Comparison sort Adaptive sort Sorting network Integer sorting X + Y sorting Transdichotomous model Quantum sort
Exchange sorts	Bubble sort Cocktail shaker sort Odd–even sort Comb sort Gnome sort Quicksort Stooge sort Bogosort
Selection sorts	Selection sort Heapsort Smoothsort Cartesian tree sort Tournament sort Cycle sort
Insertion sorts	Insertion sort Shellsort Splaysort Tree sort Library sort Patience sorting
Merge sorts	Merge sort Cascade merge sort Oscillating merge sort Polyphase merge sort
Distribution sorts	American flag sort Bead sort Bucket sort Burstsort Counting sort Pigeonhole sort Proxmap sort Radix sort Flashsort
Concurrent sorts	Bitonic sorter Batcher odd–even mergesort Pairwise sorting network
Hybrid sorts	Block sort Timsort Introsort Spreadsort
Other	Topological sorting Pancake sorting Spaghetti sort

Polyphase merge sort

Contents

Ordinary merge sort

Polyphase merge

Perfect 3 file polyphase merge sort

Distribution for polyphase merge sort

Comparison versus ordinary merge sort

References

Further reading

External links

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools