Apostolico–Giancarlo algorithm

From Infogalactic: the planetary knowledge core
Jump to: navigation, search

In computer science, the Apostolico–Giancarlo algorithm is a variant of the Boyer–Moore string search algorithm, the basic application of which is searching for occurrences of a pattern P in a text T. As with other comparison-based string searches, this is done by aligning P to a certain index of T and checking whether a match occurs at that index. P is then shifted relative to T according to the rules of the Boyer-Moore algorithm, and the process repeats until the end of T has been reached. Application of the Boyer-Moore shift rules often results in large chunks of the text being skipped entirely.

With regard to the shift operation, Apostolico-Giancarlo is exactly equivalent in functionality to Boyer-Moore. The utility of Apostolico-Giancarlo is to speed up the match-checking operation at any index. With Boyer-Moore, finding an occurrence of P in T requires that all n characters of P be explicitly matched. For certain patterns and texts, this is very inefficient - a simple example is when both pattern and text consist of the same repeated character, in which case Boyer-Moore runs in O(n m) where m is the length in characters of T. Apostolico-Giancarlo speeds this up by recording the number of characters matched at the alignments of T in a table, which is combined with data gathered during the pre-processing of P to avoid redundant equality checking for sequences of characters that are known to match.

References

  • Apostolico A., Giancarlo R., 1986, The Boyer-Moore-Galil string searching strategies revisited, SIAM Journal on Computing 15(1):98-105.
  • Crochemore, M., Lecroq, T., 1997, Tight bounds on the complexity of the Apostolico–Giancarlo algorithm, Information Processing Letters 63(4):195-203.
  • Crochemore, M., Rytter, W., 1994, Text Algorithms, Oxford University Press.
  • Gusfield, D., 1997, Algorithms on strings, trees, and sequences: Computer Science and Computational Biology, Cambridge University Press.
  • Lecroq, T., 1992, Recherches de mot, Ph. D. Thesis, University of Orléans, France.
  • Lecroq, T., 1995, Experimental results on string matching algorithms, Software - Practice & Experience 25(7):727-765.