posted on 2013-07-11, 12:05authored byPhilip Bille, Gad M. Landau, Rajeev Raman, Kunihiko Sadakane, Srinivasa Rao Satti, Oren Weimann
Let S be a string of length N compressed into a context-free grammar S of size n. We present two representations of S achieving O(logN) random access time, and either O(n · α[subscript k](n)) construction time and space on the pointer machine model, or 0(n) construction time and space on the RAM. Here, α[subscript k](n) is the inverse of the k[superscript th] row of Ackermann's function. Our representations also efficiently support decompression of any substring in S: we can decompress any substring of length m in the same complexity as a single random access query and additional O(m) time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern P with at most k errors in time O(n(min{|P|k,k[superscript 4] + |P|}+logN)+occ), where occ is the number of occurrences of P in S. Finally, we are able to generalize our results to navigation and other operations on grammar-compressed trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two "biased" weighted ancestor data structures, and a compact representation of heavy-paths in grammars.
History
Citation
Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, 2011, pp. 373-389
Author affiliation
/Organisation/COLLEGE OF SCIENCE AND ENGINEERING/Department of Computer Science
Source
The Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2011), San Francisco, California
Version
VoR (Version of Record)
Published in
Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms
Publisher
Society for Industrial and Applied Mathematics (SIAM)