posted on 2023-06-16, 10:08authored byD Arroyuelo, R Cánovas, J Fischer, D Köppl, M Löbel, G Navarro, R Raman
The Lempel-Ziv 78 (LZ78) and Lempel-Ziv-Welch (LZW) text factorizations are popular, not only for bare compression but also for building compressed data structures on top of them. Their regular factor structure makes them computable within space bounded by the compressed output size. In this article, we carry out the first thorough study of low-memory LZ78 and LZW text factorization algorithms, introducing more efficient alternatives to the classical methods, as well as new techniques that can run within less memory space than the necessary to hold the compressed file. Our results build on hash-based representations of tries that may have independent interest.
Funding
ANID—Millennium Science Initiative Program—Code ICN17_002 (D.A. and G.N.), and from JSPS KAKENHI with grant numbers JP18F18120 and JP21K17701
History
Citation
Diego Arroyuelo, Rodrigo Cánovas, Johannes Fischer, Dominik Köppl, Marvin Löbel, Gonzalo Navarro, and Rajeev Raman. 2021. Engineering Practical Lempel-Ziv Tries. ACM J. Exp. Algorithmics 26, Article 14 (December 2021), 47 pages. https://doi.org/10.1145/3481638