University of Leicester
Browse

LZ78 compression in low main memory space

Download (341.92 kB)
conference contribution
posted on 2018-05-02, 16:31 authored by Diego Arroyuelo, Rodrigo Cánovas, Gonzalo Navarro, Rajeev Raman
We present the first algorithms that perform the LZ78 compression of a text of length n over alphabet [1..σ], whose output is z integers, using only O(z lg σ) bits of main memory. The algorithms read the input text from disk in a single pass, and write the compressed output to disk. The text can also be decompressed within the same main memory usage, which is unprecedented too. The algorithms are based on hashing and, under some simplifying assumptions, run in O(n) expected time. We experimentally verify that our algorithms use 2–9 times less time and/or space than previously implemented LZ78 compressors.

Funding

This collaboration started during the Dagstuhl Seminar 16431, “Computation over Compressed Structured Data”. We also acknowledge the funding from Millennium Nucleus Information and Coordination in Networks ICM/FIC RC130003 (G.N.).

History

Citation

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, 10508 LNCS, pp. 38-50

Author affiliation

/Organisation/COLLEGE OF SCIENCE AND ENGINEERING/Department of Informatics

Source

International Symposium on String Processing and Information Retrieval SPIRE 2017: String Processing and Information Retrieval

Version

  • AM (Accepted Manuscript)

Published in

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Publisher

Springer Verlag (Germany)

issn

0302-9743

eissn

1611-3349

isbn

9783319674278

Copyright date

2017

Available date

2018-05-02

Publisher version

https://link.springer.com/chapter/10.1007/978-3-319-67428-5_4

Language

en

Usage metrics

    University of Leicester Publications

    Categories

    No categories selected

    Keywords

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC