The Lempel-Ziv factorization (LZ77) and the Run-Length encoded Burrows-Wheeler Transform (RLBWT) are two important tools in text compression and indexing, being their sizes z and r closely related to the amount of text self-repetitiveness. In this paper we consider the problem of converting the two representations into each other within a working space proportional to the input and the output. Let n be the text length. We show that RLBW T can be converted to LZ77 in O(n log r) time and O(r) words of working space. Conversely, we provide an algorithm to convert LZ77 to RLBW T in O n(log r + log z) time and O(r + z) words of working space. Note that r and z can be constant if the text is highly repetitive, and our algorithms can operate with (up to) exponentially less space than naive solutions based on full decompression.
|Data di pubblicazione:||2017|
|Titolo:||From LZ77 to the run-length encoded burrows-wheeler transform, and back|
|Titolo del libro:||28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, Leibniz International Proceedings in Informatics, LIPIcs|
|Digital Object Identifier (DOI):||http://dx.doi.org/10.4230/LIPIcs.CPM.2017.17|
|Appare nelle tipologie:||4.1 Articolo in Atti di convegno|
File in questo prodotto: