HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances
Prezza N.
2022-01-01
Abstract
We propose a new representation of the offsets of the Lempel-Ziv (LZ) factorization based on the co-lexicographic order of the text's prefixes. The selected offsets tend to approach the k-th order empirical entropy. Our evaluations show that this choice is superior to the rightmost and bit-optimal LZ parsings on datasets with small high-order entropy.

Files in this item:
File | Size | Format |
---|---|---|
High_Order_LZ77.pdf (not available) | 428.36 kB | Adobe PDF |

Type: Pre-print document
License: Closed access (personal)
Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.