We propose a new representation of the offsets of the Lempel-Ziv (LZ) factorization based on the co-lexicographic order of the text's prefixes. The selected offsets tend to approach the k-th order empirical entropy. Our evaluations show that this choice is superior to the rightmost and bit-optimal LZ parsings on datasets with small high-order entropy.

HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances

Prezza N.
2022

Abstract

We propose a new representation of the offsets of the Lempel-Ziv (LZ) factorization based on the co-lexicographic order of the text's prefixes. The selected offsets tend to approach the k-th order empirical entropy. Our evaluations show that this choice is superior to the rightmost and bit-optimal LZ parsings on datasets with small high-order entropy.
2022 Data Compression Conference (DCC)
File in questo prodotto:
File Dimensione Formato  
High_Order_LZ77.pdf

non disponibili

Tipologia: Documento in Pre-print
Licenza: Accesso chiuso-personale
Dimensione 428.36 kB
Formato Adobe PDF
428.36 kB Adobe PDF   Visualizza/Apri

I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/10278/5004035
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact