On the number of equal-letter runs of the bijective Burrows-Wheeler transform
Cenzato, Davide;
2024-01-01
Abstract
The Bijective Burrows-Wheeler Transform (BBWT) is a variant of the famous BWT [Burrows and Wheeler, 1994]. The BBWT was introduced by Gil and Scott in 2012, and is based on the extended BWT of Mantaci et al. [TCS 2007] and on the Lyndon factorization of the input string. In the original paper, the compression achieved with the BBWT was shown to be competitive with that of the BWT, and the BBWT has been gaining interest in recent years. In this work, we present the first study of the number r_B of equal-letter runs of the BBWT, which is a measure of its compression power. We exhibit an infinite family of strings on which r_B of the string and r_B of its reverse differ by a multiplicative factor of Θ(log n), where n is the length of the string. We also give several theoretical results on the BBWT, including a characterization of the binary strings for which the BBWT has two runs. Finally, we present experimental results and statistics on r_B(s) and r_B(s^rev), as well as on the number of Lyndon factors in the Lyndon factorization of s and of s^rev.
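
To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation, of how the BBWT and its run count r_B can be computed directly from the definition: factorize the string into Lyndon words, collect all rotations of all factors, sort them by their infinite repetitions (omega-order), and read off the last characters. The function names `lyndon_factors`, `bbwt` and `count_runs` are hypothetical, and the quadratic-space sort key is an assumption chosen for brevity rather than efficiency.

```python
def lyndon_factors(s):
    """Lyndon factorization of s via Duval's algorithm (linear time)."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i, i + 1
        # Extend the current run of repeated Lyndon words as far as possible.
        while k < n and s[j] <= s[k]:
            j = i if s[j] < s[k] else j + 1
            k += 1
        # Emit every complete copy of the Lyndon word of length k - j.
        while i <= j:
            factors.append(s[i:i + k - j])
            i += k - j
    return factors


def bbwt(s):
    """Naive BBWT: last characters of the rotations of all Lyndon factors,
    sorted in omega-order (comparison of infinite repetitions)."""
    n = len(s)
    rotations = []
    for w in lyndon_factors(s):
        for r in range(len(w)):
            rotations.append(w[r:] + w[:r])

    def omega_key(u):
        # Assumption of this sketch: a prefix of length 2n of u^omega is
        # enough to order any two rotations, since two distinct periodic
        # words with periods p and q differ within their first p + q letters.
        return (u * (2 * n // len(u) + 1))[:2 * n]

    rotations.sort(key=omega_key)
    return "".join(u[-1] for u in rotations)


def count_runs(t):
    """Number of maximal equal-letter runs in t."""
    return sum(1 for i, c in enumerate(t) if i == 0 or c != t[i - 1])


if __name__ == "__main__":
    for s in ("banana", "banana"[::-1]):
        t = bbwt(s)
        print(s, lyndon_factors(s), t, count_runs(t))
```

On the toy input `banana` the Lyndon factors are `b`, `an`, `an`, `a`, while its reverse `ananab` factorizes as `an`, `an`, `ab`. The two run counts already differ (4 versus 3), which is a small-scale illustration of the asymmetry between a string and its reverse that the paper studies.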