Indexing Finite-State Automata Using Forward-Stable Partitions

Becker, Ruben; Kim, Sung-Hwan; Prezza, Nicola; Tosoni, Carlo
2024-01-01

Abstract

An index on a finite-state automaton is a data structure that locates specific patterns on the automaton’s paths and, consequently, in the regular language accepted by the automaton. Cotumaccio and Prezza [SODA ’21] introduced a data structure that solves pattern matching queries on automata, generalizing the FM-index for strings of Ferragina and Manzini [FOCS ’00]. The efficiency of their index depends on the width of a particular partial order of the automaton’s states: the smaller the width of the partial order, the faster the index. However, computing the partial order of minimal width is NP-hard. Cotumaccio [DCC ’22] mitigated this problem by relaxing the conditions on the partial order, allowing it to be a partial preorder. Under this relaxation there exists a unique partial preorder of minimal width, and it can be computed in polynomial time. In this paper, we present a new class of partial preorders and show that they have the following useful properties: (i) they can be computed in polynomial time, (ii) their width is never larger than the width of Cotumaccio’s preorders, and (iii) there exist infinite classes of automata on which the width of Cotumaccio’s preorder is linearly larger than the width of our preorder.
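To make the notion of width concrete, the following is a minimal sketch, not taken from the paper, that computes the width of a strict partial order on states numbered 0..n-1. It assumes the standard definition of width as the size of a largest antichain and uses Dilworth's theorem: the width equals n minus the size of a maximum matching in the bipartite graph of the transitively closed order. The function names `transitive_closure` and `width` are illustrative, and the sketch does not cover preorders, where equivalent states need separate handling.

```python
from itertools import product


def transitive_closure(n, pairs):
    """Return an n x n boolean matrix `less` with less[u][v] True iff u < v."""
    less = [[False] * n for _ in range(n)]
    for u, v in pairs:
        less[u][v] = True
    # Warshall's algorithm: product() varies k slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        if less[i][k] and less[k][j]:
            less[i][j] = True
    return less


def width(n, pairs):
    """Width (largest antichain) of the strict partial order generated by `pairs`.

    Assumption: `pairs` describes an acyclic relation, i.e. a strict partial order.
    """
    less = transitive_closure(n, pairs)
    match_right = [-1] * n  # match_right[v] = u means u is chained directly before v

    def try_augment(u, seen):
        # Standard augmenting-path search for bipartite matching.
        for v in range(n):
            if less[u][v] and not seen[v]:
                seen[v] = True
                if match_right[v] == -1 or try_augment(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    matching = sum(try_augment(u, [False] * n) for u in range(n))
    # Minimum chain cover = n - maximum matching = largest antichain (Dilworth).
    return n - matching
```

For instance, a total order on four states has width 1, four pairwise incomparable states have width 4, and a diamond-shaped order (0 below 1 and 2, both below 3) has width 2.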
2024
Proceedings of the 31st International Symposium on String Processing and Information Retrieval (SPIRE 2024)
Files in this item:
File: indexing_forward_stable.pdf (not available)
Description: full text
Type: Pre-print document
License: Publisher's copyright
Size: 280.81 kB
Format: Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5076321