
Early Exit Strategies for Approximate k-NN Search in Dense Retrieval

Francesco Busolin; Claudio Lucchese; Salvatore Orlando
2024-01-01

Abstract

Learned dense representations are a popular family of techniques for encoding queries and documents using high-dimensional embeddings, which enable retrieval by performing approximate 𝑘-nearest-neighbors search (A-𝑘NN). A popular technique for making A-𝑘NN search efficient is based on a two-level index, where the embeddings of documents are clustered offline and, at query processing time, a fixed number 𝑁 of clusters closest to the query is visited exhaustively to compute the result set. In this paper, we build upon the state of the art for early-exit A-𝑘NN and propose an unsupervised method based on the notion of patience, which can reach competitive effectiveness with large efficiency gains. Moreover, we discuss a cascade approach where we first identify queries that find their nearest neighbor within the closest 𝜏 ≪ 𝑁 clusters, and then decide how many more to visit based on our patience approach or other state-of-the-art strategies. Reproducible experiments employing state-of-the-art dense retrieval models and publicly available resources show that our techniques improve A-𝑘NN efficiency with up to 5× speedups while incurring negligible effectiveness losses. All the code used is available at https://github.com/francescobusolin/faiss_pEE.
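The two-level search with patience-based early exit described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (which builds on FAISS); it is a minimal, self-contained NumPy illustration in which `patience_search`, its parameters (`n_probe` for the cluster budget 𝑁, `patience` for the stopping threshold), and the toy index layout are all hypothetical names chosen for this example.

```python
import numpy as np

def patience_search(query, centroids, clusters, k=10, n_probe=64, patience=4):
    """Sketch of patience-based early exit over a two-level (IVF-style) index.

    Clusters are visited in order of centroid distance to the query; if the
    top-k result set does not improve for `patience` consecutive clusters,
    the search stops early instead of exhausting all `n_probe` clusters.
    """
    # First level: rank clusters by centroid distance, keep the closest n_probe.
    order = np.argsort(np.linalg.norm(centroids - query, axis=1))[:n_probe]
    best = []   # current top-k as (distance, doc_id), kept sorted, size <= k
    stale = 0   # consecutive clusters visited without improving the top-k
    for c in order:
        improved = False
        # Second level: exhaustively score the documents of this cluster.
        for doc_id, emb in clusters[c]:
            d = np.linalg.norm(emb - query)
            if len(best) < k or d < best[-1][0]:
                best.append((d, doc_id))
                best.sort(key=lambda t: t[0])
                best = best[:k]
                improved = True
        stale = 0 if improved else stale + 1
        if stale >= patience:  # early exit: the result set has converged
            break
    return [doc_id for _, doc_id in best]
```

The cascade variant sketched in the abstract would wrap this routine: first probe only the 𝜏 closest clusters, and only for queries whose results still look unstable apply the patience rule (or another strategy) to decide how many further clusters to visit.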
CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
Files in this record:

File: 3627673.3679903.pdf (Adobe PDF, 994.38 kB)
Access: open access
Type: publisher's version
License: Creative Commons

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5081787