Early Exit Strategies for Learning-to-Rank Cascades

Busolin, Francesco; Lucchese, Claudio; Nardini, Franco Maria; Orlando, Salvatore; Perego, Raffaele; Trani, Salvatore

2023

Abstract

The ranking pipelines of modern search platforms commonly exploit complex machine-learned models, which significantly affect query response time. In this paper, we discuss several techniques to speed up the document scoring process based on large ensembles of decision trees without hindering ranking quality. Specifically, we study the problem of document early exit within the framework of a cascading ranker made of three components: (i) an efficient but sub-optimal ranking stage; (ii) a pruner that exploits signals from the previous component to force the early exit of documents classified as irrelevant; (iii) a final high-quality component that finely ranks the documents surviving the previous phase. To maximize speedup while preserving effectiveness, we aim to increase the accuracy of the pruner in identifying irrelevant documents without forcing the early exit of documents that are likely to be ranked among the final top-k results. We propose an in-depth study of heuristic and machine-learned techniques for designing the pruner. While the heuristic technique exploits only the score/ranking information supplied by the first sub-optimal ranker, the machine-learned solution, named LEAR, exploits these signals as additional features alongside those representing query-document pairs. Moreover, we study alternative solutions to implement the first ranker: either a small prefix of the original forest or an auxiliary machine-learned ranker explicitly trained for the purpose. We evaluate our techniques with reproducible experiments conducted using publicly available datasets and state-of-the-art competitors. The experiments confirm that our early-exit strategies achieve speedups ranging from 3× to 10× without statistically significant differences in effectiveness.
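The three-component cascade described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and it is not LEAR: the function name `cascade_rank`, the `margin` parameter, and the pruning rule (keep a document only if its first-stage score is within `margin` of the k-th best first-stage score) are assumptions introduced here purely for illustration of a score-based heuristic pruner.

```python
from typing import Callable, List, Sequence, Tuple

Doc = Sequence[float]  # feature vector of one query-document pair


def cascade_rank(
    docs: List[Doc],
    first_stage: Callable[[Doc], float],
    final_stage: Callable[[Doc], float],
    k: int,
    margin: float,
) -> List[Tuple[int, float]]:
    """Return (document index, final score) for the top-k documents of one query."""
    if not docs or k <= 0:
        return []

    # Component (i): cheap, sub-optimal scoring of every candidate.
    cheap = [first_stage(d) for d in docs]

    # Component (ii): hypothetical heuristic pruner (an assumption, not the
    # paper's rule). A document survives only if its cheap score is within
    # `margin` of the k-th best cheap score; all others exit early.
    kth_best = sorted(cheap, reverse=True)[min(k, len(cheap)) - 1]
    survivors = [i for i, s in enumerate(cheap) if s >= kth_best - margin]

    # Component (iii): expensive scoring applied to the survivors only.
    scored = [(i, final_stage(docs[i])) for i in survivors]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In this sketch a larger `margin` prunes fewer documents, which lowers the risk of discarding a document that belongs in the final top-k at the cost of more calls to the expensive ranker.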

Publication year: 2023
Journal title: IEEE ACCESS
Volume: Online
DOI: https://dx.doi.org/10.1109/ACCESS.2023.3331088
Appears in types: 2.1 Journal article

Files in this item:

File	Size	Format
Early_Exit_Strategies_for_Learning-to-Rank_Cascades.pdf (open access; type: publisher's version; license: Creative Commons)	1.84 MB	Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5043380
