Fast feature selection for learning to rank

LUCCHESE, Claudio;
2016-01-01

Abstract

An emerging research area named Learning-to-Rank (LtR) has shown that effective solutions to the ranking problem can leverage machine learning techniques applied to a large set of features capturing the relevance of a candidate document for the user query. Large-scale search systems must, however, answer user queries very quickly, and the computation of the features for candidate documents must comply with strict back-end latency constraints. The number of features thus cannot grow beyond a given limit, and Feature Selection (FS) techniques have to be exploited to find a subset of features that both meets latency requirements and leads to high effectiveness of the trained models. In this paper, we propose three new algorithms for FS specifically designed for the LtR context, where hundreds of continuous or categorical features can be involved. We present a comprehensive experimental analysis conducted on publicly available LtR datasets, and we show that the proposed strategies outperform a well-known state-of-the-art competitor.
2016
ICTIR 2016 - Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/3692224