
Linguistically-based Reranking of Google’s Snippets with GReG

Delmonte, Rodolfo; Tripodi, Rocco
2010-01-01

Abstract

We present an experiment evaluating the contribution of a system called GReG for reranking the snippets returned by Google's search engine in the 10 best links presented to the user, captured via Google's API. The evaluation aims at establishing whether the introduction of deep linguistic information can improve Google's accuracy, or whether the opposite holds, as maintained by the majority of researchers in Information Retrieval who use a Bag of Words approach. We used 900 questions and answers taken from the TREC 8 and 9 competitions and executed three different types of evaluation: one without any linguistic aid; a second with the contribution of tagging and syntactic constituency; and a third run with what we call a Partial Logical Form. Even though GReG is still work in progress, it is possible to draw clear-cut conclusions: adding linguistic information to the process of evaluating the best snippet that can answer a question greatly improves performance. In another experiment we used the actual texts associated with the Q/A pairs, distributed by one of TREC's participants, and obtained even higher accuracy.
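To make the contrast concrete, the sketch below illustrates the Bag of Words baseline the abstract refers to: snippets are reranked purely by lexical overlap with the question, with no tagging, constituency, or logical-form information. This is a minimal illustrative example, not GReG's actual implementation; the question, snippets, and function names are hypothetical.

```python
import re

def bag_of_words(text):
    """Reduce text to a set of lowercase word tokens (a Bag of Words)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rerank_snippets(question, snippets):
    """Order snippets by how many question words they share.

    This is the purely lexical baseline; a linguistically informed
    reranker would additionally score syntactic and semantic matches.
    """
    q_words = bag_of_words(question)
    scored = [(len(q_words & bag_of_words(s)), s) for s in snippets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored]

# Hypothetical question and snippets, standing in for Google's 10 best links.
snippets = [
    "Paris is the capital of France.",
    "The Eiffel Tower was completed in 1889 in Paris.",
    "Gustave Eiffel designed many iron structures.",
]
ranked = rerank_snippets("When was the Eiffel Tower completed?", snippets)
print(ranked[0])  # the snippet with the highest word overlap
```

A purely lexical scheme like this can be fooled by snippets that repeat question words without answering the question, which is the kind of failure deeper linguistic analysis is meant to address.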
Proceedings of DART2010, 4th International Workshop on Distributed Agent-based Retrieval Tools
Files in this record:
File: proceedingsdart2010.pdf
Type: Post-print
License: Open access (no restrictions)
Size: 5.32 MB
Format: Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/29841