
Linguistically-based Reranking of Google’s Snippets with GReG

Delmonte, Rodolfo; Tripodi, Rocco
2010-01-01

Abstract

We present an experiment evaluating the contribution of a system called GReG for reranking the snippets returned by Google's search engine in the 10 best links presented to the user, captured via Google's API. The evaluation aims at establishing whether the introduction of deep linguistic information can improve Google's accuracy, or whether the opposite holds, as maintained by the majority of researchers in Information Retrieval who use a Bag of Words approach. We used 900 questions and answers taken from the TREC 8 and 9 competitions and executed three different types of evaluation: one without any linguistic aid; a second with the contribution of tagging and syntactic constituency; and a third run with what we call a Partial Logical Form. Even though GReG is still work in progress, it is possible to draw clear-cut conclusions: adding linguistic information to the process of evaluating the best snippet that can answer a question greatly improves performance. In another experiment we used the actual texts associated with the Q/A pairs, distributed by one of TREC's participants, and obtained even higher accuracy.
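To make the contrast concrete, the sketch below illustrates the Bag of Words baseline the abstract refers to: snippets are reranked purely by lexical overlap with the question, with no tagging, constituency, or logical-form information. This is a minimal illustrative example, not GReG's actual implementation; the question, snippets, and function names are hypothetical.

```python
import re

def bag_of_words(text):
    """Reduce text to a set of lowercase word tokens (a Bag of Words)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rerank_snippets(question, snippets):
    """Order snippets by how many question words they share.

    This is the purely lexical baseline; a linguistically informed
    reranker would additionally score syntactic and semantic matches.
    """
    q_words = bag_of_words(question)
    scored = [(len(q_words & bag_of_words(s)), s) for s in snippets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored]

# Hypothetical question and snippets, standing in for Google's 10 best links.
snippets = [
    "Paris is the capital of France.",
    "The Eiffel Tower was completed in 1889 in Paris.",
    "Gustave Eiffel designed many iron structures.",
]
ranked = rerank_snippets("When was the Eiffel Tower completed?", snippets)
print(ranked[0])  # the snippet with the highest word overlap
```

A purely lexical scheme like this can be fooled by snippets that repeat question words without answering the question, which is the kind of failure deeper linguistic analysis is meant to address.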
Proceedings of DART2010, 4th International Workshop on Distributed Agent-based Retrieval Tools
Files in this record:
File: proceedingsdart2010.pdf
Type: Post-print
License: Open access (no restrictions)
Size: 5.32 MB
Format: Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/29841