Automatic Classification of Queries by Expected Retrieval Performance - Avignon Université
Conference paper, Year: 2005

Automatic Classification of Queries by Expected Retrieval Performance

Abstract

This paper presents a method for automatically predicting the average relevance of the document set that a retrieval system returns in response to a query. For a given retrieval system and document collection, prediction is framed as query classification. Two classes of queries are defined: easy and hard. The split point between the two classes is the median value of the average precision over the query collection. The paper proposes several classifiers that select useful features from a set of candidates and use them to predict the class of a query. Classifiers are trained on the results of the systems that took part in the TREC 8 campaign. Because the number of available queries is limited, training and testing are performed with leave-one-out and 10-fold cross-validation. Two types of classifiers, namely decision trees and support vector machines, provide particularly interesting results for a number of systems. Fairly high classification accuracy is obtained on the TREC 8 data (more than 80% correct predictions in some settings).
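The median split and the leave-one-out protocol described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The average-precision values and the two features (query length and mean IDF) are invented for illustration. A one-level decision tree (a decision stump) stands in for the decision trees and support vector machines named in the abstract. The function names are hypothetical.

```python
from statistics import median

def label_queries(avg_precision):
    """Label each query 'easy' or 'hard' by splitting at the median average precision."""
    split = median(avg_precision.values())
    return {q: ("easy" if ap > split else "hard") for q, ap in avg_precision.items()}

def train_stump(rows, labels):
    """Fit a one-level decision tree: choose the feature, threshold and
    polarity that make the fewest errors on the training data."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for above in ("easy", "hard"):
                below = "hard" if above == "easy" else "easy"
                errors = sum((above if r[f] > t else below) != y
                             for r, y in zip(rows, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, above, below)
    _, f, t, above, below = best
    return lambda r: above if r[f] > t else below

def leave_one_out_accuracy(rows, labels):
    """Hold out each query in turn, train on the rest, and score the held-out prediction."""
    correct = 0
    for i in range(len(rows)):
        predict = train_stump(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        correct += predict(rows[i]) == labels[i]
    return correct / len(rows)

# Invented toy data (assumption): per-query average precision for one system.
avg_precision = {"q1": 0.05, "q2": 0.12, "q3": 0.31, "q4": 0.44, "q5": 0.58, "q6": 0.73}
# Invented candidate features (assumption): [query length, mean IDF of query terms].
features = {"q1": [2, 1.1], "q2": [3, 1.4], "q3": [2, 1.9],
            "q4": [4, 2.6], "q5": [3, 3.0], "q6": [5, 3.4]}

classes = label_queries(avg_precision)
queries = sorted(avg_precision)
rows = [features[q] for q in queries]
labels = [classes[q] for q in queries]
accuracy = leave_one_out_accuracy(rows, labels)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Because the split is at the median, the two classes are balanced by construction, so a classifier that always predicts one class scores about 50%. That is the baseline against which the reported accuracy should be read.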
Main file
sigir2005-qp (1).pdf (460.88 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02171688, version 1 (03-07-2019)

Identifiers

  • HAL Id: hal-02171688, version 1

Cite

Jens Grivolla, Pierre Jourlin, Renato de Mori. Automatic Classification of Queries by Expected Retrieval Performance. ACM SIGIR 2005 Workshop on Predicting Query Difficulty – Methods and Applications (2005), Aug 2005, Salvador, Brazil. ⟨hal-02171688⟩

Collections

UNIV-AVIGNON LIA
45 Views
29 Downloads
