Measuring the Reusability of Test Collections

Ben Carterette, Evgeniy Gabrilovich, Vanja Josifovski, Donald Metzler

While test collection construction is a time-consuming and expensive process, the true cost is amortized by reusing the collection over hundreds or thousands of experiments. Some of these experiments may involve systems that retrieve documents not judged during the initial construction phase, and some of these systems may be "hard" to evaluate: depending on which judgments are missing and which judged documents were retrieved, the experimenter's confidence in an evaluation could potentially be very low. We propose two methods for quantifying the reusability of a test collection for evaluating new systems. The proposed methods provide simple yet highly effective tests for determining whether an existing set of judgments is useful for evaluating a new system. Empirical evaluations using TREC datasets confirm the usefulness of our proposed reusability measures. In particular, we show that our methods can reliably estimate confidence intervals that are indicative of collection reusability.