Star Quality: Aggregating Reviews to Rank Products and Merchants
Given a set of reviews of products or merchants from a wide range of authors and several review websites, how can we measure the true quality of the product or merchant? How do we remove the bias of individual authors or sources? How do we compare reviews obtained from different websites, where ratings may be on different scales (1-5 stars, A/B/C, etc.)? How do we filter out unreliable reviews to use only the ones with "star quality"? Taking these considerations into account, we analyze data sets from a variety of different review sites (the first paper, to our knowledge, to do this). These data sets include 8 million product reviews and 1.5 million merchant reviews. We explore statistic- and heuristic-based models for estimating the true quality of a product or merchant, and compare the performance of these estimators on the task of ranking pairs of objects. We also apply the same models to the task of using Netflix ratings data to rank pairs of movies, and discover that the performance of the different models is surprisingly similar on this data set.
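To make the questions above concrete, the following is a minimal sketch of one simple baseline, not the models evaluated in this paper. It assumes ratings arrive either as 1-5 stars or as A/B/C letter grades, maps both onto [0, 1], subtracts each source's mean offset as a crude bias correction, and scores an estimator by the fraction of labeled pairs it orders correctly. The names `LETTER_SCALE`, `to_unit_scale`, `estimate_quality`, and `pairwise_accuracy`, the scale labels, and the letter-to-number mapping are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical letter-grade mapping; real sites may define grades differently.
LETTER_SCALE = {"A": 1.0, "B": 0.5, "C": 0.0}


def to_unit_scale(rating, scale):
    """Map a raw rating onto [0, 1]; `scale` is "stars_1_5" or "letter"."""
    if scale == "stars_1_5":
        return (rating - 1) / 4
    if scale == "letter":
        return LETTER_SCALE[rating]
    raise ValueError(f"unknown scale: {scale}")


def estimate_quality(reviews):
    """Estimate per-item quality from (item, source, scale, rating) tuples.

    Each rating is normalized, then shifted by its source's mean offset from
    the overall mean (a crude bias correction), then averaged per item.
    """
    normalized = [
        (item, source, to_unit_scale(rating, scale))
        for item, source, scale, rating in reviews
    ]
    overall = mean(value for _, _, value in normalized)

    by_source = defaultdict(list)
    for _, source, value in normalized:
        by_source[source].append(value)
    offset = {s: mean(vals) - overall for s, vals in by_source.items()}

    by_item = defaultdict(list)
    for item, source, value in normalized:
        by_item[item].append(value - offset[source])
    return {item: mean(vals) for item, vals in by_item.items()}


def pairwise_accuracy(scores, labeled_pairs):
    """Fraction of (better, worse) pairs that `scores` orders correctly."""
    correct = sum(scores[better] > scores[worse] for better, worse in labeled_pairs)
    return correct / len(labeled_pairs)


# Toy data, invented for illustration only.
reviews = [
    ("x", "siteA", "stars_1_5", 5),
    ("y", "siteA", "stars_1_5", 3),
    ("x", "siteB", "letter", "A"),
    ("y", "siteB", "letter", "C"),
]
scores = estimate_quality(reviews)
accuracy = pairwise_accuracy(scores, [("x", "y")])
```

Subtracting a per-source mean is a deliberate simplification: it cannot distinguish a harsh source from a source that happens to review worse items, which is one reason more careful estimators are worth comparing.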