Evaluation Methods for Topic Models

Authors: Hanna M. Wallach, Iain Murray, Ruslan Salakhutdinov, David M. Mimno
DOI: 10.1145/1553374.1553515

Citations: 15
A natural evaluation metric for statistical topic models is the probability of held-out documents given a trained model. While exact computation of this probability is intractable, several estimators for this probability have been used in the topic modeling literature, including the harmonic mean method and empirical likelihood method. In this paper, we demonstrate experimentally that commonly-used methods are unlikely to accurately estimate the probability of held-out documents, and propose two alternative methods that are both accurate and efficient. In this paper we consider only the simplest topic model, latent Dirichlet allocation (LDA), and compare a number of methods for estimating the probability of held-out documents given a trained model. Most of the methods presented, however, are applicable to more complicated topic models. In addition to comparing evaluation methods that are currently used in the topic modeling literature, we propose several alternative methods. We present empirical results on synthetic and real-world data sets showing that the currently-used estimators are less accurate and have higher variance than the proposed new estimators.
Conference: International Conference on Machine Learning - ICML , pp. 1105-1112, 2009
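
The harmonic mean method named in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a Beta-Bernoulli model (chosen only because its marginal likelihood has a closed form), and the function names are hypothetical. It estimates the log marginal likelihood from posterior samples via the harmonic mean of the likelihoods and lets you compare the result with the exact value.

```python
import math
import random


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def harmonic_mean_log_evidence(log_liks):
    """Harmonic mean estimate of log p(x).

    log_liks holds log p(x | theta_s) for samples theta_s drawn from the
    posterior p(theta | x). The estimate is log(S / sum_s 1 / p(x | theta_s)).
    """
    neg = [-v for v in log_liks]
    return math.log(len(log_liks)) - log_sum_exp(neg)


def bernoulli_log_lik(theta, k, n):
    """Log-likelihood of one specific sequence with k successes in n trials."""
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def exact_log_evidence(k, n, a, b):
    """Exact log p(x) under a Beta(a, b) prior (toy model assumption)."""
    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)
    return log_beta(a + k, b + n - k) - log_beta(a, b)


if __name__ == "__main__":
    rng = random.Random(0)
    k, n, a, b = 7, 20, 1.0, 1.0  # illustrative values, not from the paper
    # Posterior of the toy model is Beta(a + k, b + n - k).
    samples = [rng.betavariate(a + k, b + n - k) for _ in range(5000)]
    log_liks = [bernoulli_log_lik(t, k, n) for t in samples]
    print("harmonic mean:", harmonic_mean_log_evidence(log_liks))
    print("exact:        ", exact_log_evidence(k, n, a, b))
```

Rerunning the script with different seeds gives a rough feel for how much the estimate moves between runs. In a one-parameter model like this the estimator tends to behave far better than it would in a high-dimensional topic model, so the toy comparison should not be read as evidence about LDA.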
    • ...Efficiently and reliably computing the marginal likelihood in topic models is an active research area [37], [38] due to the intractable sum over correlated y in (17)...
    • ...We take the view of [37], [38] and define an importance sampling approximation to the marginal likelihood...
    • ...The new importance sampling proposal in (19) results in much faster and more accurate estimation of the marginal likelihood (17) than the standard approach [39], [37] of using posterior Gibbs samples...
    • ...The latter results in the harmonic mean approximation for the likelihood [39], [37] and suffers from 1) requiring a (slow) Gibbs sampler at test time (prohibiting online computation) and 2) the high variance of the harmonic mean estimator (making classification inaccurate)...
    • ...We compared this against the classification performance of WS-JTM using the commonly used harmonic mean likelihood approximation [37], [39]...

    Timothy M. Hospedales et al. Identifying Rare and Subtle Behaviors: A Weakly Supervised Joint Topic...

    • ...However, Wallach et al. have pointed out in [3] that when sampling from high-dimensional distributions the variance of empirical likelihood may be very large...

    Shaoze Lei et al. Topic model with constrainted word burstiness intensities

    • ...Evaluation Plan. Evaluating topic models is a difficult task, due to a lack of ground truth [16]...

    Stephen W. Thomas. Mining software repositories using topic models

    • ...We point out that reliably computing the marginal likelihood in topic models is an active research area [16, 17] due to the intractable sum over correlated y in Eq. (5)...
    • ...We take the view of [16, 17] and define an importance sampling approximation p(x ∗ |c) ≈ 1...
    • ...Note that this importance sampling proposal (Eq. (7)) is much faster and more accurate than the standard approach [16, 18] of using posterior Gibbs samples in Eq. (6), which results in the unstable harmonic mean approximation to the likelihood – an estimator which has huge variance in theory and in practice [16]...

    Jian Li et al. Learning Rare Behaviours

    • ...Wallach et al. [5] gives a evaluation method of this generative ability using Latent Dirichlet Allocation (LDA)...

    Yonghui Wu et al. A comparative study of topic models for topic clustering of Chinese we...
