On Convergence Properties of the EM Algorithm for Gaussian Mixtures

Lei Xu and Michael I. Jordan. Neural Computation. DOI: 10.1162/neco.1996.8.1.129

Citations: 239
We build up the mathematical connection between the Expectation-Maximization (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite gaussian mixtures. We show that the EM step in parameter space is obtained from the gradient via a projection matrix P, and we provide an explicit expression for the matrix. We then analyze the convergence of EM in terms of special properties of P and provide new results analyzing the effect that P has on the likelihood surface. Based on these mathematical results, we present a comparative discussion of the advantages and disadvantages of EM and other algorithms for the learning of gaussian mixture models.
Journal: Neural Computation - NECO, vol. 8, no. 1, pp. 129-151, 1996
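
The central relation in the abstract, that each EM step equals the log-likelihood gradient premultiplied by a projection matrix P, can be checked numerically. For the component means, the paper gives the block P_{μ_j} = Σ_j / Σ_t h_j(t), where h_j(t) is the posterior responsibility of component j for data point x_t. Below is a minimal sketch of that check (Python with NumPy assumed; the toy data, parameter values, and the helper name `responsibilities` are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated 2-D clusters (illustrative only).
N, d, K = 500, 2, 2
X = np.vstack([rng.normal(-2.0, 1.0, size=(N // 2, d)),
               rng.normal(+2.0, 1.0, size=(N // 2, d))])

# Current parameter guess for the mixture.
mu = np.array([[-1.0, 0.0], [1.0, 0.0]])
Sigma = np.stack([np.eye(d), np.eye(d)])
alpha = np.array([0.5, 0.5])

def responsibilities(X, mu, Sigma, alpha):
    """E-step: posterior probability h_j(t) that point x_t came from component j."""
    dens = np.empty((X.shape[0], K))
    for j in range(K):
        diff = X - mu[j]
        inv = np.linalg.inv(Sigma[j])
        quad = np.einsum('ni,ij,nj->n', diff, inv, diff)
        norm = np.sqrt(((2.0 * np.pi) ** d) * np.linalg.det(Sigma[j]))
        dens[:, j] = alpha[j] * np.exp(-0.5 * quad) / norm
    return dens / dens.sum(axis=1, keepdims=True)

h = responsibilities(X, mu, Sigma, alpha)

for j in range(K):
    # M-step (EM) update of the j-th mean: responsibility-weighted average of the data.
    mu_em = (h[:, j:j + 1] * X).sum(axis=0) / h[:, j].sum()

    # Gradient of the log-likelihood w.r.t. mu_j: sum_t h_j(t) * Sigma_j^{-1} (x_t - mu_j).
    grad = np.linalg.inv(Sigma[j]) @ (h[:, j:j + 1] * (X - mu[j])).sum(axis=0)

    # Projection matrix for the means: P_j = Sigma_j / sum_t h_j(t).
    P = Sigma[j] / h[:, j].sum()

    # The EM step should coincide with P times the gradient.
    print(np.allclose(mu_em - mu[j], P @ grad))  # expected: True
```

Since this P is positive definite, the EM step keeps a positive inner product with the gradient, which is the property the paper builds its convergence analysis on.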
Citation contexts from citing publications:
    • ...Recent theoretical investigations on the asymptotic convergence properties of the EM algorithms for gaussian mixtures and the mixtures of densities from exponential families (Xu & Jordan, 1996; Xu, 1997; Ma, Xu, & Jordan, 2000; Ma & Xu, 2005; Ma & Fu, 2005) provide a new view on the EM algorithm for ME...

    Yan Yang et al. Asymptotic Convergence Properties of the EM Algorithm for Mixture of E...

    • ...Parameter learning is usually implemented under the maximum likelihood principle by an expectation-maximization (EM) algorithm [2,6,7]...
    • ...The first level is the inference of the component labels y. The second is the estimation of the parameters Θ given a fixed k, which is usually implemented by maximizing the likelihood q(X_N | Θ) = ∏_{t=1}^{N} q(x_t | Θ) with the help of the following well-known EM algorithm [2,6,7]:...

    Lei Shi et al. Learning Gaussian mixture with automatic model selection: A comparativ...

    • ...It is shown in [6] that the EM algorithms are first-order optimization algorithms, which provides a theoretical explanation for their slow convergence speed...

    Xudong Ma. Large-Scale Clustering Based on Data Compression

    • ...The computational complexity for convex-NMF is of order n²p + m(2n²k + nk²) for Eq. (25)...
    • ...proved to be non-increasing, J^(t+1) ≤ J^(t). Following Xu & Jordan (25), we expand...
    • ...have a first-order convergence rate, which is the same as the EM algorithm (25)...
    • ...presented in Eqs. (26) and (25) depend on X^T X only...

    Chris H. Q. Ding et al. Convex and Semi-Nonnegative Matrix Factorizations

    • ...For the example in Fig. 1(b), the task of getting an optimal classifier p(j|x) is a subtask of estimating the parameters Θ_j = {μ_j, Σ_j, α_j} if they are unknown too, which is usually referred to as parameter learning [7,8]...
    • ...For the Gaussian mixture introduced in Fig. 1(b), we start from the standard EM algorithm for ML learning [7,8], which is used here as a benchmark for better insight into the Bayesian Ying-Yang harmony learning and other related algorithms...

    Lei Xu. Bayesian Ying-Yang system, best harmony learning, and five action circ...
