Proteome coverage prediction with infinite Markov models

Proteome coverage prediction with infinite Markov models,10.1093/bioinformatics/btp233,Bioinformatics/computer Applications in The Biosciences,Manfred

Citations: 5
Motivation: Liquid chromatography tandem mass spectrometry (LC-MS/MS) is the predominant method to comprehensively characterize complex protein mixtures such as samples from prefractionated or complete proteomes. In order to maximize proteome coverage for the studied sample, i.e. identify as many traceable proteins as possible, LC-MS/MS experiments are typically repeated extensively and the results combined. Proteome coverage prediction is the task of estimating the number of peptide discoveries of future LC-MS/MS experiments. Proteome coverage prediction is important to enhance the design of efficient proteomics studies. To date, there does not exist any method to reliably estimate the increase of proteome coverage at an early stage.

Results: We propose an extended infinite Markov model DiriSim to extrapolate the progression of proteome coverage based on a small number of already performed LC-MS/MS experiments. The method explicitly accounts for the uncertainty of peptide identifications. We tested DiriSim on a set of 37 LC-MS/MS experiments of a complete proteome sample and demonstrated that DiriSim correctly predicts the coverage progression already from a small subset of experiments. The predicted progression enabled us to specify maximal coverage for the test sample. We demonstrated that quality requirements on the final proteome map impose an upper bound on the number of useful experiment repetitions and limit the achievable proteome coverage.
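The idea of extrapolating coverage from a few completed experiments can be illustrated with a deliberately simplified sketch. The code below is not DiriSim: it uses a plain Dirichlet process (Pólya urn) with no Markov dependence on the preceding peptide and no handling of false identifications. The function names, the `observed_counts` input, and the concentration parameter `alpha` are assumptions made for this illustration only; in practice `alpha` would have to be estimated from the data.

```python
import random


def simulate_coverage(observed_counts, n_future, alpha, n_runs=200, seed=0):
    """Monte Carlo estimate of the number of distinct peptides after each
    future identification, under a plain Polya urn (illustrative only)."""
    rng = random.Random(seed)
    n_obs = sum(observed_counts.values())  # identifications seen so far
    k_obs = len(observed_counts)           # distinct peptides seen so far
    totals = [0.0] * (n_future + 1)
    for _ in range(n_runs):
        n, k = n_obs, k_obs
        totals[0] += k
        for t in range(1, n_future + 1):
            # A new peptide appears with probability alpha / (n + alpha);
            # otherwise an already discovered peptide is observed again.
            if rng.random() < alpha / (n + alpha):
                k += 1
            n += 1
            totals[t] += k
    return [total / n_runs for total in totals]


def expected_coverage(n_obs, k_obs, n_future, alpha):
    """Closed-form expectation of the same urn, for comparison."""
    curve = [float(k_obs)]
    for i in range(n_future):
        curve.append(curve[-1] + alpha / (n_obs + i + alpha))
    return curve


if __name__ == "__main__":
    # Hypothetical spectral counts for three peptides.
    counts = {"PEPTIDEA": 3, "PEPTIDEB": 1, "PEPTIDEC": 2}
    simulated = simulate_coverage(counts, n_future=50, alpha=5.0)
    exact = expected_coverage(6, 3, 50, 5.0)
    print(round(simulated[-1], 2), round(exact[-1], 2))
```

Under this toy model the chance of a new discovery shrinks as identifications accumulate, so the predicted coverage curve flattens, which is the qualitative behaviour the abstract describes for repeated experiments.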
Journal: Bioinformatics / Computer Applications in the Biosciences (BIOINFORMATICS), vol. 25, no. 12, pp. i154-i160, 2009
    • ...Schmidt et al, 2008) together with charge state and gas phase fractionation...

    Martin Beck et al. The quantitative proteome of a human cell line

    • ...Recently, an infinite Markov model based on Dirichlet processes [5] has been proposed to characterize LC-MS/MS experiments and for the first time to predict proteome coverage for one dimensional fractionation experiments [6]...
    • ...In this paper we generalize the non-parametric approach to characterize peptide distributions arising in LC-MS/MS experiments [6] to further enable proteome coverage prediction from integrated datasets compiled from multidimensional fractionation experiments...
    • ...Following [6] we assume that the precedent peptide π_{i−1} := j is indicative for the current polarity of the chromatography and thereby the current peptide distribution, i.e...
    • ...with a slight abuse of notation we assume G i = G i . Further we assume a Dirichlet process prior for G i, resulting in an infinite Markov model for LC-MS/MS experiments similar to [6]...
    • ...The a i reflect the dissimilarity of fraction i from the other fractions by controlling the rate of sampling peptides directly from the root distribution G. We account for their distinguished role by putting prior weight α i on a i and incorporating this parameter by assuming for the a i a biased (in the sense of [6]) Dirichlet process prior DPi(γ i a, α i, M) with uniform base measure M := (1/I)1..I. In the following, we will refer to ...
    • ...The step from the simple base measure G as described in [6] to the self-referential base measure enables to appropriately characterize the peptide distributions describing such an experiment...
    • ...Furthermore it has been observed that false positive peptide-spectrum matches distribute in a uniform-like manner across the protein database [6,10]...
    • ...To account for false positive peptide-spectrum matches we adaptively estimate parameters and we adaptively sample novel peptide identifications as described in [6]...
    • ...We compared to a recent approach designed for (one dimensional) LC-MS/MS experiments [6] and to ad hoc extrapolation methods...
    • ...This approach conceptionally extends methods exclusively suited for single fraction experiments [6], by introducing self-referential base measures that accommodate similarities among different experiment fractions...
    • ...Efficient study design will help to save costly experiments, contribute to the reliability of the final set of protein discoveries [6,10] and furthermore enhance subsequent directed/targeted proteomics studies [19,20]...

    Manfred Claassen et al. Proteome Coverage Prediction for Integrated Proteomics Datasets
