A dictionary based approach for gene annotation

A dictionary based approach for gene annotation, 10.1145/299432.299504, Lior Pachter, Serafim Batzoglou, Valentin I. Spitkovsky, William S. Beebee Jr., Er

A dictionary based approach for gene annotation   (Citations: 17)
This paper describes a fast and fully automated dictionary based approach to gene annotation and exon prediction. Two dictionaries are constructed, one from the nonredundant protein OWL database and the other from the dbEST database. These dictionaries are used to obtain O(1) time lookups of tuples in the dictionaries (4-tuples for the OWL database and 11-tuples for the dbEST database). These tuples can be used to rapidly find the longest matches at every position in an input sequence to the database sequences. Such matches provide very useful information pertaining to locating common segments between exons, alternative splice sites, and frequency data of long tuples for statistical purposes. These dictionaries also provide the basis for both homology determination and statistical approaches to exon prediction. For instance, using the OWL protein database on a benchmark test set of 130 genes, and after removing sequences from the database with exact amino acid homology to genes in our test set, we find 88% of coding nucleotides, and 99% of our predictions of coding nucleotides are correct. Also, 81% of coding exons are predicted exactly, while 82% of our predictions of exons agree exactly with the published annotation of their genes.
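The tuple-dictionary idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_dictionary` and `longest_matches`, the use of a Python hash map for the dictionary, and the toy sequences in the usage note are all assumptions made for illustration. The sketch indexes every k-tuple of the database sequences, then, for each position of an input sequence, looks up the k-tuple starting there and extends each hit to find the longest exact match.

```python
from collections import defaultdict


def build_dictionary(database, k):
    """Map every k-tuple in the database to the (sequence, offset) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(database):
        for offset in range(len(seq) - k + 1):
            index[seq[offset:offset + k]].append((seq_id, offset))
    return index


def longest_matches(query, database, index, k):
    """For each query position, return the length of the longest exact match
    (of length at least k) to any database sequence, or 0 if there is none."""
    best = [0] * len(query)
    for pos in range(len(query) - k + 1):
        # Expected O(1) hash lookup of the k-tuple starting at this position.
        for seq_id, offset in index.get(query[pos:pos + k], ()):
            seq = database[seq_id]
            length = k
            # Extend the seed match to the right while characters agree.
            while (pos + length < len(query)
                   and offset + length < len(seq)
                   and query[pos + length] == seq[offset + length]):
                length += 1
            best[pos] = max(best[pos], length)
    return best
```

As a usage example under the same assumptions, `longest_matches("GACGTAC", db, build_dictionary(db, 4), 4)` with `db = ["ACGTACGT", "TTACGG"]` reports, for each position, how far a match of at least four characters extends. The abstract's choice of 4-tuples for protein data and 11-tuples for EST data would correspond to different values of `k` here.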
Conference: Research in Computational Molecular Biology - RECOMB, vol. 6, no. 3/4, pp. 285-294, 1999
    • ...If one can refer to an actual dictionary (Pachter et al., 1999; Laub and Smith, 1998), other questions can be asked since each...

    Daniel P. Aalberts et al. Quantifying optimal accuracy of local primary sequence bioinformatics ...

    • ...Methods in this group, include CRASA (Chuang et al., 2003), AAT (Huang et al., 1997), GALA (Bailey et al., 1998), ICE (Pachter et al., 1999) and so on. Finally, combination tools such as GenomeScan (Yeh et al., 2001), GeneWise (Birney and Durbin, 2000), Procrustes (Gelfand et al., 1996; Sze and Pevzner, 1997; Mironov et al., 1998), FGENESH+ and FGENESH++ (Salamov and Solovyev, 2000) and GrailEXP_Gawain (Hyatt et al., 2000) combine ab ...

    Trees-Juen Chuang et al. A comparative method for identification of gene structures and alterna...

    • ...Successful implementation of this method includes AAT (Huang et al. 1997), FGENESH+ and FGENESH++ (Salamov and Solovyev 2000), GAIA (Bailey et al. 1998), Gene-Builder (Milanesi et al. 1999), GenomeScan (Yeh et al. 2001), GrailEXP_Gawain (Hyatt et al. 2000), GeneWise (Birney and Durbin 2000), ICE (Pachter et al. 1999), and Procrustes (Gelfand et al. 1996; Sze and Pevzner 1997; Mironov et al. 1998)...

    Trees-Juen Chuang et al. A Complexity Reduction Algorithm for Analysis and Annotation of Large ...

    • ...Gene prediction in human genome often amounts to using related proteins from other species as clues for finding exon‐intron structures (Gelfand et al., 1996; Pachter et al., 1999; Birney et al., 1996)...

    Abdullah N. Arslan et al. A new approach to sequence comparison: normalized sequence alignment

    • ...The technique lies at the heart of all the successful gene finding programs to date (for a discussion of DP applications to gene finding see Salzburg [1998]; a partial list of users of DP is Burge and Karlin [1997], Henderson et al. [1997], Kulp et al. [1996], Pachter et al. [1999], and Wirth [1988])...
    • ...Current methods for incorporating DNA and protein alignments into homology-based gene finders (Krogh, 2000; Kulp et al., 1996; Pachter et al., 1999) treat the alignments as separate evidence to be used for reweighting candidate gene annotations, rather than tackling the two problems jointly...

    Lior Pachter et al. Applications of generalized pair hidden Markov models to alignment and...
