Morpho-syntactic post-processing of N-best lists for improved French automatic speech recognition

DOI: 10.1016/j.csl.2009.10.001

Citations: 1
Many automatic speech recognition (ASR) systems rely solely on pronunciation dictionaries and language models to account for linguistic information. Morphology and syntax are implicitly embedded in the language models to a certain extent, but the richness of such linguistic knowledge is not exploited. This paper studies the use of morpho-syntactic (MS) information in a post-processing stage of an ASR system, by reordering N-best lists. Each sentence hypothesis is first part-of-speech tagged. A morpho-syntactic score is then computed over the tag sequence with a long-span language model and combined with the acoustic and word-level language model scores. This new sentence-level score is finally used to rescore N-best lists by reranking or consensus. Experiments on a French broadcast news task show that morpho-syntactic knowledge improves the word error rate and confidence measures. In particular, the corrected errors include not only agreement errors and errors on short grammatical words but also errors on lexical words where the hypothesized lemma was modified.
Journal: Computer Speech & Language, vol. 24, no. 4, pp. 663-684, 2010
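
The reranking step described in the abstract can be illustrated with a minimal sketch. It assumes each hypothesis already carries three log scores (acoustic, word-level language model, and a morpho-syntactic score from a language model over POS tags) and combines them as a weighted sum. The `Hypothesis` class, the function names, the weights, and the numeric scores are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    words: List[str]
    acoustic_logprob: float  # log-likelihood of the audio given the words
    lm_logprob: float        # word-level language model log-probability
    ms_logprob: float        # log-probability of the POS tag sequence (assumed precomputed)


def combined_score(h: Hypothesis, lm_weight: float = 10.0, ms_weight: float = 5.0) -> float:
    """Weighted sum of the three log scores (weights are illustrative only)."""
    return h.acoustic_logprob + lm_weight * h.lm_logprob + ms_weight * h.ms_logprob


def rerank(nbest: List[Hypothesis], **weights: float) -> List[Hypothesis]:
    """Return the N-best list sorted from best to worst combined score."""
    return sorted(nbest, key=lambda h: combined_score(h, **weights), reverse=True)


# Toy 2-best list with a French agreement error: "joue" and "jouent" are
# homophones, so the acoustic scores are identical. All numbers are made up.
nbest = [
    Hypothesis(["les", "enfants", "joue"], -100.0, -8.0, -9.0),
    Hypothesis(["les", "enfants", "jouent"], -100.0, -8.2, -6.0),
]

best_with_ms = rerank(nbest)[0]
best_without_ms = rerank(nbest, ms_weight=0.0)[0]
print(" ".join(best_with_ms.words))
print(" ".join(best_without_ms.words))
```

In this toy case the word-level model alone slightly prefers the hypothesis with the agreement error, while the tag-sequence score shifts the ranking toward the grammatical one. In practice the weights would be tuned on held-out data.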
    • ...Among existing work on CMs improvement, some propose to estimate the confidence of a word, directly as its a posteriori probability, given low-level (acoustic) observations [14, 16]...
    • ...ASR confidence measures are derived from N-best lists, using a posteriori sentence probabilities obtained by the combination of an acoustic score, a linguistic score provided by a 4-gram language model (LM) and a morpho-syntactic score given by a 7-gram part-of-speech (POS) model [14]...
    • ...Transcripts are tagged with a set of 144 POS classes containing general morphosyntactic classes as well as very frequent words [14]...
    • ...Confidence measures are provided based on posterior probabilities combining acoustic, language model and POS scores as in [14]...

    Julien Fayolle et al. Reshaping automatic speech transcripts for robust high-level spoken do...
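
The excerpts above mention confidence measures derived from a posteriori sentence probabilities over an N-best list. A rough sketch of that idea follows, under simplifying assumptions: the combined sentence scores are turned into posteriors with a scaled softmax, and a word's confidence is the total posterior mass of the hypotheses that contain it. Real systems typically align hypotheses by position or time before accumulating; that step is skipped here. The function names, the `scale` parameter, and the example scores are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List


def sentence_posteriors(scores: List[float], scale: float = 1.0) -> List[float]:
    """Softmax over combined log scores; `scale` flattens or sharpens the distribution."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) * scale) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_confidences(nbest_words: List[List[str]], posteriors: List[float]) -> Dict[str, float]:
    """Sum the posterior of every hypothesis containing each word (no alignment)."""
    conf: Dict[str, float] = defaultdict(float)
    for words, p in zip(nbest_words, posteriors):
        for w in set(words):  # count each word once per hypothesis
            conf[w] += p
    return dict(conf)


# Made-up combined scores for two competing hypotheses.
hyps = [["les", "enfants", "jouent"], ["les", "enfants", "joue"]]
posts = sentence_posteriors([-212.0, -225.0], scale=0.1)
conf = word_confidences(hyps, posts)
print(conf)
```

Words shared by all hypotheses receive a confidence close to 1, while words that appear only in lower-scored hypotheses receive less.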
