Keywords: Language Model, Speech Recognition, Hidden Markov Model, Large Vocabulary Continuous Speech Recognition, Out of Vocabulary, Word Error Rate
Morpheme concatenation approach in language modeling for large-vocabulary Uyghur speech recognition
Mijit Ablimit, Askar Hamdulla, Tatsuya Kawahara
For large-vocabulary continuous speech recognition (LVCSR) of highly inflected languages, selecting an appropriate recognition unit is the first important step. The morpheme-based approach is often adopted because of its high coverage and linguistic properties. However, morpheme units are short, often consisting of only one or two phonemes, so they are more likely to be confused in automatic speech recognition (ASR) than word units. Word units generally provide better linguistic constraints, but they increase the vocabulary size explosively, causing out-of-vocabulary (OOV) and data sparseness problems in language modeling. In this research, we investigate approaches to selecting word entries by concatenating morpheme sequences so as to reduce the word error rate (WER). Specifically, we compare the ASR results of the word-based model with those of the morpheme-based model and extract typical patterns that would reduce the WER. This method has been successfully applied to a Uyghur LVCSR system, resulting in a significant reduction in WER without a drastic increase in vocabulary size.
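The abstract evaluates recognition units by word error rate. As a minimal sketch of that metric, assuming the standard definition (substitutions, deletions, and insertions from a Levenshtein alignment, divided by the number of reference words), the function below may help make the comparison concrete. The name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Tokens are split on whitespace (an assumption for this sketch); the same
    function scores morpheme units if the inputs are morpheme-segmented.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference tokens
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

To compare a word-based and a morpheme-based system on equal terms, the morpheme-level output would typically be joined back into words before scoring, since the denominator depends on the unit being counted.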
Conference: Oriental COCOSDA International Conference on Speech Database and Assessments - Oriental COCOSDA, 2011
DOI: 10.1109/ICSDA.2011.6085990