Speech Recognition under Stress Condition

Sumitra Shukla, S. R. Mahadeva Prasanna, S. Dandapat

The objective of this work is to conduct a speech recognition study and evaluate its performance under stressed conditions. The study uses both isolated word recognition and keyword spotting approaches. The word models are built during training using speech collected under neutral conditions. During testing, these models are evaluated on speech signals collected under different stressed conditions to quantify the amount of degradation in each stress condition. It is observed that the performance of the speech recognition system decreases significantly under stressed conditions.

I. INTRODUCTION

Speech is a complex signal which encodes the message as well as paralinguistic information such as speaker, emotion, acoustic environment, the person's intention, language, accent and dialect (1). Stress refers to the psychological state of a person arising from internally induced factors such as emotions or externally induced factors such as the Lombard effect. In human-human interaction, the listener can recognize the message as well as the paralinguistic aspects present in the speech. The listener can also effortlessly extract only the wanted information from the speech and neglect the rest, through what is called selective attention. This ability is not understood well enough to mimic it in human-computer interaction. Hence, in human-computer interaction, the performance of the system degrades because the system is unable to de-emphasize the paralinguistic information. For instance, under stress, speech production varies with respect to the neutral condition due to the constriction of various muscle structures in the speech production system. This changes the characteristics of the speech signal compared to the neutral condition. Identifying stress and properly compensating for it will give a significant improvement in the performance of speech or speaker recognition systems.
For this, it is useful first to quantify the amount of degradation caused by the stressed condition. The present work deals with quantifying the degradation in the performance of a speech recognition system under stressed conditions.

Most of the earlier attempts in the stressed speech processing area focused on how to classify and compensate for different stress conditions. To find the effect of stress on speech, researchers have studied it at the sentence, word and sound unit levels (1). In these studies they analyzed the percentage deviation of duration, intensity, glottal pulse shaping and vocal tract spectrum (1). In some of the studies, the speech recognizer was trained with neutral speech and the effect of stress was compensated during testing (2). The compensation techniques used in such analyses include formant location and bandwidth stress equalization (3), (4), (5), whole word cepstral compensation (6), slope-dependent weighting (7), formant shifting (8), source-generator based codebook stress compensation (9), (10), and source-generator based adaptive cepstral compensation (11), (10). The purpose of these studies is to improve the performance of the speech recognition system. All of them rest on the fact that the performance of the speech recognition system degrades under stressed conditions, but they do not quantify exactly how much degradation takes place. Even though this degradation is a known fact, it may be better to have a first-hand quantification of its amount. Such quantification will help in the following ways: we will understand the exact amount of degradation caused by each stress condition, and methods may accordingly be developed to compensate for the stress in each condition. This is the motivation for the present work. In this study, we quantify the effect of stress in an automatic
Published in 2009.