Communication through speech production and speech perception is one of the basic ways in which humans exchange information. The research goal of our laboratory is to understand these human mechanisms fully and to realize them on a computer. To this end, we are pursuing the following research topics.
Speech Cognitive Science
Speech cognition (perception) can be regarded as the inverse of speech production. Because many different articulatory configurations can produce the same sound, recovering the articulation from the sound is a one-to-many inverse problem, which is a central issue in speech cognition. We address this problem by investigating its causes, which concern the stability of articulatory configurations and the physiological and morphological constraints, using a physiological articulatory model.
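The one-to-many character of the inverse problem can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the forward map `toy_forward`, the two parameters `tongue` and `lips`, and the "least displacement from a neutral posture" constraint are hypothetical stand-ins, not the physiological articulatory model or the constraints studied in the laboratory.

```python
# Toy illustration only: a hypothetical forward map in which two
# articulatory parameters jointly determine a single acoustic value.
def toy_forward(tongue, lips):
    return tongue + lips


def candidate_inversions(target, grid):
    """Return every (tongue, lips) pair on the grid that yields the target value."""
    return [(t, l) for t in grid for l in grid
            if abs(toy_forward(t, l) - target) < 1e-9]


def pick_by_constraint(candidates, neutral=(0.0, 0.0)):
    """Resolve the ambiguity with an assumed constraint:
    prefer the configuration closest to a neutral posture."""
    return min(candidates,
               key=lambda p: (p[0] - neutral[0]) ** 2 + (p[1] - neutral[1]) ** 2)


grid = [i * 0.25 for i in range(9)]          # 0.0, 0.25, ..., 2.0
candidates = candidate_inversions(1.0, grid)  # several configurations, one sound
best = pick_by_constraint(candidates)         # the constraint selects one of them
```

Without the constraint, several configurations explain the same acoustic value equally well; with it, a single configuration is selected. This is the role that stability and physiological or morphological constraints are expected to play in the real problem.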
Speech Production Mechanisms and Their Modeling
Many questions about the mechanisms of speech production remain unanswered, especially for the production of emotional speech. To address them, we use a physiological articulatory model, developed from MRI data by this laboratory and ATR, to simulate both the forward process from articulatory target to speech sound and the inverse process from speech sound to articulatory target. Iterating between these two simulations lets us approach the "true" mechanisms. A further part of this topic is refining the articulatory model in light of physiological discoveries.
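The iteration between forward and inverse processing can be sketched as a simple analysis-by-synthesis loop. The sketch below is a toy under stated assumptions: `toy_synthesize` is a hypothetical one-parameter linear forward model, and plain gradient descent stands in for whatever inversion procedure is used with the actual articulatory model.

```python
def toy_synthesize(target):
    """Hypothetical forward model: articulatory target -> one acoustic value."""
    return 2.0 * target + 1.0


def estimate_target(observed, steps=200, rate=0.1):
    """Estimate the articulatory target by repeatedly synthesizing from the
    current guess and correcting it according to the acoustic error."""
    guess = 0.0
    for _ in range(steps):
        error = toy_synthesize(guess) - observed
        # Gradient of 0.5 * error**2 with respect to guess is 2.0 * error,
        # because the toy forward model has slope 2.0.
        guess -= rate * 2.0 * error
    return guess


# Synthesize a sound from a known target, then recover the target from the sound.
estimated = estimate_target(toy_synthesize(0.7))
```

In this toy case the forward model is invertible, so the loop recovers the original target; with a realistic articulatory model the same loop would need additional constraints to choose among multiple solutions.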
Speech Communication within the Brain
According to the motor theory of speech perception, a well-known hypothesis, speech perception is realized with reference to images or knowledge held in the motor (production) areas (Liberman et al., 1960, 1985). In this research, we aim to test this theory by investigating the interaction between speech perception and production through acoustic analysis, EMG measurement, and articulatory observation.
Speech Synthesis with Specific Individuality and Emotion
Individuality of speech depends on physiological (inborn) factors and social (habit-forming) factors. In this study, we focus on analyzing and modeling the effects of the former on speech. Emotion is paralinguistic information that describes the state of the speaker and cannot be produced logically. We study emotional speech generation by applying our experience to the articulatory model, and we aim to clarify the relation between emotion and acoustic parameters other than the fundamental frequency.
Speech Recognition Considering Auditory, Articulatory and Physiological Features
We aim to develop novel speech recognition methods that take human mechanisms into account. We use properties of human hearing to develop speech recognition that is robust in noisy environments, coarticulatory mechanisms to recognize speech with missing portions, and physiological features for speaker identification.