Masato Akagi, Mitsunori Mizumachi, Yuichi Ishimoto, and Masashi Unoki,
"Speech Enhancement and Segregation based on Human Auditory Mechanisms,"
In Proc. of 2000 International Conference on Information Society in the 21st Century (IS2000), pp. 246-254, Aizu-Wakamatsu, Japan, Oct. 2000.
Last modified:
2 June 2001
Abstract

This paper introduces models of speech enhancement and segregation based on knowledge of human psychoacoustics and auditory physiology. The cancellation model is used to enhance speech. Special attention is paid to reducing noise with a spatial filtering technique and to emphasizing the desired speech with a frequency filtering technique; both techniques adopt concepts of the cancellation model. In addition, constraints related to the heuristic regularities proposed by Bregman are used to overcome the problem of segregating two acoustic sources. Simulation results show that both spatial and frequency filtering are useful for enhancing speech. These filtering methods can therefore be used effectively at the front end of automatic speech recognition systems and for speech feature extraction. The sound segregation model can precisely extract a desired signal from a noisy signal, even at the waveform level.
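The cancellation idea behind the spatial filter can be illustrated with a minimal two-microphone delay-and-subtract sketch. This is a toy example under simplifying assumptions (anechoic propagation, a known integer-sample inter-microphone delay for the interferer, target at broadside), not the paper's actual model: delaying one channel so the interfering source aligns with its copy in the other channel and subtracting places a spatial null on the interferer, while the target survives (comb-filtered).

```python
import numpy as np

def delay(x, d):
    """Delay signal x by d samples (zero-padded, no wrap-around)."""
    y = np.zeros_like(x)
    if d > 0:
        y[d:] = x[:-d]
    else:
        y[:] = x
    return y

def cancel_interferer(left, right, d):
    """Two-microphone subtractive canceller (cancellation concept).

    Assumes the interferer reaches the right microphone d samples
    after the left one, while the target arrives at both
    simultaneously.  Delaying the left channel aligns the two copies
    of the interferer, so subtraction nulls it; the target passes
    through a delay-minus-original comb filter.
    """
    return delay(left, d) - right

# Toy scene: broadside target tone plus an off-axis interfering tone.
fs = 8000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 300 * t)          # arrives at both mics together
interf = 0.7 * np.sin(2 * np.pi * 1200 * t)   # hits the right mic d samples late
d = 6

left = target + interf
right = target + delay(interf, d)

out = cancel_interferer(left, right, d)       # interferer nulled, target comb-filtered
```

With an interferer-only input the output is identically zero, which is the defining property of the cancellation approach; the trade-off is the comb-filter coloration of the target, which the paper's frequency-domain processing is meant to address.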

Keywords: cancellation model, noise reduction, microphone array, F0 extraction, computational auditory scene analysis

Created by M. Unoki, 14 April 2001