Internship Presentation

Research Background

Emotional expression in speech plays a vital role in communication. Previous studies using Japanese noise-vocoded speech (e.g., Unoki et al., 2019) suggested that the temporal amplitude envelope provides crucial perceptual cues for urgency. However, whether these cues play a similar role in Mandarin speech remains unclear.

Research Objective

This study investigates whether temporal amplitude envelope cues in Mandarin speech affect emotion perception, using both original and noise-vocoded samples.

Methodology

  1. Generate noise-vocoded speech from amplitude-normalized audio samples.
  2. Conduct listening experiments with five native Mandarin speakers in a soundproof room.
  3. Have each participant label the perceived emotion of 50 original samples and 50 noise-vocoded samples.

Figure 5: Schematic diagram of noise-vocoded speech synthesis
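
As a rough illustration of the first step, the sketch below shows one common way to build noise-vocoded speech: split the signal into frequency bands, extract each band's temporal amplitude envelope, and use it to modulate band-limited noise. This is a minimal, dependency-free sketch and not necessarily the procedure used in this study. The band edges, the 16 Hz envelope cutoff, the simple second-order filters, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def bandpass(signal, fs, lo, hi):
    """Second-order band-pass filter (RBJ biquad) between lo and hi Hz."""
    f0 = math.sqrt(lo * hi)              # geometric centre frequency
    q = f0 / (hi - lo)                   # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0     # b1 is zero for this filter type
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def envelope(signal, fs, cutoff):
    """Amplitude envelope: full-wave rectification, then a one-pole low-pass."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff / fs)
    out, y = [], 0.0
    for x in signal:
        y += a * (abs(x) - y)
        out.append(y)
    return out


def noise_vocode(signal, fs, edges, env_cutoff=16.0, seed=0):
    """Replace the fine structure in each band with envelope-modulated noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    out = [0.0] * len(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = envelope(bandpass(signal, fs, lo, hi), fs, env_cutoff)
        carrier = bandpass(noise, fs, lo, hi)   # noise limited to the same band
        for i, (e, c) in enumerate(zip(env, carrier)):
            out[i] += e * c
    return out


# Toy input: 0.1 s of a 440 Hz tone (a stand-in for a speech sample).
fs = 16000
tone = [math.sin(2.0 * math.pi * 440.0 * n / fs) for n in range(fs // 10)]
# Hypothetical four-band layout; the study's actual band edges are not stated.
vocoded = noise_vocode(tone, fs, edges=[100, 600, 1500, 2100, 3400])
print(len(vocoded))
```

The output keeps each band's slow amplitude fluctuations while discarding pitch and spectral fine structure, which is why such stimuli are useful for isolating envelope cues. A practical implementation would typically use steeper filters and real recordings.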

Experimental Results

Figure 1: Accuracy of emotion recognition in original speech
Figure 2: Accuracy of emotion recognition in noise-vocoded speech
Figure 3: Emotion classification tendencies in original speech
Figure 4: Emotion classification tendencies in vocoded speech
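
For readers who want to reproduce this kind of analysis, the sketch below shows one way per-emotion accuracy and classification tendencies (a confusion tally) could be computed from participants' labels. The emotion names and responses are made-up toy values, not data from this experiment, and the function names are hypothetical.

```python
from collections import Counter


def confusion_counts(intended, perceived):
    """Count (intended, perceived) label pairs across all responses."""
    return Counter(zip(intended, perceived))


def per_emotion_accuracy(intended, perceived):
    """Fraction of responses that match the intended emotion, per emotion."""
    totals = Counter(intended)
    hits = Counter(i for i, p in zip(intended, perceived) if i == p)
    return {emotion: hits[emotion] / totals[emotion] for emotion in totals}


# Made-up toy responses for illustration only.
intended = ["angry", "angry", "happy", "happy", "sad", "sad"]
perceived = ["angry", "happy", "happy", "happy", "sad", "angry"]

print(per_emotion_accuracy(intended, perceived))
print(confusion_counts(intended, perceived))
```

Running the same tally separately on the original and the noise-vocoded responses gives the two accuracy summaries and the two confusion summaries that the figures report.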

Conclusion

Reference

Unoki, Y., Kawamura, M., Kitani, S., Kobayashi, M., & Akagi, M. (2019). “Examination of amplitude envelope cues contributing to perception of urgency in speech.” Acoustical Society of Japan, Noise & Vibration Committee.
