
To make machines' ears and mouth intelligent

AKAGI Laboratory
Professor: AKAGI Masato

E-mail:
[Research areas]
Speech Information Processing, Speech Recognition/Synthesis
[Keywords]
Speech Perception/Production, Affective speech, Emotion, Singing voice, Speaker individuality

Skills and background we are looking for in prospective students

In our studies, we use digital signal processing as a tool. For this reason, knowledge of basic calculus and linear algebra (course title: "Fundamental Mathematics for Information Science") is necessary. In addition, a liking for sound is an important factor.

What you can expect to learn in this laboratory

At Akagi Lab, we treat the mechanisms of speech production and perception as mathematical models, and we build useful speech signal processing systems from those models using digital signal processing methods. This requires physiological and psychological observation of human utterances, mathematical modeling based on the observation results, and implementation on a computer of a system that behaves as humans do. By working through this series of processes, we can measure, analyze, transform, and synthesize not only speech but also time waveforms in general.

【Job category of graduates】
Telephone companies, manufacturing (car manufacturers, audio equipment makers), etc.

Research outline

<Basic Concept>

Akagi Lab mainly focuses on the topics indicated by the red blocks in Fig. 1 (e.g., speech production, speech communication in real environments, and speech perception). Speech production and perception are human activities. We therefore study them as human activities and use that knowledge to construct useful models for advanced sound processing systems.

<Research Areas>

We study the parts marked with red frames in Fig. 1 (producing: speech production and speech transmission in real environments; hearing: speech perception). To this end, we conduct research in cooperation with fields such as medicine, physiology, psychology, phonetics, physics, and speech science, as well as engineering (digital signal processing).

<Research Topics>

Figure 2 shows our current and past research topics.
- Production: Investigating the relationship between the speech spectrum and the shape of the vocal tract, synthesizing natural speech with non-linguistic information (e.g., individuality and emotion), and synthesizing singing voices
- Perception: Recognizing speech in real-world conditions, realizing the cocktail party effect, enhancing speech, and constructing models of effective speech


Fig. 1 Principal procedures of speech communication (production and perception).

Fig. 2 Research topics in Akagi lab.

Key publications

  1. Y. Xue, Y. Hamada, and M. Akagi, “Voice conversion for emotional speech: Rule-based synthesis with degree of emotion controllable in dimensional space,” Speech Communication 102, 54–67, 2018.
  2. Y. Li, J. Li, and M. Akagi, “Contributions of the glottal source and vocal tract cues to emotional vowel perception in the valence-arousal space,” J. Acoust. Soc. Am. 144 (2), 908–916, 2018.
  3. X. Li and M. Akagi, “A Three-Layer Emotion Perception Model for Valence and Arousal-Based Detection from Multilingual Speech,” Proc. Interspeech 2018, Hyderabad, India, 3643–3647, 2018.

Equipment

Soundproof room, anechoic room, AV room, etc.

Teaching policy

Akagi Laboratory welcomes students who bring their own research themes. Through discussion with the supervisor, we nurture the research that each student wants to pursue for a Master's or PhD degree. We believe that students take responsibility for their progress, and can see the work through to completion, precisely because the theme is one they proposed themselves.

[Website] URL:http://www.ais.jaist.ac.jp/index-e.html
