Masato AKAGI's Home Page
PAPERS                       (31/03/2009)
1. Non-linguistic Information
1-1 Singing Voice
1-2 Speaker Individuality
1-3 Emotional Speech
1-4 Voice Conversion
1-5 Speech Analysis and Coding

2. Noise Reduction
2-1 Microphone Array
2-2 F0 Extraction
2-3 De-reverberation
2-4 Bone-conducted Speech
2-5 Speech Recognition
2-6 DOA

3. Cocktail-party Effect Modeling
3-1 Sound Segregation
3-2 Privacy Protection
3-3 Noisy Sound Perception

4. Psychoacoustics
4-1 Auditory Model
4-2 Contextual Effect
4-3 Auditory Filter
4-4 Phase Perception
4-5 Vowel Perception
4-6 Noise Evaluation

5. Physiological Auditory Modeling

6. Abnormal Speech
6-1 Abnormal Speech Perception
6-2 3D Vocal Tract Modeling

7. Interaction between Perception and Production

8. Others
8-1 NTT & ATR
8-2 Tokyo Institute of Technology






If you are interested in my research topics and would like to see my publications, please visit the JAIST Repository and download my papers.





1. Non-linguistic Information
  1. Akagi, M. (2009/02/20). "Introduction of SCOPE project: Analysis of production and perception characteristics of non-linguistic information in speech and its application to inter-language communications," International symposium on biomechanical and physiological modeling and speech science, 51-62.
1-1 Singing Voice
  1. Nakamura, T., Kitamura, T. and Akagi, M. (2009/03/01). "A study on nonlinguistic feature in singing and speaking voices by brain activity measurement," Proc. NCSP'09, 217-220.
  2. Saitou, T., Goto, M., Unoki, M., and Akagi, M. (2007). "Speech-to-singing synthesis: converting speaking voices to singing voices by controlling acoustic features unique to singing voices," Proc. WASPAA2007, New Paltz, NY, pp.215-218
  3. Saitou, T., Goto, M., Unoki, M., and Akagi, M. (2007). "Vocal conversion from speaking voice to singing voice using STRAIGHT," Proc. Interspeech2007, Singing Challenge.
  4. Saitou, T., Unoki, M., and Akagi, M. (2006). "Analysis of acoustic features affecting singing-voice perception and its application to singing-voice synthesis from speaking-voice using STRAIGHT," J. Acoust. Soc. Am., 120, 5, Pt. 2, 3029.
  5. Saitou, T., Unoki, M. and Akagi, M. (2005). "Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis," Speech Communication 46, 405-417.
  6. Saitou, T., Tsuji, N., Unoki, M. and Akagi, M. (2004). “Analysis of acoustic features affecting “singing-ness” and its application to singing-voice synthesis from speaking-voice,” Proc. ICSLP2004, Cheju, Korea.
  7. Saitou, T., Unoki, M., and Akagi, M. (2004). “Control methods of acoustic parameters for singing-voice synthesis,” Proc. ICA2004, 501-504.
  8. Saitou, T., Unoki, M., and Akagi, M. (2004). “Development of the F0 control method for singing-voices synthesis,” Proc. SP2004, Nara, 491-494.
  9. Akagi, M. (2002). "Perception of fundamental frequency fluctuation," HEA-02-003-IP, Forum Acusticum Sevilla 2002 (Invited).
  10. Saitou, T., Unoki, M., and Akagi, M. (2002). "Extraction of F0 dynamic characteristics and development of F0 control model in singing voice," Proc. ICAD2002, Kyoto.
  11. Unoki, M., Saitou, T., and Akagi, M. (2002). "Effect of F0 fluctuations and development of F0 control model in singing voice perception," NATO Advanced Study Institute 2002 Dynamics of Speech Production and Perception.
  12. Akagi, M. and Kitakaze, H. (2000). "Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours," Proc. ICSLP2000, Beijing, III-458-461.
1-2 Speaker Individuality
  1. Akagi, M. and Ienaga, T. (1997). "Speaker individuality in fundamental frequency contours and its control", J. Acoust. Soc. Jpn. (E), 18, 2, 73-80.
  2. Kitamura, T. and Akagi, M. (1996). "Relationship between physical characteristics and speaker individualities in speech spectral envelopes", Proc. ASA-ASJ Joint Meeting, 833-838.
  3. Akagi, M. and Ienaga, T. (1995). "Speaker individualities in fundamental frequency contours and its control", Proc. EUROSPEECH95, 439-442.
  4. Kitamura, T. and Akagi, M. (1995). "Speaker individualities in speech spectral envelopes", J. Acoust. Soc. Jpn. (E), 16, 5, 283-289.
  5. Kitamura, T. and Akagi, M. (1994). "Speaker Individualities in speech spectral envelopes", Proc. Int. Conf. Spoken Lang. Process. 94, 1183-1186.
1-3 Emotional Speech
  1. Aoki, Y., Huang, C-F., and Akagi, M. (2009/03/01). "An emotional speech recognition system based on multi-layer emotional speech perception model," Proc. NCSP'09, 133-136.
  2. Huang, C. F. and Akagi, M. (2008/10) "A three-layered model for expressive speech perception," Speech Communication 50, 810-828.
  3. Huang, C. F., Erickson, D., and Akagi, M. (2008/07/01). "Comparison of Japanese expressive speech perception by Japanese and Taiwanese listeners," Acoustics2008, Paris, 2317-2322.
  4. Huang, C. F. and Akagi, M. (2007). "A rule-based speech morphing for verifying an expressive speech perception model," Proc. Interspeech2007, 2661-2664.
  5. Sawamura, K., Dang, J., Akagi, M., Erickson, D., Li, A., Sakuraba, K., Minematsu, N., and Hirose, K. (2007). "Common factors in emotion perception among different cultures," Proc. ICPhS2007, 2113-2116.
  6. Huang, C. F. and Akagi, M. (2007). "The building and verification of a three-layered model for expressive speech perception," Proc. JCA2007, CD-ROM.
  7. Huang, C. F. and Akagi, M. (2005). "Toward a rule-based synthesis of emotional speech on linguistic description of perception," Affective Computing and Intelligent Interaction, Springer LNCS 3784, 366-373.
  8. Huang, C. F. and Akagi, M. (2005). "A Multi-Layer fuzzy logical model for emotional speech Perception," Proc. EuroSpeech2005, Lisbon, Portugal, 417-420.
  9. Ito, S., Dang, J., and Akagi, M. (2004). “Investigation of the acoustic features of emotional speech using physiological articulatory model,” Proc. ICA2004, 2225-2226.
1-4 Voice Conversion
  1. Nguyen, B. P. and Akagi, M. (2009/02/20). "Applications of Temporal Decomposition to Voice Transformation," International symposium on biomechanical and physiological modeling and speech science, 19-24.
  2. Nguyen, B. P., Shibata, T., and Akagi, M. (2008/09/24). "High-quality analysis/synthesis method based on Temporal decomposition for speech modification," Proc. InterSpeech2008, Brisbane, 662-665.
  3. Nguyen B. P. and Akagi M. (2008/6/6). "Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model," Proc. ICCE2008, 224-229.
  4. Nguyen B. P. and Akagi M. (2008/3/7). "Control of spectral dynamics using temporal decomposition in voice conversion and concatenative speech synthesis," Proc. NCSP08, 279-282.
  5. Shibata, T. and Akagi, M. (2008/3/6). "A study on voice conversion method for synthesizing stimuli to perform gender perception experiments of speech," Proc. NCSP08, 180-183.
  6. Nguyen B. P. and Akagi M. (2007). "A flexible spectral modification method based on temporal decomposition and Gaussian mixture model," Proc. Interspeech2007, 538-541.
  7. Nguyen B. P. and Akagi M. (2007). "Spectral Modification for Voice Gender Conversion using Temporal Decomposition," Journal of Signal Processing, 11, 4, 333-336.
  8. Akagi, M., Saitou, T., and Huang, C. F. (2007). "Voice conversion to add non-linguistic information into speaking voices," Proc. JCA2007, CD-ROM.
  9. Nguyen B. P. and Akagi M. (2007). "Spectral Modification for Voice Gender Conversion using Temporal Decomposition," Proc. NCSP2007, 481-484.
  10. Takeyama, Y., Unoki, M., Akagi, M., and Kaminuma, A. (2006). "Synthesis of mimic speech sounds uttered in noisy car environments," Proc. NCSP2006, 118-121.
1-5 Speech Analysis and Coding
  1. Tomoike, S. and Akagi, M. (2008). "Estimation of local peaks based on particle filter in adverse environments," Journal of Signal Processing, 12, 4, 303-306.
  2. Tomoike, S. and Akagi, M. (2008/3/7). "Estimation of local peaks based on particle filter in adverse environments," Proc. NCSP08, 391-394.
  3. Nguyen, P. C., Akagi, M., and Nguyen, P. B. (2007). "Limited error based event localizing temporal decomposition and its application to variable-rate speech coding," Speech Communication, 49, 292-304.
  4. Akagi, M., Nguyen, P. C., Saitou, T., Tsuji, N., and Unoki, M. (2004). “Temporal decomposition of speech and its application to speech coding and modification,” Proc. KEST2004, 280-288.
  5. Akagi, M. and Nguyen, P. C. (2004). “Temporal decomposition of speech and its application to speech coding and modification,” Proc. Special Workshop in MAUI (SWIM), 1-4, 2004.
  6. Nguyen, P. C. and Akagi, M. (2003). “Efficient quantization of speech excitation parameters using temporal decomposition,” Proc. EUROSPEECH2003, Geneva, 449-452.
  7. Nguyen, P. C., Akagi, M., and Ho, T. B. (2003). "Temporal decomposition: A promising approach to VQ-based speaker identification," Proc. ICME2003, Baltimore, V.III, 617-620.
  8. Nguyen, P. C., Akagi, M., and Ho, T. B. (2003). "Temporal decomposition: A promising approach to VQ-based speaker identification," Proc. ICASSP2003, Hong Kong, I-184-187.
  9. Nguyen, P. C., Ochi, T., and Akagi, M. (2003). “Modified Restricted Temporal Decomposition and its Application of Low Rate Speech Coding,” IEICE Trans. Inf. & Syst., E86-D, 3, 397-405.
  10. Nguyen, P. C. and Akagi, M. (2002). "Variable rate speech coding using STRAIGHT and temporal decomposition," Proc. SCW2002, Tsukuba, 26-28.
  11. Nguyen, P. C. and Akagi, M. (2002). "Coding speech at very low rates using STRAIGHT and temporal decomposition," Proc. ICSLP2002, Denver, 1849-1852.
  12. Nguyen, P. C. and Akagi, M. (2002). "Limited error based event localizing temporal decomposition," Proc. EUSIPCO2002, Toulouse, 190.
  13. Nguyen, P. C. and Akagi, M. (2002). "Improvement of the restricted temporal decomposition method for line spectral frequency parameters," Proc. ICASSP2002, Orlando, I-265-268.
  14. Nandasena, A. C. R., Nguyen, P. C. and Akagi, M. (2001). "Spectral stability based event localizing temporal decomposition", Computer Speech & Language, Vol. 15, No. 4, 381-401
  15. Nandasena, A.C.R. and Akagi, M. (1998). “Spectral stability based event localizing temporal decomposition,” Proc. ICASSP98, II, 957-960

2. Noise Reduction

2-1 Microphone Array
  1. Li, J., Jiang, H., and Akagi, M. (2008/09/23). "Psychoacoustically-motivated adaptive β-order generalized spectral subtraction based on data-driven optimization," Proc. InterSpeech2008, Brisbane, 171-174.
  2. Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2008). “Adaptive β-order generalized spectral subtraction for speech enhancement,” Signal Processing, vol. 88, no. 11, pp. 2764-2776, 2008.
  3. Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2008/08/16). "Improved two-stage binaural speech enhancement based on accurate interference estimation for hearing aids," IHCON2008
  4. Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2008/06/30). "A two-stage binaural speech enhancement approach for hearing aids with preserving binaural benefits in noisy environments," Acoustics2008, Paris, 723-727.
  5. Li, J., Akagi, M., and Suzuki, Y. (2008). "A two-microphone noise reduction method in highly non-stationary multiple-noise-source environments," IEICE Trans. Fundamentals, E91-A, 6, 1337-1346.
  6. Li, J. and Akagi, M. (2008). "A hybrid microphone array post-filter in a diffuse noise field," Applied Acoustics 69, 546-557.
  7. Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2007). "A speech enhancement approach for binaural hearing aids," Proc. 22nd SIP Symposium, Sendai, 263-268.
  8. Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2007). "Noise reduction based on adaptive beta-order generalized spectral subtraction for speech enhancement," Proc. Interspeech2007, 802-805.
  9. Li, J., Akagi, M., and Suzuki, Y. (2006). "Multi-channel noise reduction in noisy environments," Chinese Spoken Language Processing, Proc. ISCSLP2006, Springer LNCS 4274, 258-269.
  10. Li, J., Akagi, M., and Suzuki, Y. (2006). "Noise reduction based on microphone array and post-filtering for robust speech recognition," Proc. ICSP, Guilin.
  11. Li, J. and Akagi, M. (2006). "Noise reduction method based on generalized subtractive beamformer," Acoust. Sci. & Tech., 27, 4, 206-215.
  12. Li, J., Akagi, M., and Suzuki, Y. (2006). "Improved hybrid microphone array post-filter by integrating a robust speech absence probability estimator for speech enhancement," Proc. ICSLP2006, Pittsburgh, USA, 2130-2133.
  13. Li, J. and Akagi, M. (2006). "A noise reduction system based on hybrid noise estimation technique and post-filtering in arbitrary noise environments," Speech Communication, 48, 111-126.
  14. Li, J., Akagi, M., and Suzuki, Y. (2006). "Noise reduction based on generalized subtractive beamformer for speech enhancement," WESPAC2006, Seoul
  15. Li, J. and Akagi, M. (2005). "Theoretical analysis of microphone arrays with postfiltering for coherent and incoherent noise suppression in noisy environments," Proc. IWAENC2005, Eindhoven, The Netherlands, 85-88.
  16. Li, J. and Akagi, M. (2005). "A hybrid microphone array post-filter in a diffuse noise field," Proc. EuroSpeech2005, Lisbon, Portugal, 2313-2316.
  17. Li, J., Lu, X., and Akagi, M. (2005). "Noise reduction based on microphone array and post-filtering for robust speech recognition in car environments," Proc. Workshop DSPinCar2005, S2-9
  18. Li, J., Lu, X., and Akagi, M. (2005). “A noise reduction system in arbitrary noise environments and its application to speech enhancement and speech recognition,” Proc. ICASSP2005, Philadelphia, III-277-280.
  19. Li, J. and Akagi, M. (2005). “Suppressing localized and non-localized noises in arbitrary noise environments,” Proc. HSCMA2005, Piscataway.
  20. Li, J. and Akagi, M. (2004). “Noise reduction using hybrid noise estimation technique and post-filtering,” Proc. ICSLP2004, Cheju, Korea.
  21. Akagi, M. and Kago, T. (2002). "Noise reduction using a small-scale microphone array in multi noise source environment," Proc. ICASSP2002, Orlando, I-909-912.
  22. Mizumachi, M., Akagi, M. and Nakamura, S. (2000). "Design of robust subtractive beamformer for noisy speech recognition," Proc. ICSLP2000, Beijing, IV-57-60.
  23. Mizumachi, M. and Akagi, M. (2000). "Noise reduction using a small-scale microphone array under non-stationary signal conditions," Proc. WESTPRAC7, 421-424.
  24. Mizumachi, M. and Akagi, M. (1999). "Noise reduction method that is equipped for robust direction finder in adverse environments," Proc. Workshop on Robust Methods for Speech Recognition in Adverse Conditions, Tampere, Finland, 179-182.
  25. Mizumachi, M. and Akagi, M. (1998). “Noise reduction by paired-microphones using spectral subtraction,” Proc. ICASSP98, II, 1001-1004
  26. Akagi, M. and Mizumachi, M. (1997). "Noise Reduction by Paired Microphones", Proc. EUROSPEECH97, 335-338.
2-2 F0 Extraction
  1. Ishimoto, Y., Akagi, M., Ishizuka, K., and Aikawa, K. (2004). “Fundamental frequency estimation for noisy speech using entropy-weighted periodic and harmonic features,” IEICE Trans. Inf. & Syst., E87-D, 1, 205-214.
  2. Ishimoto, Y., Unoki, M., and Akagi, M. (2001). "A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency", Proc. EUROSPEECH2001, Aalborg, 2439-2442.
  3. Ishimoto, Y., Unoki, M., and Akagi, M. (2001). "A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency", Proc. CRAC, Aalborg.
  4. Ishimoto, Y., Unoki, M., and Akagi, M. (2001). "A fundamental frequency estimation method for noisy speech based on periodicity and harmonicity", Proc. ICASSP2001, SPEECH-SF3, Salt Lake City.
  5. Ishimoto, Y. and Akagi, M. (2000). "A fundamental frequency estimation method for noisy speech," Proc. WESTPRAC7, 161-164.
2-3 De-reverberation
  1. Petric, R., Lu, X., Unoki, M., Akagi, M., and Hoffmann, R. (2008/09/24). "Robust front end processing for speech recognition in reverberant environments: Utilization of speech characteristics," Proc. InterSpeech2008, Brisbane, 658-661.
  2. Unoki, M., Toi, M., Shibano, Y., and Akagi, M. (2006). "Suppression of speech intelligibility loss through a modulation transfer function-based speech dereverberation method," J. Acoust. Soc. Am., 120, 5, Pt. 2, 3360.
  3. Unoki, M., Toi, M., and Akagi, M. (2006). "Refinement of an MTF-based speech dereverberation method using an optimal inverse-MTF filter," SPECOM2006, St. Petersburg, 323-326.
  4. Unoki, M., Toi, M., and Akagi, M. (2005). “Development of the MTF-based speech dereverberation method using adaptive time-frequency division,” Proc. Forum Acusticum 2005, 51-56.
  5. Toi, M., Unoki, M. and Akagi, M. (2005). “Development of adaptive time-frequency divisions and a carrier reconstruction in the MTF-based speech dereverberation method,” Proc. NCSP05, Hawaii, 355-358.
  6. Unoki, M., Sakata, K., Furukawa, M. and Akagi, M. (2004). “A speech dereverberation method based on the MTF concept in power envelope restoration,” Acoust. Sci. & Tech., 25, 4, 243-254.
  7. Unoki, M., Furukawa, M., Sakata, K. and Akagi, M. (2004). “An improved method based on the MTF concept for restoring the power envelope from a reverberant signal,” Acoust. Sci. & Tech., 25, 4, 232-242.
  8. Unoki, M., Toi, M., and Akagi, M. (2004). “A speech dereverberation method based on the MTF concept using adaptive time-frequency divisions,” Proc. EUSIPCO2004, 1689-1692.
  9. Unoki, M., Sakata, K., Toi, M., and Akagi, M. (2004). “Speech dereverberation based on the concept of the modulation transfer function,” Proc. NCSP2004, Hawaii, 423-426.
  10. Unoki, M., Sakata, K. and Akagi, M. (2003). “A speech dereverberation method based on the MTF concept,” Proc. EUROSPEECH2003, Geneva, 1417-1420.
  11. Unoki, M., Furukawa, M., Sakata, K., and Akagi, M. (2003). "A method based on the MTF concept for dereverberating the power envelope from the reverberant signal," Proc. ICASSP2003, Hong Kong, I-840-843.
  12. Unoki, M., Furukawa, M., and Akagi, M. (2002). "A method for recovering the power envelope from reverberant speech," SPA-Gen-002, Forum Acusticum Sevilla 2002.
2-4 Bone-conducted Speech
  1. Kinugasa, K., Unoki, M., and Akagi, M. (2009/03/01). "An MTF-based Blind Restoration Method for Improving Intelligibility of Bone-conducted Speech," Proc. NCSP'09, 105-108.
  2. Vu, T. T., Unoki, M. and Akagi, M. (2008/6/5). "An LP-based blind model for restoring bone-conducted speech," Proc. ICCE2008, 212-217.
  3. Vu, T. T., Unoki, M., and Akagi, M. (2008/3/7). "A study of blind model for restoring bone-conducted speech based on linear prediction scheme," Proc. NCSP08, 287-290.
  4. Vu, T. T., Unoki, M. and Akagi, M. (2007). “The Construction of Large-scale Bone-conducted and Air-conducted Speech Databases for Speech Intelligibility Tests,” Proc. Oriental COCOSDA2007, 88-91.
  5. Vu, T. T., Unoki, M., and Akagi, M. (2007). "A blind restoration model for bone-conducted speech based on a linear prediction scheme," Proc. NOLTA2007, Vancouver, 449-452.
  6. Vu, T. T., Seide, G., Unoki, M., and Akagi, M. (2007). "Method of LP-based blind restoration for improving intelligibility of bone-conducted speech," Proc. Interspeech2007, 966-969.
  7. Vu, T., Unoki, M., and Akagi, M. (2006). "A Study on Restoration of Bone-Conducted Speech with MTF-Based and LP-based Models," Journal of Signal Processing, 10, 6, 407-417.
  8. Vu, T., Unoki, M., and Akagi, M. (2006). "A study on an LP-based model for restoring bone-conducted speech," Proc. HUT-ICCE2006, Hanoi.
  9. Vu, T. T., Unoki, M., and Akagi, M. (2006). "A study on an LPC-based restoration model for improving the voice-quality of bone-conducted speech," Proc. NCSP2006, 110-113.
  10. Kimura, K., Unoki, M. and Akagi, M. (2005). “A study on a bone-conducted speech restoration method with the modulation filterbank,” Proc. NCSP05, Hawaii, 411-414.
2-5 Speech Recognition
  1. Lu, X., Unoki, M., and Akagi, M. (2008/11/1). “Comparative evaluation of modulation-transfer-function-based blind restoration of sub-band power envelopes of speech as a front-end processor for automatic speech recognition systems,” Acoustical Science and Technology, 29, 6, 351-361.
  2. Lu, X., Unoki, M., and Akagi, M. (2008/07/01). "An MTF-based blind restoration for temporal power envelopes as a front-end processor for automatic speech recognition systems in reverberant environments," Acoustics2008, Paris, 1419-1424.
  3. Haniu, A., Unoki, M., and Akagi, M. (2008/3/6). "A speech recognition method based on the selective sound segregation in various noisy environments," Proc. NCSP08, 168-171.
  4. Haniu, A., Unoki, M. and Akagi, M. (2007). "A study on a speech recognition method based on the selective sound segregation in various noisy environments," Proc. NOLTA2007, Vancouver, 445-448.
  5. Haniu, A., Unoki, M. and Akagi, M. (2007). "A study on a speech recognition method based on the selective sound segregation in noisy environment," Proc. JCA2007, CD-ROM.
  6. Lu, X., Unoki, M., and Akagi, M. (2006). "A robust feature extraction based on the MTF concept for speech recognition in reverberant environment," Proc. ICSLP2006, Pittsburgh, USA, 2546-2549.
  7. Lu, X., Unoki, M., and Akagi, M. (2006). "MTF-based sub-band power envelope restoration in reverberant environment for robust speech recognition," Proc. NCSP2006, 162-165.
  8. Haniu, A., Unoki, M. and Akagi, M. (2005). “A study on a speech recognition method based on the selective sound segregation in noisy environment,” Proc. NCSP05, Hawaii, 403-406.
2-6 DOA

  • Nothing in English

3. Cocktail-party Effect Modeling

3-1 Sound Segregation
  1. Unoki, M., Kubo, M., Haniu, A., and Akagi, M. (2006). "A Model-Concept of the Selective Sound Segregation: — A Prototype Model for Selective Segregation of Target Instrument Sound from the Mixed Sound of Various Instruments —," Journal of Signal Processing, 10, 6, 419-431.
  2. Unoki, M., Kubo, M., Haniu, A., and Akagi, M. (2005). "A model for selective segregation of a target instrument sound from the mixed sound of various instruments," Proc. EuroSpeech2005, Lisbon, Portugal, 2097-2100.
  3. Unoki, M., Kubo, M., and Akagi, M. (2003). “A model for selective segregation of a target instrument sound from the mixed sound of various instruments,” Proc. ICMC2003, Singapore, 295-298.
  4. Akagi, M., Mizumachi, M., Ishimoto, Y., and Unoki, M. (2002). "Speech enhancement and segregation based on human auditory mechanisms", in Enabling Society with Information Technology, Q. Jin, J. Li, N. Zhang, J. Cheng, C. Yu, and S. Noguchi (Eds.), Springer Tokyo, 186-196
  5. Akagi, M., Mizumachi, M., Ishimoto, Y., and Unoki, M. (2000). "Speech enhancement and segregation based on human auditory mechanisms", Proc. IS2000, Aizu, 246-253.
  6. Unoki, M. and Akagi, M. (1999). "Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis", Proc. EUROSPEECH99, 2575-2578.
  7. Unoki, M. and Akagi, M. (1999). "Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis", Proc. CASA99, IJCAI-99, Stockholm, 51-60.
  8. Akagi, M., Iwaki, M. and Minakawa, T. (1998). “Fundamental frequency fluctuation in continuous vowel utterance and its perception,” ICSLP98, Sydney, Vol.4, 1519-1522.
  9. Akagi, M., Iwaki, M. and Sakaguchi, N. (1998). “Spectral sequence compensation based on continuity of spectral sequence,” Proc. ICSLP98, Sydney, Vol.4, 1407-1410.
  10. Unoki, M. and Akagi, M. (1998). “Signal extraction from noisy signal based on auditory scene analysis,” ICSLP98, Sydney, Vol.5, 2115-2118.
  11. Unoki, M. and Akagi, M. (1998). “A method of signal extraction from noisy signal based on auditory scene analysis,” Speech Communication, 27, 3-4, 261-279.
  12. Unoki, M. and Akagi, M. (1998). “A method of signal extraction from noisy signal based on auditory scene analysis,” JAIST Tech. Report, IS-RR-98-0005P.
  13. Unoki, M. and Akagi, M. (1997). "A method of signal extraction from noisy signal", Proc. EUROSPEECH97, 2587-2590.
  14. Unoki, M. and Akagi, M. (1997). "A method of signal extraction from noisy signal based on auditory scene analysis", Proc. CASA97, IJCAI-97, Nagoya, 93-102.
  15. Unoki, M. and Akagi, M. (1997). “A method for signal extraction from noise-added signals”, Electronics and Communications in Japan, Part 3, 80, 11, 1-11.
3-2 Privacy Protection
  1. Tezuka, T. and Akagi, M. (2008/3/6). "Influence of spectrum envelope on phoneme perception," Proc. NCSP08, 176-179.
  2. Minowa A., Unoki M., and Akagi M. (2007). "A study on physical conditions for auditory segregation/integration of speech signals based on auditory scene analysis," Proc. NCSP2007, 313-316.
3-3 Noisy Sound Perception
  1. Kuroda, N., Li, J., Iwaya, Y., Unoki, M., and Akagi, M. (2009/03/01). "Effects from Spatial Cues on Detectability of Alarm Signals in Car Environments," Proc. NCSP'09, 45-48.
  2. Kusaba, M., Unoki, M., and Akagi, M. (2008/3/6). "A study on detectability of target signal in background noise by utilizing similarity of temporal envelopes in auditory search," Proc. NCSP08, 13-16.
  3. Uchiyama, H., Unoki, M., and Akagi, M. (2007). "Improvement in detectability of alarm signals in noisy environments by utilizing spatial cues," Proc. WASPAA2007, New Paltz, NY, pp.74-77.
  4. Uchiyama H., Unoki M., and Akagi M. (2007). "A study on perception of alarm signal in car environments," Proc. NCSP2007, 389-392.
  5. Nakanishi, J., Unoki, M., and Akagi, M. (2006). "Effect of ITD and component frequencies on perception of alarm signals in noisy environments," Journal of Signal Processing, 10, 4, 231-234.
  6. Nakanishi, J., Unoki, M., and Akagi, M. (2006). "Effect of ITD and component frequencies on perception of alarm signals in noisy environments," Proc. NCSP2006, 37-40.

4. Psychoacoustics

4-1 Auditory Model
  1. Unoki, M. and Akagi, M. (2001). "A computational model of co-modulation masking release," in Computational Models of Auditory Function, (Eds. Greenberg, S. and Slaney, M.), NATO ASI Series, IOS Press, Amsterdam, 221-232.
  2. Unoki, M. and Akagi, M. (1998). “A computational model of co-modulation masking release,” Computational Hearing, Italy, 129-134.
  3. Unoki, M. and Akagi, M. (1998). “A computational model of co-modulation masking release,” JAIST Tech. Report, IS-RR-98-0006P.
4-2 Contextual Effect
  1. Yonezawa, Y. and Akagi, M. (1996). "Modeling of contextual effects and its application to word spotting", Proc. Int. Conf. Spoken Lang. Process. 96, 2063-2066.
  2. Akagi, M., van Wieringen, A. and Pols, L. C. W. (1994). "Perception of central vowel with pre- and post-anchors", Proc. Int. Conf. Spoken Lang. Process. 94, 503-506.
4-3 Auditory Filter

  • Nothing in English

4-4 Phase Perception
  1. Akagi, M., and Nishizawa, M. (2001). "Detectability of phase change and its computational modeling," J. Acoust. Soc. Am., 110, 5, Pt. 2, 2680.
4-5 Vowel Perception

  • Nothing in English

4-6 Noise Evaluation
  1. Akagi, M., Kakehi, M., Kawaguchi, M., Nishinuma, M., and Ishigami, A. (2001). "Noisiness estimation of machine working noise using human auditory model", Proc. Internoise2001, 2451-2454.
  2. Mizumachi, M. and Akagi, M. (2000). "The auditory-oriented spectral distortion for evaluating speech signals distorted by additive noises," J. Acoust. Soc. Jpn. (E), 21, 5, 251-258.
  3. Mizumachi, M. and Akagi, M. (1999). "An objective distortion estimator for hearing aids and its application to noise reduction," Proc. EUROSPEECH99, 2619-2622.

5. Physiological Auditory Modeling

  1. Ito, K. and Akagi, M. (2005). "Study on improving regularity of neural phase locking in single neurons of AVCN via a computational model," In Auditory Signal Processing, Springer, 91-99.
  2. Maki, K. and Akagi, M. (2005). "A computational model of cochlear nucleus neurons," In Auditory Signal Processing, Springer, 84-90.
  3. Ito, K. and Akagi, M. (2003). “Study on improving regularity of neural phase locking in single neuron of AVCN via computational model,” Proc. ISH2003, 77-83.
  4. Maki, K. and Akagi, M. (2003). “A computational model of cochlear nucleus neurons,” Proc. ISH2003, 70-76.
  5. Itoh, K. and Akagi, M. (2001). “A computational model of auditory sound localization,” in Computational Models of Auditory Function (Eds. Greenberg, S. and Slaney, M.), NATO ASI Series, IOS Press, Amsterdam, 97-111.
  6. Ito, K. and Akagi, M. (2000). "A computational model of binaural coincidence detection using impulses based on synchronization index," Proc. ISA2000 (BIS2000), Wollongong, Australia.
  7. Maki, K., Akagi, M. and Hirota, K. (2000). "Effect of the basilar membrane nonlinearities on rate-place representation of vowel in the cochlear nucleus: A modeling approach," In Recent Developments in Auditory Mechanics, World Scientific Publishing, 490-496.
  8. Ito, K. and Akagi, M. (2000). "A computational model of auditory sound localization based on ITD," In Recent Developments in Auditory Mechanics, World Scientific Publishing, 483-489.
  9. Ito, K. and Akagi, M. (2000). "A study on temporal information based on the synchronization index using a computational model," Proc. WESTPRAC7, 263-266.
  10. Maki, K., Akagi, M. and Hirota, K. (1999). "Effect of the basilar membrane nonlinearities on rate-place representation of vowel in the cochlear nucleus: A modeling approach," Abstracts of Symposium on Recent Developments in Auditory Mechanics, Sendai, Japan, 29P06, 166-167.
  11. Ito, K. and Akagi, M. (1999). "A computational model of auditory sound localization based on ITD," Abstracts of Symposium on Recent Developments in Auditory Mechanics, Sendai, Japan, 29P01, 156-157.
  12. Maki, K., Hirota, K. and Akagi, M. (1998). “A functional model of the auditory peripheral system: Responses to simple and complex stimuli,” Computational Hearing, Italy, 13-18.
  13. Itoh, K. and Akagi, M. (1998). “A computational model of auditory sound localization,” Computational Hearing, Italy, 67-72
  14. Maki, K. and Akagi, M. (1997). "A functional model of the auditory peripheral system", Proc. ASVA97, Tokyo, 703-710.

6. Abnormal Speech

6-1 Abnormal Speech Perception
  1. Kozaki-Yamaguchi, Y., Suzuki, N., Fujita, Y., Yoshimasu, H., Akagi, M., and Amagasa, T. (2005). "Perception of hypernasality and its physical correlates," Oral Science International, 2, 1, 21-35.
  2. Kozaki, Y., Suzuki, N., Amagasa, T., and Akagi, M. (2004). “Perception of hypernasality and its physical correlates,” Proc. ICA2004, 3313-3316.
  3. Akagi, M., Suzuki, N., Hayashi, K., Saito, H., and Michi, K. (2001). "Perception of Lateral Misarticulation and Its Physical Correlates", Folia Phoniatrica et Logopaedica, 53, 6, 291-307
  4. Akagi, M., Kitamura, T., Suzuki, N. and Michi, K. (1996). "Perception of lateral misarticulation and its physical correlates", Proc. ASA-ASJ Joint Meeting, 933-936.
6-2 3D Vocal Tract Modeling
  1. Nishimoto, H. and Akagi, M. (2006). "Effects of complicated vocal tract shapes on vocal tract transfer functions," Journal of Signal Processing, 10, 4, 267-270.
  2. Nishimoto, H. and Akagi, M. (2006). "Effects of complicated vocal tract shapes on vocal tract transfer functions," Proc. NCSP2006, 114-117.
  3. Nishimoto, H., Akagi, M., Kitamura, T. and Suzuki, N. (2004). “Estimation of transfer function of vocal tract extracted from MRI data by FEM,” Proc. ICA2004, 1473-1476.
  4. Nishimoto, H., Akagi, M., Kitamura, T., Suzuki, N., and Saito, H. (2001). "FEM analysis of three-dimensional vocal tract models after tongue and mouth floor resection," J. Acoust. Soc. Am., 110, 5, Pt. 2, 2761.
  5. Nishimoto, H., Akagi, M., Kitamura, T., and Suzuki, N. (2002). "FEM analyses of three dimensional vocal tract models after tongue and mouth floor resection," NATO Advanced Study Institute 2002 Dynamics of Speech Production and Perception.

7. Interaction between Perception and Production

  1. Akagi, M., Dang, J., Lu, X., and Uchiyamada, T. (2006). "Investigation of interaction between speech perception and production using auditory feedback," J. Acoust. Soc. Am., 120, 5, Pt. 2, 3253.
  2. Dang, J., Akagi, M., and Honda, K. (2006). "Communication between speech production and perception within the brain - Observation and simulation," J. Comp. Sci. & Tech., 21, 1, 95-105.
  3. Matsuoka, R., Lu, X., Dang, J., and Akagi, M. (2004). “Investigation of interaction between speech perception and speech production,” Proc. KIT Int. Sympo. Brain and Language 2004, 27-28.

8. Others

8-1 NTT & ATR
  1. Akagi, M. (1993). "Modeling of contextual effects based on spectral peak interaction", J. of Acoust. Society of America, 93, 2, 1076-1086.
  2. Akagi, M. (1992). "Psychoacoustic evidence for contextual effect models", Speech Perception, Production and Linguistic Structure, IOS Press, Amsterdam, 63-78
  3. Akagi, M. (1990). "Contextual effect models and psychoacoustic evidence for the models", Proc. Int. Conf. Spoken Lang. Process. 90, 569-572.
  4. Akagi, M. and Tohkura, Y. (1990). "Spectrum target prediction model and its application to speech recognition", Computer Speech and Language, 4, Academic Press 325-344.
  5. Akagi, M. (1990). "Psychoacoustic evidence for a contextual effect model", J. Acoust. Soc. Am., Spl 1, 87, MMM2 (119th Meeting of ASA).
  6. Akagi, M. (1990). "Evaluation of a spectrum target prediction model in speech perception", J. of Acoust. Society of America, 87, 2, 858-865.
  7. Ueda, K. and Akagi, M. (1990). "Sharpness and amplitude envelopes of broadband noise", J. of Acoust. Society of America, 87, 2, 814-819.
  8. Akagi, M. (1989). "Modeling of contextual effect based on spectral peak interaction", J. Acoust. Soc. Am., Spl 1, 85, II8 (117th Meeting of ASA).
  9. Akagi, M. and Tohkura, Y. (1988). "On the application of spectrum target prediction model to speech recognition", Proc. Int. Conf. Acoustics Speech and Signal Process., New York, 139-142.
  10. Akagi, M. (1987). "Evaluation of a spectrum target prediction model in speech perception", J. Acoust. Soc. Am., Spl 1, 81, G8 (113th Meeting of ASA).
  11. Furui, S. and Akagi, M. (1985). "On the role of spectral transition in phoneme perception and its modeling", Proc. 12th Int. Conf. Acoustics, A2-6.
8-2 Tokyo Institute of Technology
  1. Akagi, M., and Iijima, T. (1984). "A construction of pole-deviation tracking filter," Electronics and Communications in Japan, 67-A, 5, 28-36.
  2. Akagi, M., and Iijima, T. (1982). "Speech Recognition by polarized linear predictive error coding –POLPEC method," Electronics and Communications in Japan, 65-A, 8, 9-18.


   All Rights Reserved, Copyright© Masato AKAGI, 1998-2009