PAPERS (31/03/2009)
1. Non-linguistic Information 
1-1 Singing Voice
1-2 Speaker Individuality
1-3 Emotional Speech
1-4 Voice Conversion
1-5 Speech Analysis and Coding
2. Noise Reduction 
2-1 Microphone Array
2-2 F0 Extraction
2-3 De-reverberation
2-4 Bone-conducted Speech
2-5 Speech Recognition
2-6 DOA
3. Cocktail-party Effect Modeling 
3-1 Sound Segregation
3-2 Privacy Protection
3-3 Noisy Sound Perception
4. Psychoacoustics 
4-1 Auditory Model
4-2 Contextual Effect
4-3 Auditory Filter
4-4 Phase Perception
4-5 Vowel Perception
4-6 Noise Evaluation
5. Physiological Auditory Modeling 
6. Abnormal Speech 
6-1 Abnormal Speech Perception
6-2 3D Vocal Tract Modeling
7. Interaction between Perception and Production 
8. Others 
8-1 NTT & ATR
8-2 Tokyo Institute of Technology

If you are interested in my research topics and would like to read my publications,
please visit the JAIST Repository, where my papers are available for download.

1. Non-linguistic Information
- Akagi, M. (2009/02/20). "Introduction of SCOPE project: Analysis of
production and perception characteristics of non-linguistic information
in speech and its application to inter-language communications," International
symposium on biomechanical and physiological modeling and speech science,
51-62.
1-1 Singing Voice
- Nakamura, T., Kitamura, T. and Akagi, M. (2009/03/01). "A study on
nonlinguistic feature in singing and speaking voices by brain activity
measurement," Proc. NCSP'09, 217-220.
- Saitou, T., Goto, M., Unoki, M., and Akagi, M. (2007). "Speech-to-singing
synthesis: converting speaking voices to singing voices by controlling
acoustic features unique to singing voices," Proc. WASPAA2007, New
Paltz, NY, 215-218.
- Saitou, T., Goto, M., Unoki, M., and Akagi, M. (2007). "Vocal conversion
from speaking voice to singing voice using STRAIGHT," Proc. Interspeech2007,
Singing Challenge.
- Saitou, T., Unoki, M., and Akagi, M. (2006). "Analysis of acoustic
features affecting singing-voice perception and its application to singing-voice
synthesis from speaking-voice using STRAIGHT," J. Acoust. Soc. Am.,
120, 5, Pt. 2, 3029.
- Saitou, T., Unoki, M. and Akagi, M. (2005). "Development of an F0
control model based on F0 dynamic characteristics for singing-voice synthesis,"
Speech Communication 46, 405-417.
- Saitou, T., Tsuji, N., Unoki, M. and Akagi, M. (2004). “Analysis of acoustic
features affecting “singing-ness” and its application to singing-voice
synthesis from speaking-voice,” Proc. ICSLP2004, Cheju, Korea.
- Saitou, T., Unoki, M., and Akagi, M. (2004). “Control methods of acoustic
parameters for singing-voice synthesis,” Proc. ICA2004, 501-504.
- Saitou, T., Unoki, M., and Akagi, M. (2004). “Development of the F0 control
method for singing-voices synthesis,” Proc. SP2004, Nara, 491-494.
- Akagi, M. (2002). "Perception of fundamental frequency fluctuation,"
HEA-02-003-IP, Forum Acusticum Sevilla 2002 (Invited).
- Saitou, T., Unoki, M., and Akagi, M. (2002). "Extraction of F0 dynamic
characteristics and development of F0 control model in singing voice,"
Proc. ICAD2002, Kyoto.
- Unoki, M., Saitou, T., and Akagi, M. (2002). "Effect of F0 fluctuations
and development of F0 control model in singing voice perception,"
NATO Advanced Study Institute 2002 Dynamics of Speech Production and Perception.
- Akagi, M. and Kitakaze, H. (2000). "Perception of synthesized singing
voices with fine fluctuations in their fundamental frequency contours,"
Proc. ICSLP2000, Beijing, III-458-461.
1-2 Speaker Individuality
- Akagi, M. and Ienaga, T. (1997). "Speaker individuality in fundamental
frequency contours and its control", J. Acoust. Soc. Jpn. (E), 18,
2, 73-80.
- Kitamura, T. and Akagi, M. (1996). "Relationship between physical
characteristics and speaker individualities in speech spectral envelopes",
Proc. ASA-ASJ Joint Meeting, 833-838.
- Akagi, M. and Ienaga, T. (1995). "Speaker individualities in fundamental
frequency contours and its control", Proc. EUROSPEECH95, 439-442.
- Kitamura, T. and Akagi, M. (1995). "Speaker individualities in speech
spectral envelopes", J. Acoust. Soc. Jpn. (E), 16, 5, 283-289.
- Kitamura, T. and Akagi, M. (1994). "Speaker Individualities in speech
spectral envelopes", Proc. Int. Conf. Spoken Lang. Process. 94, 1183-1186.
1-3 Emotional Speech
- Aoki, Y., Huang, C-F., and Akagi, M. (2009/03/01). "An emotional speech
recognition system based on multi-layer emotional speech perception model,"
Proc. NCSP'09, 133-136.
- Huang, C. F. and Akagi, M. (2008/10). "A three-layered model for expressive
speech perception," Speech Communication 50, 810-828.
- Huang, C. F., Erickson, D., and Akagi, M. (2008/07/01). "Comparison
of Japanese expressive speech perception by Japanese and Taiwanese listeners,"
Acoustics2008, Paris, 2317-2322.
- Huang, C. F. and Akagi, M. (2007). "A rule-based speech morphing for
verifying an expressive speech perception model," Proc. Interspeech2007,
2661-2664.
- Sawamura, K., Dang, J., Akagi, M., Erickson, D., Li, A., Sakuraba, K., Minematsu,
N., and Hirose, K. (2007). "Common factors in emotion perception among
different cultures," Proc. ICPhS2007, 2113-2116.
- Huang, C. F. and Akagi, M. (2007). "The building and verification
of a three-layered model for expressive speech perception," Proc.
JCA2007, CD-ROM.
- Huang, C. F. and Akagi, M. (2005). "Toward a rule-based synthesis
of emotional speech on linguistic description of perception," Affective
Computing and Intelligent Interaction, Springer LNCS 3784, 366-373.
- Huang, C. F. and Akagi, M. (2005). "A Multi-Layer fuzzy logical model
for emotional speech Perception," Proc. EuroSpeech2005, Lisbon, Portugal,
417-420.
- Ito, S., Dang, J., and Akagi, M. (2004). “Investigation of the acoustic
features of emotional speech using physiological articulatory model,” Proc.
ICA2004, 2225-2226.
1-4 Voice Conversion
- Nguyen, B. P. and Akagi, M. (2009/02/20). "Applications of Temporal
Decomposition to Voice Transformation," International symposium on
biomechanical and physiological modeling and speech science, 19-24.
- Nguyen, B. P., Shibata, T., and Akagi, M. (2008/09/24). "High-quality
analysis/synthesis method based on Temporal decomposition for speech modification,"
Proc. InterSpeech2008, Brisbane, 662-665.
- Nguyen, B. P. and Akagi, M. (2008/6/6). "Phoneme-based spectral voice
conversion using temporal decomposition and Gaussian mixture model,"
Proc. ICCE2008, 224-229.
- Nguyen, B. P. and Akagi, M. (2008/3/7). "Control of spectral dynamics
using temporal decomposition in voice conversion and concatenative speech
synthesis," Proc. NCSP08, 279-282.
- Shibata, T. and Akagi, M. (2008/3/6). "A study on voice conversion
method for synthesizing stimuli to perform gender perception experiments
of speech," Proc. NCSP08, 180-183.
- Nguyen, B. P. and Akagi, M. (2007). "A flexible spectral modification
method based on temporal decomposition and Gaussian mixture model,"
Proc. Interspeech2007, 538-541.
- Nguyen, B. P. and Akagi, M. (2007). "Spectral Modification for Voice
Gender Conversion using Temporal Decomposition," Journal of Signal
Processing, 11, 4, 333-336.
- Akagi, M., Saitou, T., and Huang, C. F. (2007). "Voice conversion
to add non-linguistic information into speaking voices," Proc. JCA2007,
CD-ROM.
- Nguyen, B. P. and Akagi, M. (2007). "Spectral Modification for Voice
Gender Conversion using Temporal Decomposition," Proc. NCSP2007, 481-484.
- Takeyama, Y., Unoki, M., Akagi, M., and Kaminuma, A. (2006). "Synthesis
of mimic speech sounds uttered in noisy car environments," Proc. NCSP2006,
118-121.
1-5 Speech Analysis and Coding
- Tomoike, S. and Akagi, M. (2008). "Estimation of local peaks based
on particle filter in adverse environments," Journal of Signal Processing,
12, 4, 303-306.
- Tomoike, S. and Akagi, M. (2008/3/7). "Estimation of local peaks based
on particle filter in adverse environments," Proc. NCSP08, 391-394.
- Nguyen, P. C., Akagi, M., and Nguyen, P. B. (2007). "Limited error
based event localizing temporal decomposition and its application to variable-rate
speech coding," Speech Communication, 49, 292-304.
- Akagi, M., Nguyen, P. C., Saitou, T., Tsuji, N., and Unoki, M. (2004).
“Temporal decomposition of speech and its application to speech coding
and modification,” Proc. KEST2004, 280-288.
- Akagi, M. and Nguyen, P. C. (2004). “Temporal decomposition of speech and
its application to speech coding and modification,” Proc. Special Workshop
in MAUI (SWIM), 1-4.
- Nguyen, P. C. and Akagi, M. (2003). “Efficient quantization of speech excitation
parameters using temporal decomposition,” Proc. EUROSPEECH2003, Geneva,
449-452.
- Nguyen, P. C., Akagi, M., and Ho, T. B. (2003). "Temporal decomposition:
A promising approach to VQ-based speaker identification," Proc. ICME2003,
Baltimore, V.III, 617-620.
- Nguyen, P. C., Akagi, M., and Ho, T. B. (2003). "Temporal decomposition:
A promising approach to VQ-based speaker identification," Proc. ICASSP2003,
Hong Kong, I-184-187.
- Nguyen, P. C., Ochi, T., and Akagi, M. (2003). “Modified Restricted Temporal
Decomposition and its Application of Low Rate Speech Coding,” IEICE Trans.
Inf. & Syst., E86-D, 3, 397-405.
- Nguyen, P. C. and Akagi, M. (2002). "Variable rate speech coding using
STRAIGHT and temporal decomposition," Proc. SCW2002, Tsukuba, 26-28.
- Nguyen, P. C. and Akagi, M. (2002). "Coding speech at very low rates
using STRAIGHT and temporal decomposition," Proc. ICSLP2002, Denver,
1849-1852.
- Nguyen, P. C. and Akagi, M. (2002). "Limited error based event localizing
temporal decomposition," Proc. EUSIPCO2002, Toulouse, 190.
- Nguyen, P. C. and Akagi, M. (2002). "Improvement of the restricted
temporal decomposition method for line spectral frequency parameters,"
Proc. ICASSP2002, Orlando, I-265-268.
- Nandasena, A. C. R., Nguyen, P. C. and Akagi, M. (2001). "Spectral
stability based event localizing temporal decomposition", Computer
Speech & Language, 15, 4, 381-401.
- Nandasena, A.C.R. and Akagi, M. (1998). “Spectral stability based event
localizing temporal decomposition,” Proc. ICASSP98, II, 957-960.
2. Noise Reduction
2-1 Microphone Array
- Li, J., Jiang, H., and Akagi, M. (2008/09/23). "Psychoacoustically-motivated
adaptive β-order generalized spectral subtraction based on data-driven
optimization," Proc. InterSpeech2008, Brisbane, 171-174.
- Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2008). “Adaptive
β-order generalized spectral subtraction for speech enhancement,” Signal
Processing, 88, 11, 2764-2776.
- Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2008/08/16).
"Improved two-stage binaural speech enhancement based on accurate
interference estimation for hearing aids," IHCON2008.
- Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2008/06/30).
"A two-stage binaural speech enhancement approach for hearing aids
with preserving binaural benefits in noisy environments," Acoustics2008,
Paris, 723-727.
- Li, J., Akagi, M., and Suzuki, Y. (2008). "A two-microphone noise
reduction method in highly non-stationary multiple-noise-source environments,"
IEICE Trans. Fundamentals, E91-A, 6, 1337-1346.
- Li, J. and Akagi, M. (2008). "A hybrid microphone array post-filter
in a diffuse noise field," Applied Acoustics 69, 546-557.
- Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2007). "A
speech enhancement approach for binaural hearing aids," Proc. 22nd
SIP Symposium, Sendai, 263-268.
- Li, J., Sakamoto, S., Hongo, S., Akagi, M., and Suzuki, Y. (2007). "Noise
reduction based on adaptive beta-order generalized spectral subtraction
for speech enhancement," Proc. Interspeech2007, 802-805.
- Li, J., Akagi, M., and Suzuki, Y. (2006). "Multi-channel noise reduction
in noisy environments," Chinese Spoken Language Processing, Proc.
ISCSLP2006, Springer LNCS 4274, 258-269.
- Li, J., Akagi, M., and Suzuki, Y. (2006). "Noise reduction based on
microphone array and post-filtering for robust speech recognition,"
Proc. ICSP, Guilin.
- Li, J. and Akagi, M. (2006). "Noise reduction method based on generalized
subtractive beamformer," Acoust. Sci. & Tech., 27, 4, 206-215.
- Li, J., Akagi, M., and Suzuki, Y. (2006). "Improved hybrid microphone
array post-filter by integrating a robust speech absence probability estimator
for speech enhancement," Proc. ICSLP2006, Pittsburgh, USA, 2130-2133.
- Li, J. and Akagi, M. (2006). "A noise reduction system based on hybrid
noise estimation technique and post-filtering in arbitrary noise environments,"
Speech Communication, 48, 111-126.
- Li, J., Akagi, M., and Suzuki, Y. (2006). "Noise reduction based on
generalized subtractive beamformer for speech enhancement," WESPAC2006,
Seoul.
- Li, J. and Akagi, M. (2005). "Theoretical analysis of microphone arrays
with postfiltering for coherent and incoherent noise suppression in noisy
environments," Proc. IWAENC2005, Eindhoven, The Netherlands, 85-88.
- Li, J. and Akagi, M. (2005). "A hybrid microphone array post-filter
in a diffuse noise field," Proc. EuroSpeech2005, Lisbon, Portugal,
2313-2316.
- Li, J., Lu, X., and Akagi, M. (2005). "Noise reduction based on microphone
array and post-filtering for robust speech recognition in car environments,"
Proc. Workshop DSPinCar2005, S2-9.
- Li, J., Lu, X., and Akagi, M. (2005). “A noise reduction system in arbitrary
noise environments and its application to speech enhancement and speech
recognition,” Proc. ICASSP2005, Philadelphia, III-277-280.
- Li, J. and Akagi, M. (2005). “Suppressing localized and non-localized noises
in arbitrary noise environments,” Proc. HSCMA2005, Piscataway.
- Li, J. and Akagi, M. (2004). “Noise reduction using hybrid noise estimation
technique and post-filtering,” Proc. ICSLP2004, Cheju, Korea.
- Akagi, M. and Kago, T. (2002). "Noise reduction using a small-scale
microphone array in multi noise source environment," Proc. ICASSP2002,
Orlando, I-909-912.
- Mizumachi, M., Akagi, M. and Nakamura, S. (2000). "Design of robust
subtractive beamformer for noisy speech recognition," Proc. ICSLP2000,
Beijing, IV-57-60.
- Mizumachi, M. and Akagi, M. (2000). "Noise reduction using a small-scale
microphone array under non-stationary signal conditions," Proc. WESTPRAC7,
421-424.
- Mizumachi, M. and Akagi, M. (1999). "Noise reduction method that is
equipped for robust direction finder in adverse environments," Proc.
Workshop on Robust Methods for Speech Recognition in Adverse Conditions,
Tampere, Finland, 179-182.
- Mizumachi, M. and Akagi, M. (1998). “Noise reduction by paired-microphones
using spectral subtraction,” Proc. ICASSP98, II, 1001-1004.
- Akagi, M. and Mizumachi, M. (1997). "Noise Reduction by Paired Microphones",
Proc. EUROSPEECH97, 335-338.
2-2 F0 Extraction
- Ishimoto, Y., Akagi, M., Ishizuka, K., and Aikawa, K. (2004). “Fundamental
frequency estimation for noisy speech using entropy-weighted periodic and
harmonic features,” IEICE Trans. Inf. & Syst., E87-D, 1, 205-214.
- Ishimoto, Y., Unoki, M., and Akagi, M. (2001). "A fundamental frequency
estimation method for noisy speech based on instantaneous amplitude and
frequency", Proc. EUROSPEECH2001, Aalborg, 2439-2442.
- Ishimoto, Y., Unoki, M., and Akagi, M. (2001). "A fundamental frequency
estimation method for noisy speech based on instantaneous amplitude and
frequency", Proc. CRAC, Aalborg.
- Ishimoto, Y., Unoki, M., and Akagi, M. (2001). "A fundamental frequency
estimation method for noisy speech based on periodicity and harmonicity",
Proc. ICASSP2001, SPEECH-SF3, Salt Lake City.
- Ishimoto, Y. and Akagi, M. (2000). "A fundamental frequency estimation
method for noisy speech," Proc. WESTPRAC7, 161-164.
2-3 De-reverberation
- Petric, R., Lu, X., Unoki, M., Akagi, M., and Hoffmann, R. (2008/09/24).
"Robust front end processing for speech recognition in reverberant
environments: Utilization of speech characteristics," Proc. InterSpeech2008,
Brisbane, 658-661.
- Unoki, M., Toi, M., Shibano, Y., and Akagi, M. (2006). "Suppression
of speech intelligibility loss through a modulation transfer function-based
speech dereverberation method," J. Acoust. Soc. Am., 120, 5, Pt. 2,
3360.
- Unoki, M., Toi, M., and Akagi, M. (2006). "Refinement of an MTF-based
speech dereverberation method using an optimal inverse-MTF filter,"
SPECOM2006, St. Petersburg, 323-326.
- Unoki, M., Toi, M., and Akagi, M. (2005). “Development of the MTF-based
speech dereverberation method using adaptive time-frequency division,”
Proc. Forum Acusticum 2005, 51-56.
- Toi, M., Unoki, M. and Akagi, M. (2005). “Development of adaptive time-frequency
divisions and a carrier reconstruction in the MTF-based speech dereverberation
method,” Proc. NCSP05, Hawaii, 355-358.
- Unoki, M., Sakata, K., Furukawa, M. and Akagi, M. (2004). “A speech dereverberation
method based on the MTF concept in power envelope restoration,” Acoust.
Sci. & Tech., 25, 4, 243-254.
- Unoki, M., Furukawa, M., Sakata, K. and Akagi, M. (2004). “An improved
method based on the MTF concept for restoring the power envelope from a
reverberant signal,” Acoust. Sci. & Tech., 25, 4, 232-242.
- Unoki, M., Toi, M., and Akagi, M. (2004). “A speech dereverberation method
based on the MTF concept using adaptive time-frequency divisions,” Proc.
EUSIPCO2004, 1689-1692.
- Unoki, M., Sakata, K., Toi, M., and Akagi, M. (2004). “Speech dereverberation
based on the concept of the modulation transfer function,” Proc. NCSP2004,
Hawaii, 423-426.
- Unoki, M., Sakata, K. and Akagi, M. (2003). “A speech dereverberation method
based on the MTF concept,” Proc. EUROSPEECH2003, Geneva, 1417-1420.
- Unoki, M., Furukawa, M., Sakata, K., and Akagi, M. (2003). "A method
based on the MTF concept for dereverberating the power envelope from the
reverberant signal," Proc. ICASSP2003, Hong Kong, I-840-843.
- Unoki, M., Furukawa, M., and Akagi, M. (2002). "A method for recovering
the power envelope from reverberant speech," SPA-Gen-002, Forum Acusticum
Sevilla 2002.
2-4 Bone-conducted Speech
- Kinugasa, K., Unoki, M., and Akagi, M. (2009/03/01). "An MTF-based
Blind Restoration Method for Improving Intelligibility of Bone-conducted
Speech," Proc. NCSP'09, 105-108.
- Vu, T. T., Unoki, M. and Akagi, M. (2008/6/5). "An LP-based blind model
for restoring bone-conducted speech," Proc. ICCE2008, 212-217.
- Vu, T. T., Unoki, M., and Akagi, M. (2008/3/7). "A study of blind
model for restoring bone-conducted speech based on linear prediction scheme,"
Proc. NCSP08, 287-290.
- Vu, T. T., Unoki, M. and Akagi, M. (2007). “The Construction of Large-scale
Bone-conducted and Air-conducted Speech Databases for Speech Intelligibility
Tests,” Proc. Oriental COCOSDA2007, 88-91.
- Vu, T. T., Unoki, M., and Akagi, M. (2007). "A blind restoration model
for bone-conducted speech based on a linear prediction scheme," Proc.
NOLTA2007, Vancouver, 449-452.
- Vu, T. T., Seide, G., Unoki, M., and Akagi, M. (2007). "Method of
LP-based blind restoration for improving intelligibility of bone-conducted
speech," Proc. Interspeech2007, 966-969.
- Vu, T., Unoki, M., and Akagi, M. (2006). "A Study on Restoration of
Bone-Conducted Speech with MTF-Based and LP-based Models," Journal
of Signal Processing, 10, 6, 407-417.
- Vu, T., Unoki, M., and Akagi, M. (2006). "A study on an LP-based model
for restoring bone-conducted speech," Proc. HUT-ICCE2006, Hanoi.
- Vu, T. T., Unoki, M., and Akagi, M. (2006). "A study on an LPC-based
restoration model for improving the voice-quality of bone-conducted speech,"
Proc. NCSP2006, 110-113.
- Kimura, K., Unoki, M. and Akagi, M. (2005). “A study on a bone-conducted
speech restoration method with the modulation filterbank,” Proc. NCSP05,
Hawaii, 411-414.
2-5 Speech Recognition
- Lu, X., Unoki, M., and Akagi, M. (2008/11/1). “Comparative evaluation of
modulation-transfer-function-based blind restoration of sub-band power
envelopes of speech as a front-end processor for automatic speech recognition
systems,” Acoustical Science and Technology, 29, 6, 351-361.
- Lu, X., Unoki, M., and Akagi, M. (2008/07/01). "An MTF-based blind
restoration for temporal power envelopes as a front-end processor for automatic
speech recognition systems in reverberant environments," Acoustics2008,
Paris, 1419-1424.
- Haniu, A., Unoki, M., and Akagi, M. (2008/3/6). "A speech recognition
method based on the selective sound segregation in various noisy environments,"
Proc. NCSP08, 168-171.
- Haniu, A., Unoki, M. and Akagi, M. (2007). "A study on a speech recognition
method based on the selective sound segregation in various noisy environments,"
Proc. NOLTA2007, Vancouver, 445-448.
- Haniu, A., Unoki, M. and Akagi, M. (2007). "A study on a speech recognition
method based on the selective sound segregation in noisy environment,"
Proc. JCA2007, CD-ROM.
- Lu, X., Unoki, M., and Akagi, M. (2006). "A robust feature extraction
based on the MTF concept for speech recognition in reverberant environment,"
Proc. ICSLP2006, Pittsburgh, USA, 2546-2549.
- Lu, X., Unoki, M., and Akagi, M. (2006). "MTF-based sub-band power
envelope restoration in reverberant environment for robust speech recognition,"
Proc. NCSP2006, 162-165.
- Haniu, A., Unoki, M. and Akagi, M. (2005). “A study on a speech recognition
method based on the selective sound segregation in noisy environment,”
Proc. NCSP05, Hawaii, 403-406.
2-6 DOA
3. Cocktail-party Effect Modeling
3-1 Sound Segregation
- Unoki, M., Kubo, M., Haniu, A., and Akagi, M. (2006). "A Model-Concept
of the Selective Sound Segregation: — A Prototype Model for Selective Segregation
of Target Instrument Sound from the Mixed Sound of Various Instruments
—," Journal of Signal Processing, 10, 6, 419-431.
- Unoki, M., Kubo, M., Haniu, A., and Akagi, M. (2005). "A model for
selective segregation of a target instrument sound from the mixed sound
of various instruments," Proc. EuroSpeech2005, Lisbon, Portugal, 2097-2100.
- Unoki, M., Kubo, M., and Akagi, M. (2003). “A model for selective segregation
of a target instrument sound from the mixed sound of various instruments,”
Proc. ICMC2003, Singapore, 295-298.
- Akagi, M., Mizumachi, M., Ishimoto, Y., and Unoki, M. (2002). "Speech
enhancement and segregation based on human auditory mechanisms", in
Enabling Society with Information Technology, Q. Jin, J. Li, N. Zhang,
J. Cheng, C. Yu, and S. Noguchi (Eds.), Springer Tokyo, 186-196.
- Akagi, M., Mizumachi, M., Ishimoto, Y., and Unoki, M. (2000). "Speech
enhancement and segregation based on human auditory mechanisms", Proc.
IS2000, Aizu, 246-253.
- Unoki, M. and Akagi, M. (1999). "Segregation of vowel in background
noise using the model of segregating two acoustic sources based on auditory
scene analysis", Proc. EUROSPEECH99, 2575-2578.
- Unoki, M. and Akagi, M. (1999). "Segregation of vowel in background
noise using the model of segregating two acoustic sources based on auditory
scene analysis", Proc. CASA99, IJCAI-99, Stockholm, 51-60.
- Akagi, M., Iwaki, M. and Minakawa, T. (1998). “Fundamental frequency fluctuation
in continuous vowel utterance and its perception,” ICSLP98, Sydney, Vol.4,
1519-1522.
- Akagi, M., Iwaki, M. and Sakaguchi, N. (1998). “Spectral sequence compensation
based on continuity of spectral sequence,” Proc. ICSLP98, Sydney, Vol.4,
1407-1410.
- Unoki, M. and Akagi, M. (1998). “Signal extraction from noisy signal based
on auditory scene analysis,” ICSLP98, Sydney, Vol.5, 2115-2118.
- Unoki, M. and Akagi, M. (1998). “A method of signal extraction from noisy
signal based on auditory scene analysis,” Speech Communication, 27, 3-4,
261-279.
- Unoki, M. and Akagi, M. (1998). “A method of signal extraction from noisy
signal based on auditory scene analysis,” JAIST Tech. Report, IS-RR-98-0005P.
- Unoki, M. and Akagi, M. (1997). "A method of signal extraction from
noisy signal", Proc. EUROSPEECH97, 2587-2590.
- Unoki, M. and Akagi, M. (1997). "A method of signal extraction from
noisy signal based on auditory scene analysis", Proc. CASA97, IJCAI-97,
Nagoya, 93-102.
- Unoki, M. and Akagi, M. (1997). “A method for signal extraction from noise-added
signals”, Electronics and Communications in Japan, Part 3, 80, 11, 1-11.
3-2 Privacy Protection
- Tezuka, T. and Akagi, M. (2008/3/6). "Influence of spectrum envelope
on phoneme perception," Proc. NCSP08, 176-179.
- Minowa, A., Unoki, M., and Akagi, M. (2007). "A study on physical conditions
for auditory segregation/integration of speech signals based on auditory
scene analysis," Proc. NCSP2007, 313-316.
3-3 Noisy Sound Perception
- Kuroda, N., Li, J., Iwaya, Y., Unoki, M., and Akagi, M. (2009/03/01). "Effects
from Spatial Cues on Detectability of Alarm Signals in Car Environments,"
Proc. NCSP'09, 45-48.
- Kusaba, M., Unoki, M., and Akagi, M. (2008/3/6). "A study on detectability
of target signal in background noise by utilizing similarity of temporal
envelopes in auditory search," Proc. NCSP08, 13-16.
- Uchiyama, H., Unoki, M., and Akagi, M. (2007). "Improvement in detectability
of alarm signals in noisy environments by utilizing spatial cues,"
Proc. WASPAA2007, New Paltz, NY, 74-77.
- Uchiyama, H., Unoki, M., and Akagi, M. (2007). "A study on perception
of alarm signal in car environments," Proc. NCSP2007, 389-392.
- Nakanishi, J., Unoki, M., and Akagi, M. (2006). "Effect of ITD and
component frequencies on perception of alarm signals in noisy environments,"
Journal of Signal Processing, 10, 4, 231-234.
- Nakanishi, J., Unoki, M., and Akagi, M. (2006). "Effect of ITD and
component frequencies on perception of alarm signals in noisy environments,"
Proc. NCSP2006, 37-40.
4. Psychoacoustics
4-1 Auditory Model
- Unoki, M. and Akagi, M. (2001). "A computational model of co-modulation masking release," in Computational Models of Auditory Function, (Eds. Greenberg, S. and Slaney, M.), NATO ASI Series, IOS Press, Amsterdam, 221-232.
- Unoki, M. and Akagi, M. (1998). “A computational model of co-modulation masking release,” Computational Hearing, Italy, 129-134.
- Unoki, M. and Akagi, M. (1998). “A computational model of co-modulation
masking release,” JAIST Tech. Report, IS-RR-98-0006P.
4-2 Contextual Effect
- Yonezawa, Y. and Akagi, M. (1996). "Modeling of contextual effects and its application to word spotting", Proc. Int. Conf. Spoken Lang. Process. 96, 2063-2066.
- Akagi, M., van Wieringen, A. and Pols, L. C. W. (1994). "Perception
of central vowel with pre- and post-anchors", Proc. Int. Conf. Spoken
Lang. Process. 94, 503-506.
4-3 Auditory Filter
4-4 Phase Perception
- Akagi, M., and Nishizawa, M. (2001). "Detectability of phase change
and its computational modeling," J. Acoust. Soc. Am., 110, 5, Pt.
2, 2680.
4-5 Vowel Perception
4-6 Noise Evaluation
- Akagi, M., Kakehi, M., Kawaguchi, M., Nishinuma, M., and Ishigami, A. (2001).
"Noisiness estimation of machine working noise using human auditory
model", Proc. Internoise2001, 2451-2454.
- Mizumachi, M. and Akagi, M. (2000). "The auditory-oriented spectral
distortion for evaluating speech signals distorted by additive noises,"
J. Acoust. Soc. Jpn. (E), 21, 5, 251-258.
- Mizumachi, M. and Akagi, M. (1999). "An objective distortion estimator
for hearing aids and its application to noise reduction," Proc. EUROSPEECH99,
2619-2622.
5. Physiological Auditory Modeling
- Ito, K. and Akagi, M. (2005). "Study on improving regularity of neural
phase locking in single neurons of AVCN via a computational model,"
In Auditory Signal Processing, Springer, 91-99.
- Maki, K. and Akagi, M. (2005). "A computational model of cochlear
nucleus neurons," In Auditory Signal Processing, Springer, 84-90.
- Ito, K. and Akagi, M. (2003). “Study on improving regularity of neural
phase locking in single neuron of AVCN via computational model,” Proc.
ISH2003, 77-83.
- Maki, K. and Akagi, M. (2003). “A computational model of cochlear nucleus
neurons,” Proc. ISH2003, 70-76.
- Itoh, K. and Akagi, M. (2001). “A computational model of auditory sound
localization,” in Computational Models of Auditory Function (Eds. Greenberg,
S. and Slaney, M.), NATO ASI Series, IOS Press, Amsterdam, 97-111.
- Ito, K. and Akagi, M. (2000). "A computational model of binaural coincidence
detection using impulses based on synchronization index," Proc. ISA2000
(BIS2000), Wollongong, Australia.
- Maki, K., Akagi, M. and Hirota, K. (2000). "Effect of the basilar
membrane nonlinearities on rate-place representation of vowel in the cochlear
nucleus: A modeling approach," In Recent Developments in Auditory
Mechanics, World Scientific Publishing, 490-496.
- Ito, K. and Akagi, M. (2000). "A computational model of auditory sound
localization based on ITD," In Recent Developments in Auditory Mechanics,
World Scientific Publishing, 483-489.
- Ito, K. and Akagi, M. (2000). "A study on temporal information based
on the synchronization index using a computational model," Proc. WESTPRAC7,
263-266.
- Maki, K., Akagi, M. and Hirota, K. (1999). "Effect of the basilar
membrane nonlinearities on rate-place representation of vowel in the cochlear
nucleus: A modeling approach," Abstracts of Symposium on Recent Developments
in Auditory Mechanics, Sendai, Japan, 29P06, 166-167.
- Ito, K. and Akagi, M. (1999). "A computational model of auditory sound
localization based on ITD," Abstracts of Symposium on Recent Developments
in Auditory Mechanics, Sendai, Japan, 29P01, 156-157.
- Maki, K., Hirota, K. and Akagi, M. (1998). “A functional model of the auditory
peripheral system: Responses to simple and complex stimuli,” Computational
Hearing, Italy, 13-18.
- Itoh, K. and Akagi, M. (1998). “A computational model of auditory sound
localization,” Computational Hearing, Italy, 67-72.
- Maki, K. and Akagi, M. (1997). "A functional model of the auditory
peripheral system", Proc. ASVA97, Tokyo, 703-710.
6. Abnormal Speech
6-1 Abnormal Speech Perception
- Kozaki-Yamaguchi, Y., Suzuki, N., Fujita, Y., Yoshimasu, H., Akagi, M.,
and Amagasa, T. (2005). "Perception of hypernasality and its physical
correlates," Oral Science International, 2, 1, 21-35.
- Kozaki, Y., Suzuki, N., Amagasa, T., and Akagi, M. (2004). “Perception
of hypernasality and its physical correlates,” Proc. ICA2004. 3313-3316.
- Akagi, M., Suzuki, N., Hayashi, K., Saito, H., and Michi, K. (2001). "Perception
of Lateral Misarticulation and Its Physical Correlates",
Folia Phoniatrica et Logopaedica, 53, 6, 291-307.
- Akagi, M., Kitamura, T., Suzuki, N. and Michi, K. (1996). "Perception
of lateral misarticulation and its physical correlates", Proc. ASA-ASJ
Joint Meeting, 933-936.
6-2 3D Vocal Tract Modeling
- Nishimoto, H. and Akagi, M. (2006). "Effects of complicated vocal
tract shapes on vocal tract transfer functions," Journal of Signal
Processing, 10, 4, 267-270.
- Nishimoto, H. and Akagi, M. (2006). "Effects of complicated vocal
tract shapes on vocal tract transfer functions," Proc. NCSP2006, 114-117.
- Nishimoto, H., Akagi, M., Kitamura, T. and Suzuki, N. (2004). “Estimation
of transfer function of vocal tract extracted from MRI data by FEM,” Proc.
ICA2004, 1473-1476.
- Nishimoto, H., Akagi, M., Kitamura, T., Suzuki, N., and Saito, H. (2001).
"FEM analysis of three-dimensional vocal tract models after tongue
and mouth floor resection," J. Acoust. Soc. Am., 110, 5, Pt. 2, 2761.
- Nishimoto, H., Akagi, M., Kitamura, T., and Suzuki, N. (2002). "FEM
analyses of three dimensional vocal tract models after tongue and mouth
floor resection," NATO Advanced Study Institute 2002 Dynamics of Speech
Production and Perception.
7. Interaction between Perception and Production
- Akagi, M., Dang, J., Lu, X., and Uchiyamada, T. (2006). "Investigation
of interaction between speech perception and production using auditory
feedback," J. Acoust. Soc. Am., 120, 5, Pt. 2, 3253.
- Dang, J., Akagi, M., and Honda, K. (2006). "Communication between
speech production and perception within the brain - Observation and simulation,"
J. Comp. Sci. & Tech., 21, 1, 95-105.
- Matsuoka, R., Lu, X., Dang, J., and Akagi, M. (2004). “Investigation of
interaction between speech perception and speech production,” Proc. KIT
Int. Sympo. Brain and Language 2004, 27-28.
8. Others
8-1 NTT & ATR
- Akagi, M. (1993). "Modeling of contextual effects based on spectral
peak interaction", J. of Acoust. Society of America, 93, 2, 1076-1086.
- Akagi, M. (1992). "Psychoacoustic evidence for contextual effect models",
Speech Perception, Production and Linguistic Structure, IOS Press, Amsterdam,
63-78.
- Akagi, M. (1990). "Contextual effect models and psychoacoustic evidence
for the models", Proc. Int. Conf. Spoken Lang. Process. 90, 569-572.
- Akagi, M. and Tohkura, Y. (1990). "Spectrum target prediction model
and its application to speech recognition", Computer Speech and Language,
4, Academic Press, 325-344.
- Akagi, M. (1990). "Psychoacoustic evidence for a contextual effect
model", J. Acoust. Soc. Am., Spl 1, 87, MMM2 (119th Meeting of ASA).
- Akagi, M. (1990). "Evaluation of a spectrum target prediction model
in speech perception", J. of Acoust. Society of America, 87, 2, 858-865.
- Ueda, K. and Akagi, M. (1990). "Sharpness and amplitude envelopes
of broadband noise", J. of Acoust. Society of America, 87, 2, 814-819.
- Akagi, M. (1989). "Modeling of contextual effect based on spectral
peak interaction", J. Acoust. Soc. Am., Spl 1, 85, II8 (117th Meeting
of ASA).
- Akagi, M. and Tohkura, Y. (1988). "On the application of spectrum
target prediction model to speech recognition", Proc. Int. Conf. Acoustics
Speech and Signal Process., New York, 139-142.
- Akagi, M. (1987). "Evaluation of a spectrum target prediction model
in speech perception", J. Acoust. Soc. Am., Spl 1, 81, G8 (113th Meeting
of ASA).
- Furui, S. and Akagi, M. (1985). "On the role of spectral transition
in phoneme perception and its modeling", Proc. 12th Int. Conf. Acoustics,
A2-6.
8-2 Tokyo Institute of Technology
- Akagi, M., and Iijima, T. (1984). "A construction of pole-deviation
tracking filter," Electronics and Communications in Japan, 67-A, 5,
28-36.
- Akagi, M., and Iijima, T. (1982). "Speech Recognition by polarized
linear predictive error coding –POLPEC method," Electronics and Communications
in Japan, 65-A, 8, 9-18.