Automatic Non-native Dialect and Accent Voice Detection of South Indian English
Speech recognition has improved enormously in recent years. However, robustness remains a major problem: recognition performance varies sharply from speaker to speaker, particularly when the speaker has a strong accent that is not covered in the training corpus. Speaker variability, such as gender, accent, age, speaking rate, and phone realization, is a central problem in speech recognition. South Indian accent identification is a recent and challenging problem, closely related to other relatively recent areas of multilingual speech research such as non-native speech identification and language identification. This paper describes an automatic recognition system for English accents from five different South Indian states. The approach is based on a corresponding set of random networks with context-independent HMM units. The random topology was additionally replaced by pronunciation transcription constraints in order to integrate accent-specific automatic word recognizers.
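The idea of one HMM network per accent can be illustrated with a minimal sketch. It assumes, since the text above does not state the decision rule, that an utterance is scored against every accent model and the accent with the highest likelihood is selected. The models here are toy discrete-observation HMMs with fully connected transitions, and all names (`log_likelihood`, `identify_accent`, `MODELS`, `accent_A`, `accent_B`) and all probabilities are hypothetical illustrations, not the system's actual models or parameters.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in the log domain for a discrete-observation HMM.

    obs   : list of observation symbol indices
    start : start[s] initial probability of state s
    trans : trans[p][s] probability of moving from state p to state s
    emit  : emit[s][o] probability that state s emits symbol o
    All probabilities are assumed to be strictly positive.
    """
    n = len(start)
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + math.log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

def identify_accent(obs, models):
    """Score obs under every accent model and return (best accent, all scores)."""
    scores = {name: log_likelihood(obs, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy models: two states, two symbols, uniform transitions.
UNIFORM = [[0.5, 0.5], [0.5, 0.5]]
MODELS = {
    "accent_A": ([0.5, 0.5], UNIFORM, [[0.9, 0.1], [0.8, 0.2]]),  # favors symbol 0
    "accent_B": ([0.5, 0.5], UNIFORM, [[0.1, 0.9], [0.2, 0.8]]),  # favors symbol 1
}

if __name__ == "__main__":
    best, scores = identify_accent([0, 0, 1, 0, 0], MODELS)
    print(best, scores)
```

A real system would use continuous acoustic features and one trained model per accent (five in this paper) instead of the two hand-written tables above; only the score-and-compare structure is meant to carry over.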