Project Description - Robust Speech Recognition

Robust Speech Recognition by Noise Immunization

Developed a robust speech recognition system using a neural network. With a multi-layer perceptron (MLP) neural network as a robust classifier and a modified backpropagation training algorithm, noise immunity is achieved for SNR levels down to 5 dB while maintaining high recognition accuracy. The noise immunization technique and the correlation of performance with the order in which data is presented during network training are studied.

A word spotting system is developed to recognize the keyword 'collect' corrupted by white Gaussian noise in continuous speech.
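The noise corruption described above can be illustrated with a minimal sketch that adds white Gaussian noise to a signal at a chosen SNR. The function names, the 440 Hz test tone, and the 8 kHz sampling rate are assumptions made for illustration; they do not come from the project itself.

```python
import math
import random


def add_white_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` corrupted by white Gaussian noise at `snr_db`."""
    rng = rng or random.Random()
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10*log10(Ps/Pn), so Pn = Ps / 10^(SNR/10).
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR of `noisy` relative to `clean`, in dB."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for a speech segment: 1 s of a 440 Hz tone at 8 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noisy = add_white_noise(clean, snr_db=5.0, rng=random.Random(0))
print(round(measured_snr_db(clean, noisy), 1))  # close to the 5 dB target
```

Noisy copies produced this way at several SNR levels could be mixed into a training set, which is one plausible reading of how a network is exposed to noise during training.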

Robust Recognition for Mobile Communication Applications

Developed robust speech recognition techniques for voice-activated dialing on a cellular phone in a car. Noise reduction techniques, including linear and nonlinear spectral subtraction, are applied before the speech parameters are modeled in the homomorphic domain. The FFT-derived cepstral coefficients are liftered and then Mel-scale warped to generate the feature vector. A radial basis function (RBF) neural network is used as the final classifier, and its real-time performance is evaluated. Performance is evaluated for both speaker-dependent and speaker-independent recognition across an SNR range of -5 dB to 25 dB with different front-end processing. The system is evaluated using the NOISEX-92 noise database and the TIDIGITS speech database.
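As a rough illustration of the spectral subtraction step, the sketch below operates on the magnitude spectrum of a single frame. The linear form subtracts the noise estimate directly. The nonlinear form shown here is a simplified assumption in which the subtraction factor shrinks as the per-bin SNR grows. The function names, the spectral floor `beta`, and `alpha_max` are hypothetical and are not taken from the project.

```python
def nonlinear_alpha(y, n, alpha_max=3.0):
    """Hypothetical SNR-dependent factor: subtract more in noise-dominated bins."""
    snr = y / n if n > 0 else float("inf")
    # alpha falls from alpha_max toward 1.0 as the bin SNR increases.
    return 1.0 + (alpha_max - 1.0) / (1.0 + snr)


def spectral_subtract(noisy_mag, noise_mag, beta=0.01, nonlinear=False):
    """Magnitude spectral subtraction for one frame.

    noisy_mag: magnitude per FFT bin of the noisy frame
    noise_mag: estimated noise magnitude per bin (e.g. averaged over silence)
    beta:      spectral floor as a fraction of the noisy magnitude (assumed value)
    """
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        alpha = nonlinear_alpha(y, n) if nonlinear else 1.0
        # Floor the result so that no bin becomes negative.
        cleaned.append(max(y - alpha * n, beta * y))
    return cleaned


# Made-up magnitudes for four bins, for illustration only.
noisy_frame = [1.0, 0.5, 2.0, 0.2]
noise_estimate = [0.2, 0.4, 0.1, 0.3]
linear_out = spectral_subtract(noisy_frame, noise_estimate)
nonlinear_out = spectral_subtract(noisy_frame, noise_estimate, nonlinear=True)
print(linear_out)
print(nonlinear_out)
```

In a full front end, the cleaned magnitudes would then feed the cepstral analysis; that stage is omitted here.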

Speech Processing and Recognition Algorithms

Developed algorithms for pitch detection, endpoint detection, feature extraction, and several other processes involved in speech processing and recognition.
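The pitch detection algorithms themselves are not described here. As a generic illustration only, the sketch below estimates pitch from the peak of the short-time autocorrelation, a common textbook approach that may differ from the methods developed in this project. The search range of 60 to 400 Hz, the 50 ms frame, and the 200 Hz test tone are assumptions.

```python
import math


def estimate_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the pitch (Hz) of a voiced frame from its autocorrelation peak."""
    # Convert the assumed pitch range into a range of candidate lags.
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Hypothetical voiced frame: 50 ms of a 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(400)]
pitch = estimate_pitch(frame, sample_rate)
print(pitch)
```

Real speech would normally need a voicing decision and some smoothing across frames before the estimate is usable.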

For Details See the Relevant Publications

R. Sankar and S. Patravali, Noise Immunization Using Neural Net for Speech Recognition, Proc. IEEE International Conf. on Acoustics, Speech and Signal Processing (ICASSP), Adelaide, Australia, April 1994, Vol. II, pp. 685-688.

S. Patravali, Robust Speech Recognition Using Neural Networks, M.S. Thesis, August 1995.

R. Sankar and S. Patravali, Robust Speech Recognition by Noise Immunization Using Neural Network, Proc. Artificial Neural Networks in Engineering (ANNIE '93), St. Louis, MO, November 1993.

H. Ruan and R. Sankar, Applying Neural Network to Robust Keyword Spotting in Speech Recognition Application, International Conference on Neural Networks (ICNN), Perth, Australia, November 1995, pp. 2882-2886.

H. Ruan, Applying Neural Network to Robust Keyword Spotting in Speech Recognition Application, M.S. Project, April 1995.

R. Sankar and N. Sethi, Robust Speech Recognition Techniques Using a Radial Basis Function Neural Network for Mobile Applications, IEEE Southeastcon '97, Blacksburg, VA, April 1997, pp. 87-91.

N. Sethi, Evaluation of Noise Reduction Techniques and RBF Neural Network for Robust Speech Recognition in Mobile Applications, M.S. Thesis, December 1996.

S. Varada and R. Sankar, Hardware Strategies for End Point Detection, Proc. IEEE Southcon, Ft. Lauderdale, FL, March 1995, pp. 163-167, (also accepted for The Fifth International Conf. on Signal Processing Applications & Technology - ICSPAT, Dallas, TX, October 1994).

V. K. Sundaresan, S. Nichani, N. Ranganathan, and R. Sankar, A VLSI Hardware Accelerator for Dynamic Time Warping, Proc. 11th IAPR International Conf. on Pattern Recognition, The Hague, The Netherlands, August/September 1992, Vol. IV, pp. 27-30.

V. K. Sundaresan, Software and Hardware Solutions for Dynamic Time Warping Algorithm, M.S. Thesis, December 1991.

R. Sankar, Implementation of an Experimental Speaker-Independent Discrete Utterance Recognition System, Proc. IEEE International Conf. on Signal Processing, Beijing, China, October 1990, pp. 445-448.

M. E. Thompson, PC-Based Isolated Word Recognition System, M.S. Thesis, July 1990.

R. F. Lorenzoni, Software Implementation of an LPC Based Isolated Word Recognition System, M.S. Thesis, July 1989.

R. Sankar, A Pitch Extraction Algorithm for Voice Recognition Applications, Proc. 20th Southeastern Symposium on System Theory, Charlotte, NC, March 1988, pp. 384-387.

R. Sankar, Experimental Evaluation of Structural Features for a Speaker-Independent Voice Recognition System, Proc. 20th Southeastern Symposium on System Theory, Charlotte, NC, March 1988, pp. 378-382.