Instantaneous Frequency Features for Noise Robust Speech Recognition

Nayak, Shekhar and Shashank, Dhar B. and Kodukula, Sri Rama Murty et al. (2019) Instantaneous Frequency Features for Noise Robust Speech Recognition. In: 25th National Conference on Communications (NCC), 20-23 February 2019, Bangalore, India.

Full text not available from this repository.


The analytic phase of the speech signal plays an important role in human speech perception, especially in the presence of noise. Nevertheless, most recent speech recognition systems ignore phase information. In this paper, we illustrate the importance of the analytic phase of the speech signal for noise-robust automatic speech recognition. To avoid the phase-wrapping problem involved in computing the analytic phase, features are extracted from the instantaneous frequency (IF), which is the time derivative of the analytic phase. Deep neural network (DNN) based acoustic models are trained on clean speech using features extracted from the IF of speech signals. The robustness of IF features in combination with mel-frequency cepstral coefficients (MFCCs) was evaluated in varied noisy conditions. System combination of IF features with MFCCs using minimum Bayes risk decoding delivered absolute improvements of up to 13% over MFCC features alone for DNN-based systems under noisy conditions. The impact of combining magnitude-based and phase-based features on different phonetic classes was also studied under noisy conditions, and the combined system was found to model both voiced and unvoiced phonetic classes efficiently.
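
As an illustration of the definition used in the abstract (IF as the time derivative of the analytic phase), the following is a minimal sketch of one common way to estimate IF without unwrapping the phase. It is not the authors' feature-extraction pipeline, which the abstract does not describe. The function names, the FFT-based analytic signal, the 16 kHz sampling rate, and the 1 kHz test tone are all assumptions made for this example.

```python
import numpy as np

def analytic_signal(x):
    """Return the analytic signal of a real 1-D sequence via the FFT.

    Hypothetical helper for illustration: negative-frequency bins are
    zeroed and positive-frequency bins are doubled.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep the DC bin unchanged
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist bin unchanged
        gain[1:n // 2] = 2.0           # double positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def instantaneous_frequency(x, fs):
    """Estimate instantaneous frequency in Hz, one value per sample step.

    The phase increment is taken as the angle of z[n+1] * conj(z[n]),
    which always lies in (-pi, pi], so the phase itself is never
    computed or unwrapped.
    """
    z = analytic_signal(x)
    dphi = np.angle(z[1:] * np.conj(z[:-1]))
    return dphi * fs / (2.0 * np.pi)

# Assumed test input: a 1 kHz cosine sampled at 16 kHz for 0.1 s.
fs = 16000
t = np.arange(1600) / fs
tone = np.cos(2 * np.pi * 1000 * t)
if_hz = instantaneous_frequency(tone, fs)
```

For a pure tone the estimate should sit at the tone frequency across the whole signal. Real speech would need band-limiting or frame-wise processing before IF values become usable as features, and those steps are outside the scope of this sketch.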

IITH Creators: Kodukula, Sri Rama Murty
Item Type: Conference or Workshop Item (Paper)
Subjects: Electrical Engineering
Divisions: Department of Electrical Engineering
Depositing User: Team Library
Date Deposited: 08 Jul 2019 09:21
Last Modified: 08 Jul 2019 09:21
