Evolvement And Recent Research In Parametric Representations Of Speech Features For Automatic Speaker Recognition
Speech feature extraction is the most important step in any automatic speaker recognition system. Over the last 60
years, a great deal of research has gone into the parametric representation of these speech features. Several techniques
have evolved in succession, each intended to overcome the shortcomings of its predecessor. Yet automatic speaker recognition still remains a
challenge, mainly because of variations in the speaker’s vocal tract with time and health, varying environmental conditions,
and variations in the behavior and quality of speech recorders. Although Mel Frequency Cepstral Coefficients (MFCC) have
become a standard for speaker recognition, conventional MFCC performs poorly in the presence of noise. In this paper, the
MFCC technique was used for automatic speaker recognition in a slightly noisy environment. In this experiment, a
vector quantization (VQ) codebook was created by clustering the training features of 9 speakers with the K-means algorithm,
and the resulting data was stored in a speaker database. A distortion measure based on the minimum Euclidean distance was used for
speaker recognition. The failure rate of speaker recognition was found to be 20%. MATLAB 7.10.0 was used for this study.
This paper also presents an overview of the techniques that have been used for parametric representation of speech features
and the modifications made to enhance their capabilities. It also discusses the most recent techniques, the latest
research, and various modifications that have been proposed to make automatic speaker recognition systems more
competitive with other biometric systems.
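
The VQ pipeline summarized above (K-means clustering of training features into a codebook, then recognition by minimum Euclidean distortion) can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used in the study, and it is not the authors' implementation. The function names, the toy 2-D vectors standing in for MFCC frames, the codebook size, the evenly spaced initialization, and the choice of one codebook per speaker are all assumptions made for illustration.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, k, iterations=20):
    """Cluster feature vectors into k centroids with plain K-means.

    Initialization (an assumption of this sketch) takes k evenly spaced
    training vectors so that the result is deterministic.
    """
    if k > len(vectors):
        raise ValueError("codebook size exceeds number of training vectors")
    step = len(vectors) // k
    centroids = [list(vectors[i * step]) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def average_distortion(vectors, codebook):
    """Mean Euclidean distance from each vector to its nearest codeword."""
    total = sum(
        math.sqrt(min(squared_distance(v, c) for c in codebook)) for v in vectors
    )
    return total / len(vectors)


def identify_speaker(vectors, database):
    """Return the enrolled speaker whose codebook gives the lowest distortion."""
    return min(database, key=lambda name: average_distortion(vectors, database[name]))


# Toy 2-D vectors standing in for MFCC frames (hypothetical data).
training = {
    "speaker_a": [(0.0, 0.1), (0.2, 0.0), (1.0, 1.1), (1.2, 0.9)],
    "speaker_b": [(5.0, 5.1), (5.2, 4.9), (6.0, 6.1), (6.2, 5.9)],
}
database = {name: train_codebook(frames, k=2) for name, frames in training.items()}
test_frames = [(5.1, 5.0), (6.1, 6.0)]
print(identify_speaker(test_frames, database))
```

In this sketch the speaker database is simply a dictionary mapping each enrolled speaker to a codebook, and an unknown utterance is assigned to the speaker with the smallest average distortion. Real MFCC frames would have more dimensions and far more vectors per speaker, but the matching logic would stay the same.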