Scale-invariant MFCCs for speech/speaker recognition
Date
2019
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Tubitak Scientific & Technological Research Council Turkey
Access Rights
info:eu-repo/semantics/openAccess
Abstract
Feature extraction is a fundamental part of speech processing. Mel-frequency cepstral coefficients (MFCCs) are the most commonly used features in the speech/speaker recognition literature. However, the MFCC framework may face numerical issues or dynamic range problems, which degrade its performance. A practical solution to these problems is to add a constant to the filter-bank magnitudes before log compression, but this violates the scale-invariance property. In this work, a magnitude normalization and a multiplication constant are introduced to make the MFCCs scale-invariant and to avoid dynamic range expansion of nonspeech frames. Speaker verification experiments are conducted to show the effectiveness of the proposed scheme.
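The scale-invariance issue described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic filter-bank magnitudes, the function names, the constant `c = 1.0`, and the choice of normalizing by the utterance-level mean magnitude are all assumptions made for illustration, and the paper's multiplication constant is not reproduced here.

```python
import numpy as np

def dct2_ortho(x):
    """Orthonormal type-II DCT along the last axis, written with NumPy only."""
    n = x.shape[-1]
    k = np.arange(n)[:, None]          # cepstral index
    m = np.arange(n)[None, :]          # filter index
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)
    basis[1:] *= np.sqrt(2.0 / n)
    return x @ basis.T

def cepstra_plain(fbank):
    # Standard log compression: a gain only adds a constant, which lands in c0.
    return dct2_ortho(np.log(fbank))

def cepstra_offset(fbank, c=1.0):
    # Adding a constant before the log avoids log(0) but breaks scale invariance.
    return dct2_ortho(np.log(fbank + c))

def cepstra_normalized(fbank, c=1.0):
    # Assumed normalization (hypothetical): divide by the mean magnitude of the
    # whole utterance before adding the constant, so any gain cancels out.
    return dct2_ortho(np.log(fbank / fbank.mean() + c))

# Synthetic filter-bank magnitudes: 50 frames x 24 filters (illustrative only).
rng = np.random.default_rng(0)
fbank = rng.uniform(0.1, 10.0, size=(50, 24))
gain = 100.0  # the same signal recorded at a different level

plain_ok = np.allclose(cepstra_plain(gain * fbank)[:, 1:],
                       cepstra_plain(fbank)[:, 1:])
offset_ok = np.allclose(cepstra_offset(gain * fbank)[:, 1:],
                        cepstra_offset(fbank)[:, 1:])
norm_ok = np.allclose(cepstra_normalized(gain * fbank),
                      cepstra_normalized(fbank))

print("plain log, c1..c23 unchanged by gain:", plain_ok)
print("log(x + c), c1..c23 unchanged by gain:", offset_ok)
print("normalized log(x/mean + c), all coefficients unchanged:", norm_ok)
```

Under these assumptions, the plain logarithm confines a gain change to the zeroth coefficient, the added constant makes the higher coefficients depend on the recording level, and normalizing the magnitudes first removes that dependence while keeping the constant.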
Description
Keywords
Feature extraction, speaker recognition, speech recognition
Source
Turkish Journal of Electrical Engineering and Computer Sciences
WoS Q Value
Q4
Scopus Q Value
Q2
Volume
27
Issue
5