Scale-invariant MFCCs for speech/speaker recognition

Tufekci, Zekeriya; Disken, Gokay

Date

2019

Authors

Tufekci, Zekeriya

Disken, Gokay

Publisher

Tubitak Scientific & Technological Research Council Turkey

Access Rights

info:eu-repo/semantics/openAccess

Abstract

The feature extraction process is a fundamental part of speech processing. Mel frequency cepstral coefficients (MFCCs) are the most commonly used features in the speech/speaker recognition literature. However, the MFCC framework may face numerical issues or dynamic range problems, which degrade its performance. A practical solution to these problems is to add a constant to the filter-bank magnitudes before log compression, but this violates the scale-invariance property. In this work, a magnitude normalization and a multiplication constant are introduced to make the MFCCs scale-invariant and to avoid dynamic range expansion of nonspeech frames. Speaker verification experiments are conducted to show the effectiveness of the proposed scheme.
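
The scale-invariance issue mentioned in the abstract can be illustrated with a small sketch. It is not the authors' method (the record gives no details of the proposed normalization); it only shows the general property that log(g·x) = log(g) + log(x), so a gain change shifts the zeroth cepstral coefficient alone, whereas adding a constant before the log makes the higher coefficients gain-dependent. The filter-bank values, the gain, the added constant of 1.0, and all function names are assumptions chosen for illustration.

```python
import math


def dct_ii(values):
    """Unnormalized DCT-II of a list of floats."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def cepstrum(filterbank_magnitudes, added_constant=0.0):
    """Log-compress filter-bank outputs (optionally adding a constant), then apply the DCT."""
    log_values = [math.log(m + added_constant) for m in filterbank_magnitudes]
    return dct_ii(log_values)


def max_diff_excluding_c0(a, b):
    """Largest absolute difference between two cepstra, ignoring coefficient 0."""
    return max(abs(x - y) for x, y in zip(a[1:], b[1:]))


# Hypothetical filter-bank magnitudes for one frame (illustrative values only).
magnitudes = [0.5, 1.2, 3.0, 0.8, 0.1, 2.2]
gain = 10.0  # assumed gain, e.g. the same frame recorded at a higher level
scaled = [gain * m for m in magnitudes]

# Plain log compression: only c0 changes under scaling.
plain_diff = max_diff_excluding_c0(cepstrum(magnitudes), cepstrum(scaled))
c0_shift = cepstrum(scaled)[0] - cepstrum(magnitudes)[0]

# log(x + constant): the higher coefficients now depend on the gain.
offset_diff = max_diff_excluding_c0(
    cepstrum(magnitudes, added_constant=1.0),
    cepstrum(scaled, added_constant=1.0),
)

print("plain log, max change in c1..:", plain_diff)
print("plain log, shift in c0:", c0_shift)
print("log(x + 1), max change in c1..:", offset_diff)
```

With plain log compression the change in the coefficients above c0 stays at floating-point noise level, while with the added constant it is clearly nonzero, which is the loss of scale invariance the abstract refers to.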

Keywords

Feature extraction, speaker recognition, speech recognition

Source

Turkish Journal of Electrical Engineering and Computer Sciences

WoS Quartile

Q4

Scopus Quartile

Q2

Volume

27

Issue

5

Links

https://doi.org/10.3906/elk-1901-231
https://search.trdizin.gov.tr/tr/yayin/detay/337504
https://hdl.handle.net/20.500.14669/2388

Collections

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
TR-Dizin Indexed Publications Collection
