Scale-invariant MFCCs for speech/speaker recognition

Tufekci, Zekeriya; Disken, Gokay

dc.contributor.author	Tufekci, Zekeriya
dc.contributor.author	Disken, Gokay
dc.date.accessioned	2025-01-06T17:37:50Z
dc.date.available	2025-01-06T17:37:50Z
dc.date.issued	2019
dc.description.abstract	The feature extraction process is a fundamental part of speech processing. Mel frequency cepstral coefficients (MFCCs) are the most commonly used feature type in the speech/speaker recognition literature. However, the MFCC framework may face numerical issues or dynamic range problems, which degrade its performance. A practical solution to these problems is to add a constant to the filter-bank magnitudes before log compression, but this violates the scale-invariance property. In this work, a magnitude normalization and a multiplication constant are introduced to make the MFCCs scale-invariant and to avoid dynamic range expansion of nonspeech frames. Speaker verification experiments are conducted to show the effectiveness of the proposed scheme.
dc.identifier.doi	10.3906/elk-1901-231
dc.identifier.endpage	3762
dc.identifier.issn	1300-0632
dc.identifier.issn	1303-6203
dc.identifier.issue	5
dc.identifier.scopus	2-s2.0-85072597726
dc.identifier.scopusquality	Q2
dc.identifier.startpage	3758
dc.identifier.trdizinid	337504
dc.identifier.uri	https://doi.org/10.3906/elk-1901-231
dc.identifier.uri	https://search.trdizin.gov.tr/tr/yayin/detay/337504
dc.identifier.uri	https://hdl.handle.net/20.500.14669/2388
dc.identifier.volume	27
dc.identifier.wos	WOS:000486425400034
dc.identifier.wosquality	Q4
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.indekslendigikaynak	TR-Dizin
dc.language.iso	en
dc.publisher	Tubitak Scientific & Technological Research Council Turkey
dc.relation.ispartof	Turkish Journal of Electrical Engineering and Computer Sciences
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/openAccess
dc.snmz	KA_20241211
dc.subject	Feature extraction
dc.subject	speaker recognition
dc.subject	speech recognition
dc.title	Scale-invariant MFCCs for speech/speaker recognition
dc.type	Article
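
The scale-invariance issue described in the abstract can be illustrated with a minimal sketch. With a plain logarithm, scaling the signal shifts every log filter-bank value by the same constant, which affects only the zeroth cepstral coefficient; once a constant is added before the logarithm, the shift differs from filter to filter. The normalization below (dividing by the utterance peak, then multiplying by a fixed `gain`) is an assumption made for illustration and is not taken from the paper; the function names, the `offset` and `gain` values, and the random stand-in magnitudes are likewise hypothetical.

```python
import numpy as np


def log_compress(fbank, offset=1.0):
    """Log compression with an additive constant (the common practical fix)."""
    return np.log(fbank + offset)


def normalized_log_compress(fbank, offset=1.0, gain=1000.0):
    """Hypothetical variant: normalize by the utterance peak, apply a fixed
    gain, then add the constant. This is an illustrative assumption, not the
    formula from the paper."""
    peak = fbank.max()
    return np.log(gain * fbank / peak + offset)


# Stand-in filter-bank magnitudes: 50 frames x 24 filters (random, not real speech).
rng = np.random.default_rng(0)
fbank = rng.uniform(0.1, 10.0, size=(50, 24))
scaled = 8.0 * fbank  # the same utterance recorded at a higher gain

# Plain log: every value shifts by the same constant, log(8).
plain_diff = np.log(scaled) - np.log(fbank)

# Log with additive constant: the shift now depends on the magnitude itself.
offset_diff = log_compress(scaled) - log_compress(fbank)

# Normalized variant: the scale factor cancels before the constant is added.
norm_diff = normalized_log_compress(scaled) - normalized_log_compress(fbank)

print("plain log shift is constant:", np.allclose(plain_diff, np.log(8.0)))
print("spread of shift with additive constant:", offset_diff.std())
print("largest shift after normalization:", np.abs(norm_diff).max())
```

The multiplicative `gain` here only sets where the normalized magnitudes sit relative to the additive constant; how the paper chooses its constant, and over which span it normalizes, is not stated in this record.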

Collections

WoS-Indexed Publications Collection
Scopus-Indexed Publications Collection
TR-Dizin-Indexed Publications Collection
