Scale-invariant MFCCs for speech/speaker recognition
dc.contributor.author | Tufekci, Zekeriya | |
dc.contributor.author | Disken, Gokay | |
dc.date.accessioned | 2025-01-06T17:37:50Z | |
dc.date.available | 2025-01-06T17:37:50Z | |
dc.date.issued | 2019 | |
dc.description.abstract | The feature extraction process is a fundamental part of speech processing. Mel frequency cepstral coefficients (MFCCs) are the most commonly used feature type in the speech/speaker recognition literature. However, the MFCC framework may face numerical issues or dynamic range problems, which decrease its performance. A practical solution to these problems is to add a constant to the filter-bank magnitudes before log compression, but doing so violates the scale-invariance property. In this work, a magnitude normalization and a multiplication constant are introduced to make the MFCCs scale-invariant and to avoid dynamic range expansion of nonspeech frames. Speaker verification experiments are conducted to show the effectiveness of the proposed scheme. | |
dc.identifier.doi | 10.3906/elk-1901-231 | |
dc.identifier.endpage | 3762 | |
dc.identifier.issn | 1300-0632 | |
dc.identifier.issn | 1303-6203 | |
dc.identifier.issue | 5 | |
dc.identifier.scopus | 2-s2.0-85072597726 | |
dc.identifier.scopusquality | Q2 | |
dc.identifier.startpage | 3758 | |
dc.identifier.trdizinid | 337504 | |
dc.identifier.uri | https://doi.org/10.3906/elk-1901-231 | |
dc.identifier.uri | https://search.trdizin.gov.tr/tr/yayin/detay/337504 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14669/2388 | |
dc.identifier.volume | 27 | |
dc.identifier.wos | WOS:000486425400034 | |
dc.identifier.wosquality | Q4 | |
dc.indekslendigikaynak | Web of Science | |
dc.indekslendigikaynak | Scopus | |
dc.indekslendigikaynak | TR-Dizin | |
dc.language.iso | en | |
dc.publisher | Tubitak Scientific & Technological Research Council Turkey | |
dc.relation.ispartof | Turkish Journal of Electrical Engineering and Computer Sciences | |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Academic Staff | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.snmz | KA_20241211 | |
dc.subject | Feature extraction | |
dc.subject | speaker recognition | |
dc.subject | speech recognition | |
dc.title | Scale-invariant MFCCs for speech/speaker recognition | |
dc.type | Article |
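
The abstract's central point, that adding a constant before log compression breaks scale invariance while normalizing the magnitudes first restores it, can be illustrated with a small numerical sketch. This is not the authors' formulation: the record does not give their equations, so the peak normalization, the constant `c`, the multiplier `k`, and both function names below are assumptions chosen only for illustration, and the filter-bank magnitudes are synthetic.

```python
import numpy as np

def log_with_constant(mag, c=1.0):
    # Log compression with an additive constant, as described in the abstract.
    return np.log(mag + c)

def log_normalized(mag, c=1.0, k=1000.0):
    # Hypothetical variant: divide by the peak magnitude (assumed normalization),
    # multiply by a constant k (assumed value), then add c before the log.
    peak = np.max(mag)
    return np.log(k * mag / peak + c)

rng = np.random.default_rng(0)
mag = rng.uniform(0.1, 10.0, size=(5, 20))  # synthetic: 5 frames x 20 channels
gain = 8.0                                  # recording-level change applied to the input

# How much each log filter-bank value moves when the input is scaled by `gain`.
shift_const = log_with_constant(gain * mag) - log_with_constant(mag)
shift_norm = log_normalized(gain * mag) - log_normalized(mag)

# A channel-dependent shift changes the cepstral shape; a zero shift does not.
spread_const = float(shift_const.max() - shift_const.min())
spread_norm = float(np.abs(shift_norm).max())
```

In this sketch `spread_const` is clearly nonzero, meaning the gain change distorts the log spectrum unevenly across channels, while `spread_norm` is essentially zero because the gain cancels in the ratio `mag / peak`.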