DSpace Repository

A robust polynomial regression-based voice activity detector for speaker verification


dc.contributor.author Disken, Gokay
dc.contributor.author Tufekci, Zekeriya
dc.contributor.author Cevik, Ulus
dc.date.accessioned 2019-11-19T06:22:28Z
dc.date.available 2019-11-19T06:22:28Z
dc.date.issued 2017-10
dc.identifier.citation Disken, G., Tufekci, Z., & Cevik, U. (2017). A robust polynomial regression-based voice activity detector for speaker verification. EURASIP Journal on Audio, Speech, and Music Processing. https://doi.org/10.1186/s13636-017-0120-6
dc.identifier.issn 1687-4722
dc.identifier.uri http://openaccess.adanabtu.edu.tr:8080/xmlui/handle/123456789/580
dc.identifier.uri https://doi.org/10.1186/s13636-017-0120-6
dc.description WOS indexed publications collection.
dc.description.abstract Robustness against background noise is a major research area for speech-related applications such as speech recognition and speaker recognition. One of the many solutions to this problem is to detect speech-dominant regions with a voice activity detector (VAD). In this paper, a second-order polynomial regression-based algorithm is proposed that serves a similar function to a VAD for text-independent speaker verification systems. The proposed method aims to separate steady noise/silence regions, steady speech regions, and speech onset/offset regions. The regression is applied independently to each filter band of a mel spectrum, which lets the algorithm fit seamlessly into the conventional extraction process of mel-frequency cepstral coefficients (MFCCs). The k-means algorithm is also applied to estimate the average noise energy in each band for spectral subtraction. A pseudo-SNR-dependent linear thresholding for the final VAD output decision is introduced based on the k-means energy centers. This thresholding considers the speech presence in each band. Conventional VADs usually neglect the deteriorative effects of additive noise in the speech regions. In contrast, the proposed method decides not only whether speech is present but also whether the frame is dominated by speech or by noise. The performance of the proposed algorithm is compared with a continuous noise tracking method and another VAD method in speaker verification experiments, in which five different noise types at five different SNR levels were considered. The proposed algorithm showed superior verification performance with both the conventional GMM-UBM method and the state-of-the-art i-vector method.
dc.language.iso en
dc.publisher EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING / SPRINGER INTERNATIONAL PUBLISHING AG
dc.relation.ispartofseries 2017;23
dc.subject Polynomial regression
dc.subject Robust speaker recognition
dc.subject Voice activity detection
dc.subject AUTOMATIC SPEECH RECOGNITION
dc.subject NOISE
dc.subject FEATURES
dc.subject ENHANCEMENT
dc.subject MODELS
dc.subject MFCC
dc.subject ENVIRONMENTS
dc.subject INFORMATION
dc.subject PERSPECTIVE
dc.subject ALGORITHMS
dc.subject Acoustics
dc.subject Engineering, Electrical & Electronic
dc.title A robust polynomial regression-based voice activity detector for speaker verification
dc.type Article
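
The per-band processing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the window length, the k-means initialization, or the flooring constant, so those values and all function names here (fit_quadratic_per_band, two_means_1d, subtract_noise) are assumptions made purely for illustration. The sketch fits a second-order polynomial to each mel band over one window, runs a plain two-cluster k-means on each band's energies, and uses the lower center as a rough noise estimate for spectral subtraction.

```python
import numpy as np


def fit_quadratic_per_band(log_mel, start, length):
    """Fit a second-order polynomial to each mel band over one window.

    log_mel has shape (n_frames, n_bands). Returns an array of shape
    (3, n_bands) whose rows hold the t^2, t^1 and t^0 coefficients.
    """
    t = np.arange(length)
    window = log_mel[start:start + length]
    # np.polyfit fits every column of a 2-D y independently.
    return np.polyfit(t, window, deg=2)


def two_means_1d(values, n_iter=20):
    """Plain two-cluster k-means on one band's energies.

    Returns (low_center, high_center). Treating the low center as the
    noise energy is an assumption of this sketch.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()  # assumed initialization
    for _ in range(n_iter):
        near_hi = np.abs(values - hi) < np.abs(values - lo)
        if near_hi.all() or not near_hi.any():
            break  # one cluster is empty, nothing left to update
        lo, hi = values[~near_hi].mean(), values[near_hi].mean()
    return lo, hi


def subtract_noise(mel_energy, floor=1e-3):
    """Subtract each band's low k-means center and floor the result."""
    noise = np.array([two_means_1d(band)[0] for band in mel_energy.T])
    return np.maximum(mel_energy - noise, floor)
```

In this sketch the fitted coefficients could be inspected per band to tell flat regions (steady noise or steady speech) from sloped ones (onsets and offsets), but the decision rule and the pseudo-SNR-dependent thresholding from the paper are not reproduced here.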

