DSpace Repository

Experimental Comparison of Pre-Trained Word Embedding Vectors of Word2Vec, Glove, FastText for Word Level Semantic Text Similarity Measurement in Turkish


dc.contributor.author Tulu, Cagatay Neftali
dc.date.accessioned 2023-04-18T08:15:10Z
dc.date.available 2023-04-18T08:15:10Z
dc.date.issued 2022-10
dc.identifier.citation Tulu, C. N. (2022). Experimental Comparison of Pre-Trained Word Embedding Vectors of Word2Vec, Glove, FastText for Word Level Semantic Text Similarity Measurement in Turkish. Advances in Science and Technology Research Journal, 16(4), 147-156. https://doi.org/10.12913/22998624/152453 tr_TR
dc.identifier.issn 2080-4075
dc.identifier.issn 2299-8624
dc.identifier.uri http://openacccess.atu.edu.tr:8080/xmlui/handle/123456789/4195
dc.identifier.uri http://dx.doi.org/10.12913/22998624/152453
dc.description WOS-indexed publications collection. tr_TR
dc.description.abstract This study experimentally evaluates the word vectors produced by three widely used embedding methods, Word2Vec, Glove, and FastText, for word-level semantic text similarity in Turkish. Three benchmark datasets, SimTurk, AnlamVer, and RG65_Turkce, are used to evaluate the vectors. In the comparative analysis, the Turkish word vectors produced with Glove and FastText achieved higher correlation in word-level semantic similarity. The Turkish word coverage of FastText is also ahead of the other two methods, since only a limited number of out-of-vocabulary (OOV) words were observed in the FastText experiments. Another observation is that FastText and Glove vectors achieved high Spearman correlation values on the SimTurk and AnlamVer datasets, both of which were prepared and evaluated entirely by local Turkish individuals. This further indicates that these two datasets better represent the Turkish language in terms of morphology and inflections. tr_TR
dc.language.iso en tr_TR
dc.publisher Advances in Science and Technology Research Journal / Lublin University of Technology, Poland tr_TR
dc.relation.ispartofseries 2022;Volume: 16 Issue: 4
dc.subject semantic word similarity tr_TR
dc.subject word embeddings tr_TR
dc.subject Turkish NLP tr_TR
dc.title Experimental Comparison of Pre-Trained Word Embedding Vectors of Word2Vec, Glove, FastText for Word Level Semantic Text Similarity Measurement in Turkish tr_TR
dc.type Article tr_TR
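
The evaluation summarized in dc.description.abstract (scoring word pairs with embedding vectors, counting OOV words, and comparing against human ratings with Spearman correlation) can be illustrated with a minimal sketch. This is not the author's code: the use of cosine similarity, the toy two-dimensional vectors, the word pairs, and the human scores are all assumptions made up for illustration, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def evaluate(pairs, vectors):
    """Return (Spearman correlation, number of skipped OOV pairs)."""
    model_scores, human_scores, oov_pairs = [], [], 0
    for word1, word2, human in pairs:
        if word1 not in vectors or word2 not in vectors:
            oov_pairs += 1  # pair cannot be scored by this embedding
            continue
        model_scores.append(cosine(vectors[word1], vectors[word2]))
        human_scores.append(human)
    return spearman(model_scores, human_scores), oov_pairs

# Toy embedding table (assumption: real vectors have hundreds of dimensions).
vectors = {
    "kedi": [1.0, 0.0],
    "köpek": [0.9, 0.1],
    "araba": [0.0, 1.0],
    "otomobil": [0.2, 0.9],
    "elma": [0.7, 0.7],
}

# Hypothetical (word1, word2, human similarity score) rows; "kitap" is OOV.
pairs = [
    ("araba", "otomobil", 9.5),
    ("kedi", "köpek", 7.0),
    ("kedi", "elma", 3.0),
    ("kedi", "araba", 1.0),
    ("kedi", "kitap", 2.0),
]

rho, oov = evaluate(pairs, vectors)
print(f"Spearman correlation: {rho:.3f}, OOV pairs skipped: {oov}")
```

In this sketch, pairs containing an OOV word are skipped and counted separately, so coverage and correlation can be reported side by side. Other OOV policies (for example, assigning a default score) are possible and would change the correlation.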


