Using BERT models for breast cancer diagnosis from Turkish radiology reports

Hepsag, Pinar Uskaner; Ozel, Selma Ayse; Dalci, Kubilay; Yazici, Adnan

Date

2024

Authors

Hepsag, Pinar Uskaner

Ozel, Selma Ayse

Dalci, Kubilay

Yazici, Adnan

Publisher

Springer

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Diagnostic radiology is concerned with obtaining images of the internal organs using radiological imaging procedures. These images are then interpreted by a diagnostic radiologist, who produces a textual report that assists in the diagnosis of illness or injury. Early detection of certain illnesses, particularly cancer, is critical, and the reports produced by diagnostic radiologists play a key role in this process. To develop models for the early detection of cancer, text classification techniques can be applied to radiology reports. However, this requires access to a dataset of radiology reports, and such datasets are not widely available. Currently, radiology report datasets exist for high-resource languages such as English and Dutch, but not for low-resource languages such as Turkish. This article describes the collection of a mammography report dataset for Turkish, consisting of 62 reports from real patients that were manually labeled by an expert for breast cancer diagnosis. Basic machine learning models, as well as pre-trained BERT, DistilBERT, and an ensemble hard voting approach, were applied to this dataset. Among the individual models, the Turkish BERT model achieved the best performance, with a 91% F1-score. Hard voting, which combined the predictions of BERTTurkish, BERTClinical, and BERTMultilingual, achieved the highest overall F1-score of 93%. These results show that BERT and hard voting outperform the other machine learning models for breast cancer diagnosis from Turkish radiology reports.
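
The hard voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `hard_vote` and `f1_binary`, the 0/1 label encoding, and the toy predictions are all assumptions made for illustration. Each model contributes one predicted label per report, and the ensemble outputs the label chosen by the majority.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority label per sample across several models.

    predictions_per_model: one list of labels per model, all of equal
    length (hypothetical layout, not taken from the paper).
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("every model must predict the same number of samples")
    # zip(*...) regroups the predictions so each tuple holds one sample's votes
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def f1_binary(y_true, y_pred, positive=1):
    """F1-score for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy predictions standing in for three fine-tuned models (made-up values)
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]

ensemble = hard_vote([model_a, model_b, model_c])
print(ensemble)  # [1, 0, 1, 1]
```

With an odd number of models and two classes, as in this sketch, a tie cannot occur, so no tie-breaking rule is needed.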

Keywords

Turkish dataset, Breast cancer, Contextualized word embeddings, Radiology reports, Machine learning

Source

Language Resources and Evaluation

WoS Quartile

Q3

Scopus Quartile

Q1

Volume

58

Issue

3

Links

https://doi.org/10.1007/s10579-023-09669-w
https://hdl.handle.net/20.500.14669/3147

Collections

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
