Benchmarking TabNet, NODE, and FT-Transformer for Software Defect Prediction: An Empirical Comparison and Explainability Analysis

dc.contributor.authorAsal, Burcak
dc.contributor.authorYalciner, Burcu
dc.date.accessioned2026-02-27T07:33:40Z
dc.date.available2026-02-27T07:33:40Z
dc.date.issued2026
dc.description.abstractSoftware defect prediction (SDP) is essential for improving software quality and reliability. Traditional machine learning methods, while effective, often fail to capture complex interactions among software metrics. Recently, specialized deep learning architectures designed for tabular data, including TabNet, Neural Oblivious Decision Ensembles (NODE), and FT-Transformer, have emerged, offering promising potential to enhance prediction accuracy and interpretability. This study comprehensively benchmarks the TabNet, NODE, and FT-Transformer models on the challenging NASA JM1 dataset from the PROMISE repository. We address severe class imbalance using NearMiss undersampling and perform hyperparameter optimization to ensure fair comparisons. Model performance was evaluated using standard metrics: precision, recall, F1-score, and accuracy. In addition, model interpretability was assessed using the SHAP and LIME methods. The FT-Transformer and NODE models demonstrated superior performance, achieving 88% accuracy compared with TabNet's 86%. FT-Transformer showed exceptional precision (99%) for defect detection, underscoring its low false-positive rate. SHAP and LIME analyses revealed distinct attention patterns for each model, highlighting differences in feature importance and decision-making processes. FT-Transformer and NODE outperform TabNet in accuracy and in the balance between recall and precision. The interpretability analysis provides actionable insights into feature importance, enabling better decision-making in practical SDP workflows.
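The abstract states that severe class imbalance was handled with NearMiss undersampling before training. The record does not give implementation details, so the following is only a minimal NearMiss-1 style sketch in NumPy (the helper name `nearmiss_undersample` and the toy data are assumptions, not from the paper): majority-class samples closest, on average, to their k nearest minority-class neighbours are retained until the classes are balanced.

```python
import numpy as np

def nearmiss_undersample(X, y, majority=0, minority=1, k=3):
    """NearMiss-1 style sketch: keep the majority-class samples whose
    mean distance to their k nearest minority-class neighbours is
    smallest, until both classes have the same size."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    X_min, X_maj = X[y == minority], X[y == majority]
    # Pairwise Euclidean distances: each majority sample vs all minority samples.
    d = np.linalg.norm(X_maj[:, None, :] - X_min[None, :, :], axis=2)
    k = min(k, X_min.shape[0])
    # Mean distance to the k closest minority neighbours of each majority sample.
    mean_knn = np.sort(d, axis=1)[:, :k].mean(axis=1)
    keep = np.argsort(mean_knn)[: X_min.shape[0]]
    X_bal = np.vstack([X_maj[keep], X_min])
    y_bal = np.concatenate([np.full(len(keep), majority),
                            np.full(X_min.shape[0], minority)])
    return X_bal, y_bal

# Toy imbalanced data: 20 non-defective vs 5 defective modules, 4 metrics each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 4)), rng.normal(2, 1, (5, 4))])
y = np.array([0] * 20 + [1] * 5)
X_bal, y_bal = nearmiss_undersample(X, y)
print((y_bal == 0).sum(), (y_bal == 1).sum())  # → 5 5
```

In practice the `imbalanced-learn` library's `NearMiss` sampler would typically be used instead of a hand-rolled version; this sketch only illustrates the selection criterion.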
dc.identifier.doi10.1109/ACCESS.2026.3656247
dc.identifier.endpage11681
dc.identifier.issn2169-3536
dc.identifier.startpage11660
dc.identifier.urihttp://dx.doi.org/10.1109/ACCESS.2026.3656247
dc.identifier.urihttps://hdl.handle.net/20.500.14669/4662
dc.identifier.volume14
dc.identifier.wosWOS:001673759200007
dc.indekslendigikaynakWeb of Science
dc.language.isoen
dc.publisherIEEE-Inst Electrical Electronics Engineers Inc
dc.relation.ispartofIEEE Access
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Academic Staff
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_20260302
dc.subjectSoftware
dc.subjectMeasurement
dc.subjectPredictive models
dc.subjectCodes
dc.subjectFeature extraction
dc.subjectDeep learning
dc.subjectBenchmark testing
dc.subjectAccuracy
dc.subjectMachine learning
dc.subjectBiological system modeling
dc.subjectClass imbalance
dc.subjectdeep learning
dc.subjectexplainable AI
dc.subjectFT-Transformer
dc.subjectLIME
dc.subjectNODE
dc.subjectPROMISE dataset
dc.subjectSHAP
dc.subjectsoftware defect prediction
dc.subjectTabNet
dc.subjecttabular data
dc.titleBenchmarking TabNet, NODE, and FT-Transformer for Software Defect Prediction: An Empirical Comparison and Explainability Analysis
dc.typeArticle

Files