Benchmarking TabNet, NODE, and FT-Transformer for Software Defect Prediction: An Empirical Comparison and Explainability Analysis

Date

2026

Journal Title

Journal ISSN

Volume Title

Publisher

IEEE-Inst Electrical Electronics Engineers Inc

Access Rights

info:eu-repo/semantics/openAccess

Abstract

Software defect prediction (SDP) is essential for improving software quality and reliability. Traditional machine learning methods, while effective, often fail to capture complex interactions among software metrics. Recently, deep learning architectures designed specifically for tabular data, including TabNet, Neural Oblivious Decision Ensembles (NODE), and FT-Transformer, have emerged with the potential to improve both prediction accuracy and interpretability. This study benchmarks TabNet, NODE, and FT-Transformer on the challenging NASA JM1 dataset from the PROMISE repository. We address severe class imbalance with NearMiss undersampling and apply hyperparameter optimization so that the comparison is fair. Model performance was evaluated with standard metrics: precision, recall, F1-score, and accuracy. Model interpretability was assessed with SHAP and LIME. FT-Transformer and NODE performed best, achieving 88% accuracy compared with 86% for TabNet. FT-Transformer reached 99% precision for defect detection, meaning that very few of its defect predictions were false positives. SHAP and LIME analyses revealed distinct attention patterns for each model, highlighting differences in feature importance and decision-making processes. Overall, FT-Transformer and NODE outperform TabNet in accuracy and in the balance between recall and precision. The interpretability analysis provides actionable insights into feature importance, supporting better decision-making in practical SDP workflows.
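
The four evaluation metrics named in the abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper. It illustrates why high precision goes with a small number of false positives among the predicted defects.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy for the positive (defective) class.

    tp, fp, fn, tn are the confusion-matrix counts. Ratios with an empty
    denominator are reported as 0.0 instead of raising an error.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only (not results from the study).
metrics = classification_metrics(tp=80, fp=2, fn=20, tn=98)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With only 2 false positives against 80 true positives, precision is high even though recall is lower, which is the kind of trade-off the recall and precision comparison in the abstract refers to.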

Description

Keywords

Software, Measurement, Predictive models, Codes, Feature extraction, Deep learning, Benchmark testing, Accuracy, Machine learning, Biological system modeling, Class imbalance, deep learning, explainable AI, FT-Transformer, LIME, NODE, PROMISE dataset, SHAP, software defect prediction, TabNet, tabular data

Source

IEEE Access

WoS Q Value

Scopus Q Value

Volume

14

Issue

Citation