SIFT: Sifting file types-application of explainable artificial intelligence in cyber forensics

dc.authoridAlam, Shahid/0000-0002-4080-8042
dc.contributor.authorAlam, Shahid
dc.contributor.authorDemir, Alper Kamil
dc.date.accessioned2025-01-06T17:37:47Z
dc.date.available2025-01-06T17:37:47Z
dc.date.issued2024
dc.description.abstractArtificial Intelligence (AI) is being applied to improve the efficiency of software systems used in various domains, especially in the health and forensic sciences. Explainable AI (XAI) is a field of AI that interprets and explains the methods used in AI. One technique used in XAI to provide such interpretations is to compute the relevance of the input features to the output of an AI model. File fragment classification is one of the vital issues of file carving in Cyber Forensics (CF) and becomes challenging when the filesystem metadata is missing. Other major challenges it faces are the proliferation of file formats, file embeddings, and automation. We leverage the interpretations provided by XAI to optimize the classification of file fragments and propose a novel sifting approach, named SIFT (Sifting File Types). SIFT employs TF-IDF to assign a weight to each byte (feature), and these weights are used to select features from a file fragment. Threshold-based LIME and SHAP (the two XAI techniques) feature relevance values are computed for the selected features to optimize file fragment classification. To improve multinomial classification, a Multilayer Perceptron model is developed and optimized with five hidden layers, each layer with $i \times n$ neurons, where i = the layer number and n = the total number of classes in the dataset. When tested with 47,482 samples of 20 file types (classes), SIFT achieves a detection rate of 82.1% and outperforms the other state-of-the-art techniques by at least 10%. To the best of our knowledge, this is the first effort to apply XAI in CF for optimizing file fragment classification.
dc.identifier.doi10.1186/s42400-024-00241-9
dc.identifier.issn2523-3246
dc.identifier.issue1
dc.identifier.scopus2-s2.0-85203455921
dc.identifier.scopusqualityQ1
dc.identifier.urihttps://doi.org/10.1186/s42400-024-00241-9
dc.identifier.urihttps://hdl.handle.net/20.500.14669/2358
dc.identifier.volume7
dc.identifier.wosWOS:001310156000001
dc.identifier.wosqualityN/A
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherSpringer Nature
dc.relation.ispartofCybersecurity
dc.relation.publicationcategoryArticle - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_20241211
dc.subjectExplainable artificial intelligence
dc.subjectDeep learning
dc.subjectCyber forensics
dc.subjectFile fragment classification
dc.titleSIFT: Sifting file types-application of explainable artificial intelligence in cyber forensics
dc.typeArticle
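
The abstract states that the Multilayer Perceptron has five hidden layers, each with i × n neurons, where i is the layer number and n is the number of classes. As a minimal sketch of that arithmetic only, the snippet below computes the layer widths for the 20-class dataset mentioned in the abstract. The function name `hidden_layer_sizes` is hypothetical, and numbering the hidden layers from 1 is an assumption; the record does not give the authors' actual implementation.

```python
def hidden_layer_sizes(num_classes, num_hidden_layers=5):
    """Return the width of each hidden layer as i * n.

    Assumption: hidden layers are numbered i = 1..num_hidden_layers,
    and n is the number of classes (file types) in the dataset.
    """
    return [i * num_classes for i in range(1, num_hidden_layers + 1)]


# 20 file types (classes), as reported in the abstract.
sizes = hidden_layer_sizes(20)
print(sizes)  # [20, 40, 60, 80, 100]
```

Under these assumptions the widths grow linearly with depth, so the widest hidden layer has five times as many neurons as there are classes.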
