SIFT: Sifting file types-application of explainable artificial intelligence in cyber forensics

Alam, Shahid; Demir, Alper Kamil

dc.authorid	Alam, Shahid/0000-0002-4080-8042
dc.contributor.author	Alam, Shahid
dc.contributor.author	Demir, Alper Kamil
dc.date.accessioned	2025-01-06T17:37:47Z
dc.date.available	2025-01-06T17:37:47Z
dc.date.issued	2024
dc.description.abstract	Artificial Intelligence (AI) is being applied to improve the efficiency of software systems used in various domains, especially in the health and forensic sciences. Explainable AI (XAI) is one of the fields of AI that interprets and explains the methods used in AI. One of the techniques used in XAI to provide such interpretations is to compute the relevance of the input features to the output of an AI model. File fragment classification is one of the vital issues of file carving in Cyber Forensics (CF) and becomes challenging when the filesystem metadata is missing. Other major challenges it faces are: proliferation of file formats, file embeddings, and automation. We leverage interpretations provided by XAI to optimize the classification of file fragments and propose a novel sifting approach, named SIFT (Sifting File Types). SIFT employs TF-IDF to assign a weight to each byte (feature), which is used to select features from a file fragment. Threshold-based LIME and SHAP (the two XAI techniques) feature relevance values are computed for the selected features to optimize file fragment classification. To improve multinomial classification, a Multilayer Perceptron model is developed and optimized with five hidden layers, each layer with $i \times n$ neurons, where i = the layer number and n = the total number of classes in the dataset. When tested with 47,482 samples of 20 file types (classes), SIFT achieves a detection rate of 82.1% and outperforms the other state-of-the-art techniques by at least 10%. To the best of our knowledge, this is the first effort of applying XAI in CF for optimizing file fragment classification.
dc.identifier.doi	10.1186/s42400-024-00241-9
dc.identifier.issn	2523-3246
dc.identifier.issue	1
dc.identifier.scopus	2-s2.0-85203455921
dc.identifier.scopusquality	Q1
dc.identifier.uri	https://doi.org/10.1186/s42400-024-00241-9
dc.identifier.uri	https://hdl.handle.net/20.500.14669/2358
dc.identifier.volume	7
dc.identifier.wos	WOS:001310156000001
dc.identifier.wosquality	N/A
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.language.iso	en
dc.publisher	Springer Nature
dc.relation.ispartof	Cybersecurity
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/openAccess
dc.snmz	KA_20241211
dc.subject	Explainable artificial intelligence
dc.subject	Deep learning
dc.subject	Cyber forensics
dc.subject	File fragment classification
dc.title	SIFT: Sifting file types-application of explainable artificial intelligence in cyber forensics
dc.type	Article
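
The abstract above describes two mechanical steps that are easy to illustrate: weighting each byte value with TF-IDF to pick features from a fragment, and sizing five hidden layers as $i \times n$ neurons. The following is a minimal sketch of both, not the authors' implementation. The function names, the plain `tf * log(N / df)` weighting, and the top-k selection rule are assumptions made for illustration; the record does not state which TF-IDF variant or selection threshold the paper uses.

```python
import math
from collections import Counter


def hidden_layer_sizes(n_classes, n_hidden=5):
    """Hidden layer i (1-based) gets i * n_classes neurons."""
    return [i * n_classes for i in range(1, n_hidden + 1)]


def byte_tfidf(fragments):
    """Return one {byte_value: weight} dict per fragment.

    Assumed variant: tf = count / fragment length, idf = log(N / df).
    """
    n_docs = len(fragments)
    # Document frequency: how many fragments contain each byte value.
    df = Counter()
    for frag in fragments:
        df.update(set(frag))
    weights = []
    for frag in fragments:
        counts = Counter(frag)
        total = len(frag)
        weights.append({
            b: (c / total) * math.log(n_docs / df[b])
            for b, c in counts.items()
        })
    return weights


def top_k_bytes(weight_map, k):
    """Hypothetical selection rule: keep the k highest-weighted byte values."""
    return sorted(weight_map, key=weight_map.get, reverse=True)[:k]


if __name__ == "__main__":
    # 20 classes, as in the dataset described in the abstract.
    print(hidden_layer_sizes(20))
    toy_fragments = [b"\x00\x00\x01", b"\x00\x02"]
    print(top_k_bytes(byte_tfidf(toy_fragments)[0], 1))
```

In this toy input, byte `0x00` occurs in every fragment, so its weight is zero and it is not selected; bytes that are distinctive for a fragment rank higher. The LIME and SHAP relevance step mentioned in the abstract is not sketched here.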

Collection

WoS Indexed Publications Collection
Scopus Indexed Publications Collection