Applying Natural Language Processing for detecting malicious patterns in Android applications

Alam, Shahid

dc.authorid	Alam, Shahid/0000-0002-4080-8042
dc.contributor.author	Alam, Shahid
dc.date.accessioned	2025-01-06T17:43:46Z
dc.date.available	2025-01-06T17:43:46Z
dc.date.issued	2021
dc.description.abstract	With increasing quantity and sophistication, malicious code is becoming difficult to discover and analyze. Modern NLP (Natural Language Processing) techniques have improved significantly and are being used in practice to accomplish various tasks. Recently, many research works have applied NLP to find malicious patterns in Android and Windows apps. In this paper, we exploit this fact and apply NLP techniques to an intermediate representation (MAIL, Malware Analysis Intermediate Language) of Android apps to build a similarity index model, named SIMP. We use SIMP to find malicious patterns in Android apps. MAIL provides control flow patterns to enhance the malware analysis and makes the code accessible to NLP techniques for checking semantic similarities. For applying NLP, we consider a MAIL program as one document. The control flow patterns in this program, when divided into specific blocks (words), become sentences. We apply TF-IDF and Bag-of-Words over these control flow patterns to build SIMP. Our proposed model, when tested with real malware and benign Android apps using different validation methods, achieved an MCC (Matthews Correlation Coefficient) > 0.94 between the true and predicted values. This indicates that the model can predict whether a new sample is malware or benign with a high success rate. (c) 2021 Elsevier Ltd. All rights reserved.
dc.identifier.doi	10.1016/j.fsidi.2021.301270
dc.identifier.issn	2666-2817
dc.identifier.scopus	2-s2.0-85122659800
dc.identifier.scopusquality	Q1
dc.identifier.uri	https://doi.org/10.1016/j.fsidi.2021.301270
dc.identifier.uri	https://hdl.handle.net/20.500.14669/2790
dc.identifier.volume	39
dc.identifier.wos	WOS:000709481500004
dc.identifier.wosquality	Q4
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.language.iso	en
dc.publisher	Elsevier Sci Ltd
dc.relation.ispartof	Forensic Science International-Digital Investigation
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/closedAccess
dc.snmz	KA_20241211
dc.subject	Natural language processing
dc.subject	Android applications
dc.subject	Control flow patterns
dc.subject	Intermediate language
dc.subject	Malicious patterns
dc.title	Applying Natural Language Processing for detecting malicious patterns in Android applications
dc.type	Article
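
The abstract describes treating each MAIL program as a document of control flow patterns, weighting those patterns with TF-IDF, and evaluating predictions with MCC. The following is a minimal sketch of that general idea, not the SIMP model itself: the pattern tokens (ASSIGN, CALL, JUMP, and so on), the toy documents, and the function names are all assumptions made for illustration, and the smoothed IDF formula is one common variant rather than the one used in the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn token lists into sparse TF-IDF vectors (dict: token -> weight)."""
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each token once per document
    # Smoothed IDF (an assumption): tokens found in every document
    # still receive a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (1 = malware)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical documents: each is a sequence of control flow pattern tokens.
known_malware = ["ASSIGN", "CALL", "JUMP", "CALL", "JUMP", "HALT"]
new_sample = ["ASSIGN", "CALL", "JUMP", "CALL", "HALT"]
benign = ["ASSIGN", "ASSIGN", "LIBCALL", "ASSIGN", "LIBCALL"]

vecs = tfidf_vectors([known_malware, new_sample, benign])
sim_to_malware = cosine(vecs[1], vecs[0])
sim_to_benign = cosine(vecs[1], vecs[2])
print("similarity to malware:", round(sim_to_malware, 3))
print("similarity to benign: ", round(sim_to_benign, 3))
```

In a setting like this, a new sample would be labelled by comparing its similarity scores against known malware and benign documents, and `mcc` would then be computed over the true and predicted labels of a held-out test set.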

Collection

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
