LAM: Scrutinizing Leading APIs For Detecting Suspicious Call Sequences

Author: Alam, Shahid
Record date: 2025-01-06
Year: 2023
ISSN: 0010-4620, 1460-2067
DOI: 10.1093/comjnl/bxac110
Scopus ID: 2-s2.0-85178342936
URL: https://doi.org/10.1093/comjnl/bxac110
Handle: https://hdl.handle.net/20.500.14669/2668

Abstract: The proliferation of smartphones has given rise to an exponential increase in the number of new mobile malware programs. These programs employ stealthy obfuscations to hide their malicious activities. To perform malicious activities, a program must make application programming interface (API) calls. Unlike dynamic analysis, static analysis can find all the API call paths, but it has some issues: a large number of features; higher false positives when features are reduced; and lowering false positives increases the detection rate. Certain Android API calls, e.g. android.app.Activity:boolean requestWindowFeature(int), enable malware programs to call other APIs to hide their activities. We call them leading APIs, as they can lead to malicious activities. To overcome these issues, we propose new heuristics and feature groupings for building a Leading API-call Map, named LAM. We create LAM from a dominant (leading) API call tree. Dominance is a transitive relation and hence enumerates all the call sequences that a leading API leads to. LAM substantially reduces the number and improves the quality of features for combating obfuscations and detecting suspicious call sequences with few false positives. For the dataset used in this paper, LAM reduced the number of features from 509 607 to 29 977. Using 10-fold cross-validation, LAM achieved an accuracy of 97.9% with 0.4% false positives.

Language: en
Access: info:eu-repo/semantics/closedAccess
Keywords: Leading APIs; Suspicious call sequences; Malware analysis and detection; Heuristics; Machine learning
Type: Article
Volume 66, issue 11, pages 2638-2655
Web of Science ID: WOS:000833571400001
Quartiles: Q2, Q3
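
The abstract's point that dominance enumerates everything a leading API leads to can be illustrated with a small sketch. This is not the paper's LAM construction: it is a textbook iterative dominator computation over a toy call graph, and the graph, the node names other than requestWindowFeature, and the helper names `dominators` and `dominated_by` are all hypothetical.

```python
from collections import defaultdict

def dominators(graph, entry):
    """Return {node: set of nodes that dominate it}, assuming every node is reachable from entry."""
    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    preds = defaultdict(set)
    for caller, callees in graph.items():
        for callee in callees:
            preds[callee].add(caller)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | set.intersection(*pred_doms) if pred_doms else {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def dominated_by(dom, api):
    """All nodes that can only be reached by passing through `api`."""
    return sorted(n for n, ds in dom.items() if api in ds and n != api)

# Hypothetical call graph: caller -> callees.
CALL_GRAPH = {
    "entry": ["requestWindowFeature", "log"],
    "requestWindowFeature": ["hideOverlay", "sendSms"],
    "hideOverlay": ["sendSms"],
    "log": [],
    "sendSms": [],
}

dom = dominators(CALL_GRAPH, "entry")
print(dominated_by(dom, "requestWindowFeature"))
```

In this toy graph, every path to hideOverlay and sendSms passes through requestWindowFeature, so both are dominated by it, while log is not. Grouping features under such a dominating call is one plausible reading of how a map of leading APIs could shrink the feature set.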