Related papers: Why an Android App is Classified as Malware? Towards Malware Classification Interpretation

Why an Android App is Classified as Malware? Towards Malware Classification Interpretation

URL: http://arxiv.org/abs/2004.11516v2
Date: Fri, 4 Sep 2020 13:27:46 GMT
Title: Why an Android App is Classified as Malware? Towards Malware Classification Interpretation
Authors: Bozhi Wu, Sen Chen, Cuiyun Gao, Lingling Fan, Yang Liu, Weiping Wen, Michael R. Lyu
Abstract summary: We propose a novel and interpretable ML-based approach (named XMal) to classify malware with high accuracy and explain the classification result. XMal hinges multi-layer perceptron (MLP) and attention mechanism, and also pinpoints the key features most related to the classification result. Our study peeks into the interpretable ML through the research of Android malware detection and analysis.
Score: 34.59397128785141
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine learning (ML) based approach is considered as one of the most promising techniques for Android malware detection and has achieved high accuracy by leveraging commonly-used features. In practice, most of the ML classifications only provide a binary label to mobile users and app security analysts. However, stakeholders are more interested in the reason why apps are classified as malicious in both academia and industry. This belongs to the research area of interpretable ML but in a specific research domain (i.e., mobile malware detection). Although several interpretable ML methods have been exhibited to explain the final classification results in many cutting-edge Artificial Intelligent (AI) based research fields, till now, there is no study interpreting why an app is classified as malware or unveiling the domain-specific challenges. In this paper, to fill this gap, we propose a novel and interpretable ML-based approach (named XMal) to classify malware with high accuracy and explain the classification result meanwhile. (1) The first classification phase of XMal hinges multi-layer perceptron (MLP) and attention mechanism, and also pinpoints the key features most related to the classification result. (2) The second interpreting phase aims at automatically producing neural language descriptions to interpret the core malicious behaviors within apps. We evaluate the behavior description results by comparing with the existing interpretable ML-based methods (i.e., Drebin and LIME) to demonstrate the effectiveness of XMal. We find that XMal is able to reveal the malicious behaviors more accurately. Additionally, our experiments show that XMal can also interpret the reason why some samples are misclassified by ML classifiers. Our study peeks into the interpretable ML through the research of Android malware detection and analysis.

Related papers

ForenX: Towards Explainable AI-Generated Image Detection with Multimodal Large Language Models [82.04858317800097]
We present ForenX, a novel method that not only identifies the authenticity of images but also provides explanations that resonate with human thoughts.<n>ForenX employs the powerful multimodal large language models (MLLMs) to analyze and interpret forensic cues.<n>We introduce ForgReason, a dataset dedicated to descriptions of forgery evidences in AI-generated images.
arXiv Detail & Related papers (2025-08-02T15:21:26Z)
Multi-label Classification for Android Malware Based on Active Learning [7.599125552187342]
We propose MLCDroid, an ML-based multi-label classification approach that can directly indicate the existence of pre-defined malicious behaviors. We compare the results of 70 algorithm combinations to evaluate the effectiveness (best at 73.3%). This is the first multi-label Android malware classification approach intending to provide more information on fine-grained malicious behaviors.
arXiv Detail & Related papers (2024-10-09T01:09:24Z)
MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries? [70.77691645678804]
Humans are prone to cognitive distortions -- biased thinking patterns that lead to exaggerated responses to specific stimuli. This paper demonstrates that advanced Multimodal Large Language Models (MLLMs) exhibit similar tendencies. We identify three types of stimuli that trigger the oversensitivity of existing MLLMs: Exaggerated Risk, Negated Harm, and Counterintuitive.
arXiv Detail & Related papers (2024-06-22T23:26:07Z)
CausalGym: Benchmarking causal interpretability methods on linguistic tasks [52.61917615039112]
We use CausalGym to benchmark the ability of interpretability methods to causally affect model behaviour. We study the pythia models (14M--6.9B) and assess the causal efficacy of a wide range of interpretability methods. We find that DAS outperforms the other methods, and so we use it to study the learning trajectory of two difficult linguistic phenomena.
arXiv Detail & Related papers (2024-02-19T21:35:56Z)
Unraveling the Key of Machine Learning Solutions for Android Malware Detection [33.63795751798441]
This paper presents a comprehensive investigation into machine learning-based Android malware detection. We first survey the literature, categorizing contributions into a taxonomy based on the Android feature engineering and ML modeling pipeline. Then, we design a general-propose framework for ML-based Android malware detection, re-implement 12 representative approaches from different research communities, and evaluate them from three primary dimensions, i.e. effectiveness, robustness, and efficiency.
arXiv Detail & Related papers (2024-02-05T12:31:19Z)
Vulnerability of Machine Learning Approaches Applied in IoT-based Smart Grid: A Review [51.31851488650698]
Machine learning (ML) sees an increasing prevalence of being used in the internet-of-things (IoT)-based smart grid. adversarial distortion injected into the power signal will greatly affect the system's normal control and operation. It is imperative to conduct vulnerability assessment for MLsgAPPs applied in the context of safety-critical power systems.
arXiv Detail & Related papers (2023-08-30T03:29:26Z)
Quantum Machine Learning for Malware Classification [0.0]
In a context of malicious software detection, machine learning is widely used to generalize to new malware. It has been demonstrated that ML models can be fooled or may have generalization problems on malware that has never been seen. We implement two models of Quantum Machine Learning algorithms, and we compare them to classical models for the classification of a dataset composed of malicious and benign executable files.
arXiv Detail & Related papers (2023-05-09T09:21:48Z)
Towards a Fair Comparison and Realistic Design and Evaluation Framework of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework. We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models. We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z)
MERLIN -- Malware Evasion with Reinforcement LearnINg [26.500149465292246]
We propose a method using reinforcement learning with DQN and REINFORCE algorithms to challenge two state-of-the-art malware detection engines. Our method combines several actions, modifying a Windows portable execution file without breaking its functionalities. We demonstrate that REINFORCE achieves very good evasion rates even on a commercial AV with limited available information.
arXiv Detail & Related papers (2022-03-24T10:58:47Z)
Towards interpreting ML-based automated malware detection models: a survey [4.721069729610892]
Most of the existing machine learning models are black-box, which made their pre-diction results undependable. This paper aims to examine and categorize the existing researches on ML-based malware detector interpretability.
arXiv Detail & Related papers (2021-01-15T17:34:40Z)
Maat: Automatically Analyzing VirusTotal for Accurate Labeling and Effective Malware Detection [71.84087757644708]
The malware analysis and detection research community relies on the online platform VirusTotal to label Android apps based on the scan results of around 60 scanners. There are no standards on how to best interpret the scan results acquired from VirusTotal, which leads to the utilization of different threshold-based labeling strategies. We implemented a method, Maat, that tackles these issues of standardization and sustainability by automatically generating a Machine Learning (ML)-based labeling scheme.
arXiv Detail & Related papers (2020-07-01T14:15:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.