Scaling Up: Revisiting Mining Android Sandboxes at Scale for Malware Classification
- URL: http://arxiv.org/abs/2505.09501v1
- Date: Wed, 14 May 2025 15:52:09 GMT
- Title: Scaling Up: Revisiting Mining Android Sandboxes at Scale for Malware Classification
- Authors: Francisco Costa, Ismael Medeiros, Leandro Oliveira, João Calássio, Rodrigo Bonifácio, Krishna Narasimhan, Mira Mezini, Márcio Ribeiro,
- Abstract summary: Mining Android Sandbox approach (MAS) aims to identify malicious behavior in repackaged Android applications (apps)<n>Previous empirical studies evaluated the MAS approach using a small dataset consisting of only 102 pairs of original and repackaged apps.<n>This paper presents the results of a replication study focused on evaluating the performance of the MAS approach regarding its capabilities of correctly classifying malware from different families.
- Score: 1.6445041519310273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The widespread use of smartphones in daily life has raised concerns about privacy and security among researchers and practitioners. Privacy issues are generally highly prevalent in mobile applications, particularly targeting the Android platform, the most popular mobile operating system. For this reason, several techniques have been proposed to identify malicious behavior in Android applications, including the Mining Android Sandbox approach (MAS approach), which aims to identify malicious behavior in repackaged Android applications (apps). However, previous empirical studies evaluated the MAS approach using a small dataset consisting of only 102 pairs of original and repackaged apps. This limitation raises questions about the external validity of their findings and whether the MAS approach can be generalized to larger datasets. To address these concerns, this paper presents the results of a replication study focused on evaluating the performance of the MAS approach regarding its capabilities of correctly classifying malware from different families. Unlike previous studies, our research employs a dataset that is an order of magnitude larger, comprising 4,076 pairs of apps covering a more diverse range of Android malware families. Surprisingly, our findings indicate a poor performance of the MAS approach for identifying malware, with the F1-score decreasing from 0.90 for the small dataset used in the previous studies to 0.54 in our more extensive dataset. Upon closer examination, we discovered that certain malware families partially account for the low accuracy of the MAS approach, which fails to classify a repackaged version of an app as malware correctly. Our findings highlight the limitations of the MAS approach, particularly when scaled, and underscore the importance of complementing it with other techniques to detect a broader range of malware effectively.
Related papers
- VOLTRON: Detecting Unknown Malware Using Graph-Based Zero-Shot Learning [0.0]
The persistent threat of Android malware presents a serious challenge to the security of millions of users globally.<n>We introduce a novel zero-shot learning framework that combines Variational Graph Auto-Encoders (VGAE) with Siamese Neural Networks (SNN) to identify malware without needing prior examples of specific malware families.<n>Our model achieves 96.24% accuracy and 95.20% recall for unknown malware families, highlighting its robustness against evolving Android threats.
arXiv Detail & Related papers (2025-07-06T07:25:25Z) - Multi-label Classification for Android Malware Based on Active Learning [7.599125552187342]
We propose MLCDroid, an ML-based multi-label classification approach that can directly indicate the existence of pre-defined malicious behaviors.
We compare the results of 70 algorithm combinations to evaluate the effectiveness (best at 73.3%).
This is the first multi-label Android malware classification approach intending to provide more information on fine-grained malicious behaviors.
arXiv Detail & Related papers (2024-10-09T01:09:24Z) - MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z) - Revisiting Static Feature-Based Android Malware Detection [0.8192907805418583]
This paper highlights critical pitfalls that undermine the validity of machine learning research in Android malware detection.
We propose solutions for improving datasets and methodological practices, enabling fairer model comparisons.
Our paper aims to support future research in Android malware detection and other security domains, enhancing the reliability and validity of published results.
arXiv Detail & Related papers (2024-09-11T16:37:50Z) - A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restrained to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z) - Android Malware Detection with Unbiased Confidence Guarantees [1.6432632226868131]
We propose a machine learning dynamic analysis approach that provides provably valid confidence guarantees in each malware detection.
The proposed approach is based on a novel machine learning framework, called Conformal Prediction, combined with a random forests classifier.
We examine its performance on a large-scale dataset collected by installing 1866 malicious and 4816 benign applications on a real android device.
arXiv Detail & Related papers (2023-12-17T11:07:31Z) - Malicious code detection in android: the role of sequence characteristics and disassembling methods [0.0]
We investigate and emphasize the factors that may affect the accuracy values of the models managed by researchers.
Our findings exhibit that the disassembly method and different input representations affect the model results.
arXiv Detail & Related papers (2023-12-02T11:55:05Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified
Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - Investigating Feature and Model Importance in Android Malware Detection: An Implemented Survey and Experimental Comparison of ML-Based Methods [2.9248916859490173]
We show that high detection accuracies can be achieved using features extracted through static analysis alone.
Random forests are generally the most effective model, outperforming more complex deep learning approaches.
arXiv Detail & Related papers (2023-01-30T10:48:10Z) - Towards a Fair Comparison and Realistic Design and Evaluation Framework
of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z) - Maat: Automatically Analyzing VirusTotal for Accurate Labeling and
Effective Malware Detection [71.84087757644708]
The malware analysis and detection research community relies on the online platform VirusTotal to label Android apps based on the scan results of around 60 scanners.
There are no standards on how to best interpret the scan results acquired from VirusTotal, which leads to the utilization of different threshold-based labeling strategies.
We implemented a method, Maat, that tackles these issues of standardization and sustainability by automatically generating a Machine Learning (ML)-based labeling scheme.
arXiv Detail & Related papers (2020-07-01T14:15:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.