DWFS-Obfuscation: Dynamic Weighted Feature Selection for Robust Malware Familial Classification under Obfuscation
- URL: http://arxiv.org/abs/2504.07590v1
- Date: Thu, 10 Apr 2025 09:37:43 GMT
- Title: DWFS-Obfuscation: Dynamic Weighted Feature Selection for Robust Malware Familial Classification under Obfuscation
- Authors: Xingyuan Wei, Zijun Cheng, Ning Li, Qiujian Lv, Ziyang Yu, Degang Sun,
- Abstract summary: We propose a dynamic weighted feature selection method that analyzes the importance and stability of features.<n>We then utilize graph neural networks for classification, thereby improving the robustness and accuracy of the detection system.<n> Experiments demonstrate that our proposed method achieves an F1-score of 95.56% on the unobfuscated dataset and 92.28% on the obfuscated dataset.
- Score: 1.4742348415878401
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to its open-source nature, the Android operating system has consistently been a primary target for attackers. Learning-based methods have made significant progress in the field of Android malware detection. However, traditional detection methods based on static features struggle to identify obfuscated malicious code, while methods relying on dynamic analysis suffer from low efficiency. To address this, we propose a dynamic weighted feature selection method that analyzes the importance and stability of features, calculates scores to filter out the most robust features, and combines these selected features with the program's structural information. We then utilize graph neural networks for classification, thereby improving the robustness and accuracy of the detection system. We analyzed 8,664 malware samples from eight malware families and tested a total of 44,940 malware variants generated using seven obfuscation strategies. Experiments demonstrate that our proposed method achieves an F1-score of 95.56% on the unobfuscated dataset and 92.28% on the obfuscated dataset, indicating that the model can effectively detect obfuscated malware.
Related papers
- Imbalanced malware classification: an approach based on dynamic classifier selection [0.0]
A significant challenge in malware detection is the imbalance in datasets, where most applications are benign, with only a small fraction posing a threat.<n>This study addresses the often-overlooked issue of class imbalance in malware detection by evaluating various machine learning strategies for detecting malware in Android applications.
arXiv Detail & Related papers (2025-03-30T19:12:16Z) - A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy.<n>We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods.<n>By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z) - A Novel Approach to Malicious Code Detection Using CNN-BiLSTM and Feature Fusion [2.3039261241391586]
This study employs the minhash algorithm to convert binary files of malware into grayscale images.
The study utilizes IDA Pro to decompile and extract opcode sequences, applying N-gram and tf-idf algorithms for feature vectorization.
A CNN-BiLSTM fusion model is designed to simultaneously process image features and opcode sequences, enhancing classification performance.
arXiv Detail & Related papers (2024-10-12T07:10:44Z) - Towards Robust IoT Defense: Comparative Statistics of Attack Detection in Resource-Constrained Scenarios [1.3812010983144802]
Resource constraints pose a significant cybersecurity threat to IoT smart devices.
We conduct an extensive statistical analysis of cyberattack detection algorithms under resource constraints to identify the most efficient one.
arXiv Detail & Related papers (2024-10-10T10:58:03Z) - MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z) - Challenging Machine Learning Algorithms in Predicting Vulnerable JavaScript Functions [2.243674903279612]
State-of-the-art machine learning techniques can predict functions with possible security vulnerabilities in JavaScript programs.
Best performing algorithm was KNN, which created a model for the prediction of vulnerable functions with an F-measure of 0.76.
Deep learning, tree and forest based classifiers, and SVM were competitive with F-measures over 0.70.
arXiv Detail & Related papers (2024-05-12T08:23:42Z) - Android Malware Detection with Unbiased Confidence Guarantees [1.6432632226868131]
We propose a machine learning dynamic analysis approach that provides provably valid confidence guarantees in each malware detection.
The proposed approach is based on a novel machine learning framework, called Conformal Prediction, combined with a random forests classifier.
We examine its performance on a large-scale dataset collected by installing 1866 malicious and 4816 benign applications on a real android device.
arXiv Detail & Related papers (2023-12-17T11:07:31Z) - Light up that Droid! On the Effectiveness of Static Analysis Features
against App Obfuscation for Android Malware Detection [42.50353398405467]
Malware authors have seen obfuscation as the mean to bypass malware detectors based on static analysis features.
In this article we assess the impact of specific obfuscation techniques on common features extracted using static analysis.
We propose a ML malware detector for Android that is robust against obfuscation and outperforms current state-of-the-art detectors.
arXiv Detail & Related papers (2023-10-24T09:07:23Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified
Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z) - Bayesian Optimization with Machine Learning Algorithms Towards Anomaly
Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in term of accuracy rate, precision, low-false alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.