Adaptive Malware Detection using Sequential Feature Selection: A Dueling Double Deep Q-Network (D3QN) Framework for Intelligent Classification
- URL: http://arxiv.org/abs/2507.04372v1
- Date: Sun, 06 Jul 2025 12:37:50 GMT
- Title: Adaptive Malware Detection using Sequential Feature Selection: A Dueling Double Deep Q-Network (D3QN) Framework for Intelligent Classification
- Authors: Naseem Khan, Aref Y. Al-Tamimi, Amine Bermak, Issa M. Khalil
- Abstract summary: We formulate malware classification as a Markov Decision Process with episodic feature acquisition. We propose a Dueling Double Deep Q-Network (D3QN) framework for adaptive sequential feature selection. We evaluate our approach on Microsoft Big2015 (9-class, 1,795 features) and BODMAS (binary, 2,381 features) datasets.
- Score: 1.4120905648647635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional malware detection methods exhibit computational inefficiency due to exhaustive feature extraction requirements, creating accuracy-efficiency trade-offs that limit real-time deployment. We formulate malware classification as a Markov Decision Process with episodic feature acquisition and propose a Dueling Double Deep Q-Network (D3QN) framework for adaptive sequential feature selection. The agent learns to dynamically select informative features per sample before terminating with classification decisions, optimizing both detection accuracy and computational cost through reinforcement learning. We evaluate our approach on Microsoft Big2015 (9-class, 1,795 features) and BODMAS (binary, 2,381 features) datasets. D3QN achieves 99.22% and 98.83% accuracy while utilizing only 61 and 56 features on average, representing 96.6% and 97.6% dimensionality reduction. This yields computational efficiency improvements of 30.1x and 42.5x over traditional ensemble methods. Comprehensive ablation studies demonstrate consistent superiority over Random Forest, XGBoost, and static feature selection approaches. Quantitative analysis demonstrates that D3QN learns non-random feature selection policies with 62.5% deviation from uniform baseline distributions. The learned policies exhibit structured hierarchical preferences, utilizing high-level metadata features for initial assessment while selectively incorporating detailed behavioral features based on classification uncertainty. Feature specialization analysis reveals 57.7% of examined features demonstrate significant class-specific discrimination patterns. Our results validate reinforcement learning-based sequential feature selection for malware classification, achieving superior accuracy with substantial computational reduction through learned adaptive policies.
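The formulation described in the abstract (an agent that acquires features one at a time and then terminates with a classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `FeatureAcquisitionEnv`, the per-feature cost, the doubled penalty for re-acquiring a feature, and the +1/-1 terminal rewards are all assumptions made for illustration, since the abstract does not specify them. The two helper functions show the standard dueling aggregation and Double DQN target that the "D3QN" name refers to.

```python
import numpy as np


class FeatureAcquisitionEnv:
    """Hypothetical episodic MDP: state = masked features concatenated with the mask."""

    def __init__(self, x, label, n_classes, feature_cost=0.01):
        self.x = np.asarray(x, dtype=np.float32)
        self.label = int(label)
        self.n_features = self.x.shape[0]
        self.n_classes = n_classes
        self.feature_cost = feature_cost  # assumed penalty per acquired feature
        self.mask = np.zeros(self.n_features, dtype=np.float32)

    def reset(self):
        self.mask[:] = 0.0
        return self._state()

    def _state(self):
        # Unacquired features are zeroed; the mask tells the agent which are known.
        return np.concatenate([self.x * self.mask, self.mask])

    def step(self, action):
        # Actions 0..n_features-1 acquire a feature; the remaining n_classes
        # actions classify the sample and end the episode.
        if action < self.n_features:
            already = self.mask[action] == 1.0
            self.mask[action] = 1.0
            # Assumed: re-acquiring a known feature costs twice as much.
            reward = -self.feature_cost * (2.0 if already else 1.0)
            return self._state(), reward, False
        predicted = action - self.n_features
        reward = 1.0 if predicted == self.label else -1.0  # assumed terminal reward
        return self._state(), reward, True


def dueling_q(value, advantage):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    return value + advantage - advantage.mean(axis=-1, keepdims=True)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: the online net selects the action, the target net scores it."""
    best = np.argmax(q_online_next, axis=-1)
    bootstrap = np.take_along_axis(q_target_next, best[..., None], axis=-1)[..., 0]
    return reward + gamma * (1.0 - done) * bootstrap
```

In this sketch, the trade-off between accuracy and acquisition cost is set by the ratio of `feature_cost` to the terminal reward. The dimensionality reductions reported in the abstract follow directly from the average feature counts: 1 - 61/1795 ≈ 96.6% and 1 - 56/2381 ≈ 97.6%.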
Related papers
- Classifying Dental Care Providers Through Machine Learning with Features Ranking [0.0]
This study investigates the application of machine learning (ML) models for classifying dental providers. The dataset includes service counts (preventive, treatment, exams), delivery systems (FFS, managed care), and beneficiary demographics. The study underscores the importance of feature selection in enhancing model efficiency and accuracy.
arXiv Detail & Related papers (2025-06-04T21:45:40Z) - A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy. We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods. By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z) - Enhancing Classification Performance via Reinforcement Learning for Feature Selection [0.0]
This study investigates the importance of effective feature selection in enhancing the performance of classification models.
By employing reinforcement learning (RL) algorithms, specifically Q-learning (QL) and SARSA learning, this paper addresses the feature selection challenge.
arXiv Detail & Related papers (2024-03-09T18:34:59Z) - Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint of sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z) - Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation [56.13803674092712]
We propose an industrial-friendly, expert-aligned and diversity-preserved instruction data selection method: Clustering and Ranking (CaR).
CaR employs a two-step process: first, it ranks instruction pairs using a high-accuracy (84.25%) scoring model aligned with expert preferences; second, it preserves dataset diversity through clustering.
In our experiment, CaR efficiently selected a mere 1.96% of Alpaca's IT data, yet the resulting AlpaCaR model surpassed Alpaca's performance by an average of 32.1% in GPT-4 evaluations.
arXiv Detail & Related papers (2024-02-28T09:27:29Z) - Ransomware detection using stacked autoencoder for feature selection [0.0]
The study meticulously analyzes the autoencoder's learned weights and activations to identify essential features for distinguishing ransomware families from other malware.
The proposed model achieves an exceptional 99% accuracy in ransomware classification, surpassing the Extreme Gradient Boosting (XGBoost) algorithm.
arXiv Detail & Related papers (2024-02-17T17:31:48Z) - Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier in an alternative optimization manner to shift the bias decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z) - Leveraging Uncertainty Estimates To Improve Classifier Performance [4.4951754159063295]
Binary classification involves predicting the label of an instance based on whether the model score for the positive class exceeds a threshold chosen based on the application requirements.
However, model scores are often not aligned with the true positivity rate.
This is especially true when the training involves a differential sampling across classes or there is distributional drift between train and test settings.
arXiv Detail & Related papers (2023-11-20T12:40:25Z) - KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection [48.66703222700795]
We resort to a novel kernel strategy to identify the most informative point clouds to acquire labels.
To accommodate both one-stage (i.e., SECOND) and two-stage detectors, we incorporate the classification entropy tangent and trade off well between detection performance and the total number of bounding boxes selected for annotation.
Our results show that approximately 44% box-level annotation costs and 26% computational time are reduced compared to the state-of-the-art method.
arXiv Detail & Related papers (2023-07-16T04:27:03Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - Adaptive Threshold for Better Performance of the Recognition and Re-identification Models [0.0]
An online optimization-based statistical feature learning adaptive technique is developed and tested on the LFW datasets and self-prepared athletes datasets.
Adopting an adaptive threshold resulted in a 12-45% improvement in model accuracy compared to the fixed thresholds (0.3, 0.5, 0.7) that are usually chosen by the hit-and-trial method in classification and identification tasks.
arXiv Detail & Related papers (2020-12-28T15:40:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.