Optimized Deep Learning Models for Malware Detection under Concept Drift
- URL: http://arxiv.org/abs/2308.10821v2
- Date: Thu, 1 Aug 2024 13:53:48 GMT
- Title: Optimized Deep Learning Models for Malware Detection under Concept Drift
- Authors: William Maillet, Benjamin Marais
- Abstract summary: We propose a model-agnostic protocol to improve a baseline neural network against drift.
We show the importance of feature reduction and training with the most recent validation set possible, and propose a loss function named Drift-Resilient Binary Cross-Entropy.
Our improved model shows promising results, detecting 15.2% more malware than a baseline model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the promising results of machine learning models in malicious file detection, they face the problem of concept drift due to the constant evolution of these files. This leads to declining performance over time, as the data distribution of new files differs from that of the training data, requiring frequent model updates. In this work, we propose a model-agnostic protocol to improve a baseline neural network against drift. We show the importance of feature reduction and training with the most recent validation set possible, and propose a loss function named Drift-Resilient Binary Cross-Entropy, an improvement to the classical Binary Cross-Entropy that is more effective against drift. We train our model on the EMBER dataset, published in 2018, and evaluate it on a dataset of recent malicious files, collected between 2020 and 2023. Our improved model shows promising results, detecting 15.2% more malware than a baseline model.
Related papers
- Improving Malware Detection with Adversarial Domain Adaptation and Control Flow Graphs [10.352741619176383]
Existing solutions to combat concept drift use active learning.
We propose a method that learns retained information in malware control flow graphs post-drift by leveraging a graph neural network.
Our approach demonstrates a significant enhancement in predicting unseen malware families in a binary classification task and predicting drifted malware families in a multi-class setting.
arXiv Detail & Related papers (2024-07-18T22:06:20Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- Towards a robust and reliable deep learning approach for detection of compact binary mergers in gravitational wave data [0.0]
We develop a deep learning model stage-wise and work towards improving its robustness and reliability.
We retrain the model in a novel framework involving a generative adversarial network (GAN).
Although absolute robustness is practically impossible to achieve, we demonstrate some fundamental improvements earned through such training.
arXiv Detail & Related papers (2023-06-20T18:00:05Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Autoregressive based Drift Detection Method [0.0]
We propose a new concept drift detection method based on autoregressive models called ADDM.
Our results show that this new concept drift detection method outperforms the state-of-the-art drift detection methods.
arXiv Detail & Related papers (2022-03-09T14:36:16Z)
- Collision Detection: An Improved Deep Learning Approach Using SENet and ResNext [6.736699393205048]
In this article, a deep-learning-based model comprising a ResNext architecture with SENet blocks is proposed.
The proposed model outperforms the existing baseline models, achieving a ROC-AUC of 0.91 using a significantly smaller proportion of the GTACrash synthetic data for training.
arXiv Detail & Related papers (2022-01-13T02:10:14Z)
- Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation performs strongly for low churn training against a number of recent baselines.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Dynamic Model Pruning with Feedback [64.019079257231]
We propose a novel model compression method that generates a sparse trained model without additional overhead.
We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models.
arXiv Detail & Related papers (2020-06-12T15:07:08Z)
- An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation [1.433758865948252]
We propose a new formalism of knowledge distillation for regression problems.
First, we propose a new loss function, teacher outlier loss rejection, which rejects outliers in training samples using teacher model predictions.
By considering the multi-task network, training of the feature extraction of student models becomes more effective.
arXiv Detail & Related papers (2020-02-28T08:46:12Z)