Neural Networks Optimizations Against Concept and Data Drift in Malware
Detection
- URL: http://arxiv.org/abs/2308.10821v1
- Date: Mon, 21 Aug 2023 16:13:23 GMT
- Title: Neural Networks Optimizations Against Concept and Data Drift in Malware
Detection
- Authors: William Maillet and Benjamin Marais
- Abstract summary: We propose a model-agnostic protocol to improve a baseline neural network to handle the drift problem.
We show the importance of feature reduction and training with the most recent validation set possible, and propose a loss function named Drift-Resilient Binary Cross-Entropy.
Our improved model shows promising results, detecting 15.2% more malware than a baseline model.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the promising results of machine learning models in malware
detection, they face the problem of concept drift due to the constant evolution
of malware. This leads to a decline in performance over time, as the data
distribution of new files differs from the training distribution, requiring
regular model updates. In this work, we propose a model-agnostic protocol to
improve a baseline neural network to handle the drift problem. We show the
importance of feature reduction and of training with the most recent validation
set possible, and propose a loss function named Drift-Resilient Binary
Cross-Entropy, an improvement to the classical Binary Cross-Entropy that is
more effective against drift. We train our model on the EMBER dataset (2018)
and evaluate it on a dataset of recent malicious files, collected between 2020
and 2023. Our improved model shows promising results, detecting 15.2% more
malware than a baseline model.
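The paper's Drift-Resilient Binary Cross-Entropy is defined in the paper itself; as context for what it modifies, the following is only a sketch of the classical Binary Cross-Entropy baseline (function name and epsilon clipping are illustrative, not from the paper):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch; y_pred are probabilities in (0, 1)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))
```

A drift-aware variant would modify how each sample's term is weighted or shaped, but the exact DRBCE formulation should be taken from the paper rather than this sketch.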
Related papers
- Improving Malware Detection with Adversarial Domain Adaptation and Control Flow Graphs [10.352741619176383]
Existing solutions to combat concept drift use active learning.
We propose a method that learns the information retained in malware control flow graphs post-drift by leveraging graph neural networks.
Our approach demonstrates a significant enhancement in predicting unseen malware families in a binary classification task and predicting drifted malware families in a multi-class setting.
arXiv Detail & Related papers (2024-07-18T22:06:20Z)
- Diffusion-based Neural Network Weights Generation [85.6725307453325]
We propose an efficient and adaptive transfer learning scheme through dataset-conditioned pretrained weights sampling.
Specifically, we use a latent diffusion model with a variational autoencoder that can reconstruct the neural network weights.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- MORPH: Towards Automated Concept Drift Adaptation for Malware Detection [0.7499722271664147]
Concept drift is a significant challenge for malware detection.
Self-training has emerged as a promising approach to mitigate concept drift.
We propose MORPH -- an effective pseudo-label-based concept drift adaptation method.
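The general pseudo-labeling idea behind such adaptation methods can be sketched in a few lines. This toy example uses synthetic data and a simple nearest-centroid classifier; it illustrates the self-training loop only and is not MORPH itself, whose algorithm is more involved:

```python
import numpy as np

rng = np.random.default_rng(0)

# Labeled pre-drift data: two classes around (-2, 0) and (+2, 0).
X_old = np.vstack([rng.normal([-2, 0], 1.0, (100, 2)),
                   rng.normal([+2, 0], 1.0, (100, 2))])
y_old = np.array([0] * 100 + [1] * 100)

# Unlabeled post-drift data: both classes shifted upward.
X_new = np.vstack([rng.normal([-2, 2], 1.0, (100, 2)),
                   rng.normal([+2, 2], 1.0, (100, 2))])

def centroids(X, y):
    """Per-class mean vectors of a nearest-centroid classifier."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(X, C):
    """Assign each point to the nearest class centroid."""
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    return d.argmin(axis=1)

# 1) Pseudo-label the drifted samples with the pre-drift model.
C = centroids(X_old, y_old)
y_pseudo = predict(X_new, C)

# 2) Adapt by retraining (here: recomputing centroids) on old labels
#    plus pseudo-labels, without any new ground-truth annotation.
C_adapted = centroids(np.vstack([X_old, X_new]),
                      np.concatenate([y_old, y_pseudo]))
```

The key design point is step 1: the drifted data is never manually labeled, so the quality of adaptation hinges on how reliable the model's own predictions are under drift.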
arXiv Detail & Related papers (2024-01-23T14:25:43Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [67.25782152459851]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- A Reusable AI-Enabled Defect Detection System for Railway Using Ensembled CNN [5.381374943525773]
Defect detection is crucial for ensuring the trustworthiness of railway systems.
Current approaches rely on single deep-learning models, like CNNs.
We propose a reusable AI-enabled defect detection approach.
arXiv Detail & Related papers (2023-11-24T19:45:55Z)
- New Approach to Malware Detection Using Optimized Convolutional Neural Network [0.0]
This paper proposes a new convolutional deep learning neural network to accurately and effectively detect malware with high precision.
The baseline model initially achieves a 98% accuracy rate, but after increasing the depth of the CNN model, its accuracy reaches 99.183%.
To further solidify the effectiveness of this CNN model, we use the improved model to make predictions on new malware samples within our dataset.
arXiv Detail & Related papers (2023-01-26T15:06:47Z)
- Autoregressive based Drift Detection Method [0.0]
We propose ADDM, a new concept drift detection method based on autoregressive models.
Our results show that this new concept drift detection method outperforms the state-of-the-art drift detection methods.
arXiv Detail & Related papers (2022-03-09T14:36:16Z)
- Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation performs strongly for low churn training against a number of recent baselines.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid-updating scheme, and match the accuracy of softmax models.
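As a rough illustration of single-forward-pass rejection, a sample can be scored by its distance to the nearest class centroid in feature space; this is only the underlying intuition, not the paper's actual kernel-based formulation, and all names, centroids, and thresholds below are illustrative:

```python
import numpy as np

def ood_score(embedding, centroids):
    """Distance to the nearest class centroid; larger means more likely OOD."""
    return float(np.linalg.norm(centroids - embedding, axis=1).min())

def reject(embedding, centroids, threshold):
    """Flag a sample as out-of-distribution in a single pass."""
    return ood_score(embedding, centroids) > threshold

# Illustrative centroids for two classes, plus two probe points.
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
in_dist = np.array([0.3, -0.2])     # near class 0
far_out = np.array([20.0, -20.0])   # far from both classes
```

A single distance computation replaces the ensemble or multiple stochastic passes that uncertainty estimates usually require, which is what makes the approach cheap at test time.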
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.