Efficient Concept Drift Handling for Batch Android Malware Detection Models
- URL: http://arxiv.org/abs/2309.09807v1
- Date: Mon, 18 Sep 2023 14:28:18 GMT
- Title: Efficient Concept Drift Handling for Batch Android Malware Detection Models
- Authors: Molina-Coronado B., Mori U., Mendiburu A., Miguel-Alonso J.
- Abstract summary: We show how retraining techniques are able to maintain detector capabilities over time.
Our experiments show that concept drift detection and sample selection mechanisms result in very efficient retraining strategies.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapidly evolving nature of Android apps poses a significant challenge to
static batch machine learning algorithms employed in malware detection systems,
as they quickly become obsolete. Despite this challenge, the existing
literature pays limited attention to addressing this issue, with many advanced
Android malware detection approaches, such as Drebin, DroidDet and MaMaDroid,
relying on static models. In this work, we show how retraining techniques are
able to maintain detector capabilities over time. In particular, we analyze the
effect of two aspects on the efficiency and performance of the detectors: 1)
the frequency with which the models are retrained, and 2) the data used for
retraining. In the first experiment, we compare periodic retraining with a more
advanced concept drift detection method that triggers retraining only when
necessary. In the second experiment, we analyze sampling methods to reduce the
amount of data used to retrain models. Specifically, we compare fixed-size
windows of recent data and state-of-the-art active learning methods that select
the apps that keep the training dataset small but diverse. Our
experiments show that concept drift detection and sample selection mechanisms
result in very efficient retraining strategies that can be successfully used
to maintain the performance of state-of-the-art static Android malware
detectors in changing environments.
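To make the two retraining dimensions concrete, here is a minimal, self-contained sketch, not the paper's actual pipeline: a hypothetical detector is monitored on each new batch of apps, a simple error-rate threshold stands in for the concept drift test, and a fixed-size window of recent samples stands in for the sample selection step. The model, the synthetic data, the threshold and the window size are all illustrative placeholders.

```python
# Minimal sketch (assumptions, not the paper's method): drift-triggered
# retraining on a fixed-size window of recent samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_batch(month, n=200, drift_at=6):
    """Synthetic monthly batch whose feature/label relation flips after `drift_at`."""
    X = rng.normal(size=(n, 10))
    w = np.ones(10) if month < drift_at else -np.ones(10)  # concept change
    y = (X @ w + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

WINDOW = 1000            # max number of samples kept for retraining
DRIFT_THRESHOLD = 0.30   # retrain when the batch error exceeds this

X_win, y_win = make_batch(0)
model = LogisticRegression().fit(X_win, y_win)

for month in range(1, 12):
    X_new, y_new = make_batch(month)
    error = np.mean(model.predict(X_new) != y_new)   # monitor error on new apps

    # Sample selection: keep only the most recent WINDOW samples.
    X_win = np.vstack([X_win, X_new])[-WINDOW:]
    y_win = np.concatenate([y_win, y_new])[-WINDOW:]

    if error > DRIFT_THRESHOLD:                      # drift signal -> retrain
        model = LogisticRegression().fit(X_win, y_win)
        print(f"month {month:2d}: error={error:.2f} -> retrained on {len(y_win)} samples")
    else:
        print(f"month {month:2d}: error={error:.2f} -> model kept")
```

An active-learning variant of the selection step would rank the pooled samples by model uncertainty (for example, predicted probability closest to 0.5) and keep the most informative ones rather than simply the most recent.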
Related papers
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC)
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and Imagenet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- Continuous Learning for Android Malware Detection [15.818435778629635]
We propose a new hierarchical contrastive learning scheme, and a new sample selection technique to continuously train the Android malware classifier.
Our approach reduces the false negative rate from 14% (for the best baseline) to 9%, while also reducing the false positive rate (from 0.86% to 0.48%).
arXiv Detail & Related papers (2023-02-08T20:54:11Z)
- A Data-Centric Approach for Improving Adversarial Training Through the Lens of Out-of-Distribution Detection [0.4893345190925178]
We propose detecting and removing hard samples directly from the training procedure rather than applying complicated algorithms to mitigate their effects.
Our results on SVHN and CIFAR-10 datasets show the effectiveness of this method in improving the adversarial training without adding too much computational cost.
arXiv Detail & Related papers (2023-01-25T08:13:50Z)
- Multi-dataset Training of Transformers for Robust Action Recognition [75.5695991766902]
We study the task of learning robust feature representations, aiming to generalize well across multiple datasets for action recognition.
Here, we propose a novel multi-dataset training paradigm, MultiTrain, with the design of two new loss terms, namely informative loss and projection loss.
We verify the effectiveness of our method on five challenging datasets, Kinetics-400, Kinetics-700, Moments-in-Time, Activitynet and Something-something-v2.
arXiv Detail & Related papers (2022-09-26T01:30:43Z)
- Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
- Self-supervised Transformer for Deepfake Detection [112.81127845409002]
Deepfake techniques in real-world scenarios require face forgery detectors with stronger generalization abilities.
Inspired by transfer learning, neural networks pre-trained on other large-scale face-related tasks may provide useful features for deepfake detection.
In this paper, we propose a self-supervised transformer based audio-visual contrastive learning method.
arXiv Detail & Related papers (2022-03-02T17:44:40Z)
- Improving Variational Autoencoder based Out-of-Distribution Detection for Embedded Real-time Applications [2.9327503320877457]
Out-of-distribution (OoD) detection is an emerging approach for identifying out-of-distribution samples in real time.
In this paper, we show how we can robustly detect hazardous motion around autonomous driving agents.
Our methods significantly improve the detection of OoD factors in unique driving scenarios, 42% better than state-of-the-art approaches.
Our model also generalizes near-perfectly, 97% better than the state of the art, across the real-world and simulated driving datasets used in the experiments.
arXiv Detail & Related papers (2021-07-25T07:52:53Z)
- Improving Botnet Detection with Recurrent Neural Network and Transfer Learning [5.602292536933117]
Botnet detection is a critical step in stopping the spread of botnets and preventing malicious activities.
Recent approaches employing machine learning (ML) have shown improved performance over earlier ones.
We propose a novel botnet detection method built upon a Recurrent Variational Autoencoder (RVAE).
arXiv Detail & Related papers (2021-04-26T14:05:01Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
- An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation [1.433758865948252]
We propose a new formalism of knowledge distillation for regression problems.
First, we propose a new loss function, teacher outlier loss rejection, which rejects outliers in training samples using teacher model predictions.
By considering a multi-task network, the training of the student models' feature extraction becomes more effective.
arXiv Detail & Related papers (2020-02-28T08:46:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.