Efficient Concept Drift Handling for Batch Android Malware Detection Models
- URL: http://arxiv.org/abs/2309.09807v1
- Date: Mon, 18 Sep 2023 14:28:18 GMT
- Title: Efficient Concept Drift Handling for Batch Android Malware Detection Models
- Authors: Molina-Coronado B., Mori U., Mendiburu A., Miguel-Alonso J.
- Abstract summary: We show how retraining techniques are able to maintain detector capabilities over time.
Our experiments show that concept drift detection and sample selection mechanisms result in very efficient retraining strategies.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapidly evolving nature of Android apps poses a significant challenge to
static batch machine learning algorithms employed in malware detection systems,
as they quickly become obsolete. Despite this challenge, the existing
literature pays limited attention to addressing this issue, with many advanced
Android malware detection approaches, such as Drebin, DroidDet and MaMaDroid,
relying on static models. In this work, we show how retraining techniques are
able to maintain detector capabilities over time. In particular, we analyze the
effect of two aspects on the efficiency and performance of the detectors: 1)
the frequency with which the models are retrained, and 2) the data used for
retraining. In the first experiment, we compare periodic retraining with a more
advanced concept drift detection method that triggers retraining only when
necessary. In the second experiment, we analyze sampling methods to reduce the
amount of data used to retrain models. Specifically, we compare fixed-size
windows of recent data and state-of-the-art active learning methods that select
the apps that keep the training dataset small but diverse. Our
experiments show that concept drift detection and sample selection mechanisms
result in very efficient retraining strategies that can be successfully used
to maintain the performance of state-of-the-art static Android malware
detectors in changing environments.
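To make the two retraining dimensions concrete, here is a minimal, self-contained sketch, not the paper's actual pipeline: a hypothetical detector is monitored on each new batch of apps, a simple error-rate threshold stands in for the concept drift test, and a fixed-size window of recent samples stands in for the sample selection step. The model, the synthetic data, the threshold and the window size are all illustrative placeholders.

```python
# Minimal sketch (assumptions, not the paper's method): drift-triggered
# retraining on a fixed-size window of recent samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_batch(month, n=200, drift_at=6):
    """Synthetic monthly batch whose feature/label relation flips after `drift_at`."""
    X = rng.normal(size=(n, 10))
    w = np.ones(10) if month < drift_at else -np.ones(10)  # concept change
    y = (X @ w + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

WINDOW = 1000            # max number of samples kept for retraining
DRIFT_THRESHOLD = 0.30   # retrain when the batch error exceeds this

X_win, y_win = make_batch(0)
model = LogisticRegression().fit(X_win, y_win)

for month in range(1, 12):
    X_new, y_new = make_batch(month)
    error = np.mean(model.predict(X_new) != y_new)   # monitor error on new apps

    # Sample selection: keep only the most recent WINDOW samples.
    X_win = np.vstack([X_win, X_new])[-WINDOW:]
    y_win = np.concatenate([y_win, y_new])[-WINDOW:]

    if error > DRIFT_THRESHOLD:                      # drift signal -> retrain
        model = LogisticRegression().fit(X_win, y_win)
        print(f"month {month:2d}: error={error:.2f} -> retrained on {len(y_win)} samples")
    else:
        print(f"month {month:2d}: error={error:.2f} -> model kept")
```

An active-learning variant of the selection step would rank the pooled samples by model uncertainty (for example, predicted probability closest to 0.5) and keep the most informative ones rather than simply the most recent.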
Related papers
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC)
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and Imagenet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- Continuous Learning for Android Malware Detection [15.818435778629635]
We propose a new hierarchical contrastive learning scheme, and a new sample selection technique to continuously train the Android malware classifier.
Our approach reduces the false negative rate from 14% (for the best baseline) to 9%, while also reducing the false positive rate (from 0.86% to 0.48%).
arXiv Detail & Related papers (2023-02-08T20:54:11Z)
- A Data-Centric Approach for Improving Adversarial Training Through the Lens of Out-of-Distribution Detection [0.4893345190925178]
We propose detecting and removing hard samples directly from the training procedure rather than applying complicated algorithms to mitigate their effects.
Our results on SVHN and CIFAR-10 datasets show the effectiveness of this method in improving the adversarial training without adding too much computational cost.
arXiv Detail & Related papers (2023-01-25T08:13:50Z)
- Multi-dataset Training of Transformers for Robust Action Recognition [75.5695991766902]
We study the task of learning robust feature representations, aiming to generalize well across multiple datasets for action recognition.
Here, we propose a novel multi-dataset training paradigm, MultiTrain, with the design of two new loss terms, namely informative loss and projection loss.
We verify the effectiveness of our method on five challenging datasets, Kinetics-400, Kinetics-700, Moments-in-Time, Activitynet and Something-something-v2.
arXiv Detail & Related papers (2022-09-26T01:30:43Z)
- Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
- Self-supervised Transformer for Deepfake Detection [112.81127845409002]
Deepfake techniques in real-world scenarios require face forgery detectors with stronger generalization abilities.
Inspired by transfer learning, neural networks pre-trained on other large-scale face-related tasks may provide useful features for deepfake detection.
In this paper, we propose a self-supervised transformer based audio-visual contrastive learning method.
arXiv Detail & Related papers (2022-03-02T17:44:40Z)
- Improving Variational Autoencoder based Out-of-Distribution Detection for Embedded Real-time Applications [2.9327503320877457]
Out-of-distribution (OoD) detection is an emerging approach for identifying out-of-distribution samples in real time.
In this paper, we show how we can robustly detect hazardous motion around autonomous driving agents.
Our methods significantly improve the detection of OoD factors in unique driving scenarios, 42% better than state-of-the-art approaches.
Our model also generalizes near-perfectly, 97% better than the state of the art, across the real-world and simulated driving datasets used in the experiments.
arXiv Detail & Related papers (2021-07-25T07:52:53Z)
- Improving Botnet Detection with Recurrent Neural Network and Transfer Learning [5.602292536933117]
Botnet detection is a critical step in stopping the spread of botnets and preventing malicious activities.
Recent approaches employing machine learning (ML) have shown improved performance over earlier ones.
We propose a novel botnet detection method built upon a Recurrent Variational Autoencoder (RVAE).
arXiv Detail & Related papers (2021-04-26T14:05:01Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
- An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation [1.433758865948252]
We propose a new formalism of knowledge distillation for regression problems.
First, we propose a new loss function, teacher outlier loss rejection, which rejects outliers in training samples using teacher model predictions.
By considering a multi-task network, the training of the student models' feature extraction becomes more effective.
arXiv Detail & Related papers (2020-02-28T08:46:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.