Demystifying the Role of Rule-based Detection in AI Systems for Windows Malware Detection
- URL: http://arxiv.org/abs/2508.09652v1
- Date: Wed, 13 Aug 2025 09:35:51 GMT
- Title: Demystifying the Role of Rule-based Detection in AI Systems for Windows Malware Detection
- Authors: Andrea Ponte, Luca Demetrio, Luca Oneto, Ivan Tesfai Ogbu, Battista Biggio, Fabio Roli
- Abstract summary: Malware detection increasingly relies on AI systems that integrate signature-based detection with machine learning. We investigate the influence that signature-based detection exerts on model training, when they are included inside the training pipeline.
- Score: 12.318835339832056
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Malware detection increasingly relies on AI systems that integrate signature-based detection with machine learning. However, these components are typically developed and combined in isolation, missing opportunities to reduce data complexity and strengthen defenses against adversarial EXEmples, carefully crafted programs designed to evade detection. Hence, in this work we investigate the influence that signature-based detection exerts on model training, when they are included inside the training pipeline. Specifically, we compare models trained on a comprehensive dataset with an AI system whose machine learning component is trained solely on samples not already flagged by signatures. Our results demonstrate improved robustness to both adversarial EXEmples and temporal data drift, although this comes at the cost of a fixed lower bound on false positives, driven by suboptimal rule selection. We conclude by discussing these limitations and outlining how future research could extend AI-based malware detection to include dynamic analysis, thereby further enhancing system resilience.
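The comparison described in the abstract, where the machine learning component is trained only on samples that signatures have not already flagged, can be illustrated with a toy sketch. This is not the authors' implementation: the byte-pattern rules, the `feature` function, and the threshold "model" below are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from typing import Callable, List, Sequence, Tuple

Rule = Callable[[bytes], bool]

# Hypothetical signatures: each rule flags a sample containing a byte pattern.
RULES: List[Rule] = [
    lambda s: b"EVIL_MARKER" in s,
    lambda s: b"DROPPER_V2" in s,
]


def flagged_by_signatures(sample: bytes, rules: Sequence[Rule] = RULES) -> bool:
    """Return True if any signature rule matches the sample."""
    return any(rule(sample) for rule in rules)


def filter_training_set(
    samples: Sequence[bytes], labels: Sequence[int], rules: Sequence[Rule] = RULES
) -> Tuple[List[bytes], List[int]]:
    """Keep only the samples that no signature flags."""
    kept = [(s, y) for s, y in zip(samples, labels)
            if not flagged_by_signatures(s, rules)]
    return [s for s, _ in kept], [y for _, y in kept]


def feature(sample: bytes) -> float:
    """Toy feature (an assumption): fraction of non-printable bytes."""
    if not sample:
        return 0.0
    return sum(1 for b in sample if b < 32 or b > 126) / len(sample)


def train_threshold(samples: Sequence[bytes], labels: Sequence[int]) -> float:
    """Toy model: midpoint between the mean feature of each class.

    Assumes both classes are still present after filtering.
    """
    mal = [feature(s) for s, y in zip(samples, labels) if y == 1]
    ben = [feature(s) for s, y in zip(samples, labels) if y == 0]
    return (sum(mal) / len(mal) + sum(ben) / len(ben)) / 2


def predict(sample: bytes, threshold: float, rules: Sequence[Rule] = RULES) -> int:
    """Sequential detection: signatures first, then the learned model."""
    if flagged_by_signatures(sample, rules):
        return 1
    return int(feature(sample) > threshold)


samples = [b"hello EVIL_MARKER world", b"\x00\x01\x02\x03abcd",
           b"plain text file", b"readme contents"]
labels = [1, 1, 0, 0]

train_x, train_y = filter_training_set(samples, labels)
threshold = train_threshold(train_x, train_y)
```

In this sketch the model never sees the sample that a rule already catches, so its decision boundary is fit only to the remaining data. A benign file that matches a rule would still be reported as malicious regardless of the model, which is one way to read the abstract's remark about a fixed lower bound on false positives.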
Related papers
- Semantic-Aware Advanced Persistent Threat Detection Using Autoencoders on LLM-Encoded System Logs [0.7611870296994722]
Advanced Persistent Threats (APTs) are among the most challenging cyberattacks to detect.
Traditional statistical methods and shallow machine learning techniques often fail to detect them.
This paper proposes a novel anomaly detection approach that leverages semantic embeddings.
arXiv Detail & Related papers (2026-01-30T12:38:12Z)
- Towards Robust Artificial Intelligence: Self-Supervised Learning Approach for Out-of-Distribution Detection [0.19599274203282294]
This paper proposes an approach to improve OOD detection without the need of labeled data.
The proposed approach leverages the principles of self-supervised learning, allowing the model to learn useful representations from unlabeled data.
arXiv Detail & Related papers (2025-10-14T16:55:25Z)
- SLIFER: Investigating Performance and Robustness of Malware Detection Pipelines [12.940071285118451]
Academia focuses on combining static and dynamic analysis within a single or ensemble of models.
In this paper, we investigate the properties of malware detectors built with multiple and different types of analysis.
As far as we know, we are the first to investigate the properties of sequential malware detectors, shedding light on their behavior in real production environment.
arXiv Detail & Related papers (2024-05-23T12:06:10Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- Certified Control for Train Sign Classification [0.0]
The KI-LOK research project is involved in developing new methods for certifying such AI-based systems.
Here we explore the utility of a certified control architecture for a runtime monitor that prevents false positive detection of traffic signs.
arXiv Detail & Related papers (2023-11-16T11:02:10Z)
- Few-shot Weakly-supervised Cybersecurity Anomaly Detection [1.179179628317559]
We propose an enhancement to an existing few-shot weakly-supervised deep learning anomaly detection framework.
This framework incorporates data augmentation, representation learning and ordinal regression.
We then evaluated and showed the performance of our implemented framework on three benchmark datasets.
arXiv Detail & Related papers (2023-04-15T04:37:54Z)
- PAC-Based Formal Verification for Out-of-Distribution Data Detection [4.406331747636832]
This study places probably approximately correct (PAC) based guarantees on OOD detection using the encoding process within VAEs.
It is used to bound the detection error on unfamiliar instances with user-defined confidence.
arXiv Detail & Related papers (2023-04-04T07:33:02Z)
- TAD: Transfer Learning-based Multi-Adversarial Detection of Evasion Attacks against Network Intrusion Detection Systems [0.7829352305480285]
We implement existing state-of-the-art models for intrusion detection.
We then attack those models with a set of chosen evasion attacks.
In an attempt to detect those adversarial attacks, we design and implement multiple transfer learning-based adversarial detectors.
arXiv Detail & Related papers (2022-10-27T18:02:58Z)
- Neurosymbolic hybrid approach to driver collision warning [64.02492460600905]
There are two main algorithmic approaches to autonomous driving systems.
Deep learning alone has achieved state-of-the-art results in many areas.
But sometimes it can be very difficult to debug if the deep learning model doesn't work.
arXiv Detail & Related papers (2022-03-28T20:29:50Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- No Need to Know Physics: Resilience of Process-based Model-free Anomaly Detection for Industrial Control Systems [95.54151664013011]
We present a novel framework to generate adversarial spoofing signals that violate physical properties of the system.
We analyze four anomaly detectors published at top security conferences.
arXiv Detail & Related papers (2020-12-07T11:02:44Z)
- A Novel Anomaly Detection Algorithm for Hybrid Production Systems based on Deep Learning and Timed Automata [73.38551379469533]
DAD:DeepAnomalyDetection is a new approach for automatic model learning and anomaly detection in hybrid production systems.
It combines deep learning and timed automata to create a behavioral model from observations.
The algorithm has been applied to a few data sets, including two from real systems, and has shown promising results.
arXiv Detail & Related papers (2020-10-29T08:27:43Z)
- Adversarial vs behavioural-based defensive AI with joint, continual and active learning: automated evaluation of robustness to deception, poisoning and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.