Related papers: Living off the Analyst: Harvesting Features from Yara Rules for Malware Detection

URL: http://arxiv.org/abs/2411.18516v1
Date: Wed, 27 Nov 2024 17:03:00 GMT
Title: Living off the Analyst: Harvesting Features from Yara Rules for Malware Detection
Authors: Siddhant Gupta, Fred Lu, Andrew Barlow, Edward Raff, Francis Ferraro, Cynthia Matuszek, Charles Nicholas, James Holt
Abstract summary: A strategy used by malicious actors is to "live off the land," where benign systems are used and repurposed for the malicious actor's intent. We show that this is plausible via YARA rules, which use human-written signatures to detect specific malware families. By extracting sub-signatures from publicly available YARA rules, we assembled a set of features that can more effectively discriminate malicious samples.
Score: 50.55317257140427
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A strategy used by malicious actors is to "live off the land," where benign systems and tools already available on a victim's systems are used and repurposed for the malicious actor's intent. In this work, we ask if there is a way for anti-virus developers to similarly re-purpose existing work to improve their malware detection capability. We show that this is plausible via YARA rules, which use human-written signatures to detect specific malware families, functionalities, or other markers of interest. By extracting sub-signatures from publicly available YARA rules, we assembled a set of features that can more effectively discriminate malicious samples from benign ones. Our experiments demonstrate that these features add value beyond traditional features on the EMBER 2018 dataset. Manual analysis of the added sub-signatures shows a power-law behavior in a combination of features that are specific and unique, as well as features that occur often. A prior expectation may be that the features would be limited in being overly specific to unique malware families. This behavior is observed, and is apparently useful in practice. In addition, we also find sub-signatures that are dual-purpose (e.g., detecting virtual machine environments) or broadly generic (e.g., DLL imports).
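
The idea of harvesting sub-signatures as features can be illustrated with a minimal sketch. This is not the authors' pipeline; it only assumes that each string defined in a rule's `strings:` section is treated as one binary feature (present or absent in a file). The toy rule, the regular expressions, and the function names `extract_subsignatures` and `featurize` are hypothetical, and the parser handles only plain text strings and simple hex strings, without wildcards, jumps, regular expressions, escape decoding, or the `wide` modifier.

```python
import re

# Hypothetical toy rule, written for illustration only (not taken from the paper).
TOY_RULE = r'''
rule Toy_Example
{
    strings:
        $s1 = "VBoxGuest" ascii
        $s2 = "kernel32.dll" nocase
        $h1 = { 4D 5A 90 00 }
    condition:
        any of them
}
'''

# Plain text strings: $name = "text" followed by optional modifiers on the same line.
TEXT_RE = re.compile(r'(\$\w+)\s*=\s*"((?:[^"\\]|\\.)*)"([^\n]*)')
# Simple hex strings without wildcards or jumps: $name = { 4D 5A }
HEX_RE = re.compile(r'(\$\w+)\s*=\s*\{\s*((?:[0-9A-Fa-f]{2}\s*)+)\}')


def extract_subsignatures(rule_text):
    """Return (name, byte pattern, nocase flag) tuples for one rule's strings."""
    subs = []
    for name, text, modifiers in TEXT_RE.findall(rule_text):
        # Simplification: escape sequences inside the string are not decoded.
        subs.append((name, text.encode("latin-1"), "nocase" in modifiers))
    for name, hex_body in HEX_RE.findall(rule_text):
        subs.append((name, bytes.fromhex(hex_body), False))
    return subs


def featurize(data, subs):
    """Map raw file bytes to a 0/1 vector, one entry per sub-signature."""
    lowered = data.lower()
    features = []
    for _name, pattern, nocase in subs:
        if nocase:
            features.append(int(pattern.lower() in lowered))
        else:
            features.append(int(pattern in data))
    return features


subs = extract_subsignatures(TOY_RULE)
sample = b"MZ\x90\x00 ... KERNEL32.DLL ..."
print([name for name, _, _ in subs])
print(featurize(sample, subs))
```

A vector like this could be concatenated with other static features before training a classifier. Scoring each sub-signature separately, rather than the rule's full `condition`, lets a model use partial matches that the complete rule would reject.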

Related papers

Malware Detection based on API calls [0.48866322421122627]
We explore a lightweight, order-invariant approach to detecting and mitigating malware threats. We publish a public dataset of over three hundred thousand samples, annotated with labels indicating benign or malicious activity. We leverage machine learning algorithms, such as random forests, and conduct behavioral analysis by examining patterns and anomalies in API call sequences.
arXiv Detail & Related papers (2025-02-18T13:51:56Z)
Adversarial Suffixes May Be Features Too! [10.463762448166714]
We show that adversarial suffixes generated from jailbreak attacks may contain meaningful features. This highlights the critical risk posed by dominating benign features in the training data.
arXiv Detail & Related papers (2024-10-01T07:11:55Z)
MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware. We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph. This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
Semantic Data Representation for Explainable Windows Malware Detection Models [0.0]
We propose PE Malware Ontology that offers a reusable semantic schema for Portable Executable (PE - the Windows binary format) malware files. This ontology is inspired by the structure of the EMBER dataset, which focuses on the static malware analysis of PE files. We also publish semantically treated EMBER data, including fractional datasets, to support experiments on EMBER.
arXiv Detail & Related papers (2024-03-18T11:17:27Z)
Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification, in which the model owner watermarks the model, has recently become popular. We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior. Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
MDENet: Multi-modal Dual-embedding Networks for Malware Open-set Recognition [17.027132477210092]
We propose the Multi-modal Dual-Embedding Networks, dubbed MDENet, to take advantage of comprehensive malware features. We also enrich our previously proposed large-scale malware dataset MAL-100 with multi-modal characteristics.
arXiv Detail & Related papers (2023-05-02T08:09:51Z)
Explainable Abuse Detection as Intent Classification and Slot Filling [66.80201541759409]
We introduce the concept of policy-aware abuse detection, abandoning the unrealistic expectation that systems can reliably learn which phenomena constitute abuse from inspecting the data alone. We show how architectures for intent classification and slot filling can be used for abuse detection, while providing a rationale for model decisions.
arXiv Detail & Related papers (2022-10-06T03:33:30Z)
Mate! Are You Really Aware? An Explainability-Guided Testing Framework for Robustness of Malware Detectors [49.34155921877441]
We propose an explainability-guided and model-agnostic testing framework for robustness of malware detectors. We then use this framework to test several state-of-the-art malware detectors' abilities to detect manipulated malware. Our findings shed light on the limitations of current malware detectors, as well as how they can be improved.
arXiv Detail & Related papers (2021-11-19T08:02:38Z)
Evaluation of an Anomaly Detector for Routers using Parameterizable Malware in an IoT Ecosystem [3.495114525631289]
This IoT Ecosystem was developed as a testbed to evaluate the efficacy of a behavior-based anomaly detector. The malware consists of three types of custom-made malware: ransomware, cryptominer, and keylogger. The anomaly detector uses feature sets crafted from system calls and network traffic, and uses a Support Vector Machine for behavioral-based anomaly detection.
arXiv Detail & Related papers (2021-10-29T21:57:54Z)
Being Single Has Benefits. Instance Poisoning to Deceive Malware Classifiers [47.828297621738265]
We show how an attacker can launch a sophisticated and efficient poisoning attack targeting the dataset used to train a malware classifier. As opposed to other poisoning attacks in the malware detection domain, our attack does not focus on malware families but rather on specific malware instances that contain an implanted trigger. We propose a comprehensive detection approach that could serve as a future sophisticated defense against this newly discovered severe threat.
arXiv Detail & Related papers (2020-10-30T15:27:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.