Sequence Feature Extraction for Malware Family Analysis via Graph Neural
Network
- URL: http://arxiv.org/abs/2208.05476v1
- Date: Wed, 10 Aug 2022 07:31:44 GMT
- Title: Sequence Feature Extraction for Malware Family Analysis via Graph Neural
Network
- Authors: S. W. Hsiao and P. Y. Chu
- Abstract summary: We design and implement an Attention Aware Graph Neural Network (AWGCN) to analyze the API call sequences.
Through AWGCN, we can obtain the sequence embeddings to analyze the behavior of the malware.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Malicious software (malware) causes considerable harm to our devices and daily
lives. We are eager to understand malware behavior and the threats it poses. Most
malware record files are variable-length, text-based files with time
stamps, such as event log data and dynamic analysis profiles. Using the time
stamps, we can sort such data into sequences for subsequent
analysis. However, handling text-based sequences of variable length
is difficult. In addition, unlike natural language text, most sequential
data in information security have specific properties and structures, such as
loops, repeated calls, and noise. To analyze API call sequences together with
their structure, we represent the sequences as graphs, which allow further
investigation of their information and structure, for example with a Markov model. Therefore,
we design and implement an Attention Aware Graph Neural Network (AWGCN) to
analyze the API call sequences. Through AWGCN, we can obtain sequence
embeddings to analyze malware behavior. Moreover, the classification
experiment results show that AWGCN outperforms other classifiers on the
call-like datasets, and that the embeddings can further improve the performance
of classic models.
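The idea of representing an API call sequence as a graph can be illustrated with a minimal sketch. This is not the authors' AWGCN; it only shows one plausible way to turn a time-ordered call trace into a weighted directed graph whose edge weights are first-order Markov transition probabilities. The function name `build_transition_graph` and the API names in the trace are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_transition_graph(calls):
    """Map each API call to its successors with empirical transition probabilities.

    `calls` is a time-ordered list of API call names (an assumed input format).
    Returns {source_call: {next_call: probability}}.
    """
    # Count how often each call is immediately followed by each other call.
    counts = defaultdict(Counter)
    for src, dst in zip(calls, calls[1:]):
        counts[src][dst] += 1

    # Normalise each row of counts so outgoing edge weights sum to 1.
    graph = {}
    for src, successors in counts.items():
        total = sum(successors.values())
        graph[src] = {dst: n / total for dst, n in successors.items()}
    return graph


# Hypothetical trace containing a repeated call (WriteFile) and a loop back to CreateFile.
trace = [
    "CreateFile", "WriteFile", "WriteFile", "WriteFile",
    "CloseHandle", "CreateFile", "ReadFile", "CloseHandle",
]
graph = build_transition_graph(trace)

for src, successors in graph.items():
    for dst, prob in successors.items():
        print(f"{src} -> {dst}: {prob:.2f}")
```

In this representation a repeated call becomes a self-loop and a loop in the trace becomes a cycle, so the graph has a fixed number of nodes regardless of how long the sequence is. A graph neural network could then operate on such a graph, though how AWGCN builds its graphs and applies attention is not specified in this summary.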
Related papers
- GenDFIR: Advancing Cyber Incident Timeline Analysis Through Retrieval Augmented Generation and Large Language Models [0.08192907805418582]
Cyber timeline analysis is crucial in Digital Forensics and Incident Response (DFIR).
Traditional methods rely on structured artefacts, such as logs and metadata, for evidence identification and feature extraction.
This paper introduces GenDFIR, a framework leveraging large language models (LLMs), specifically Llama 3.1 8B in zero shot mode, integrated with a Retrieval-Augmented Generation (RAG) agent.
arXiv Detail & Related papers (2024-09-04T09:46:33Z) - Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4 [45.935748395725206]
We introduce a prompt engineering-assisted malware dynamic analysis using GPT-4.
In this method, GPT-4 is employed to create explanatory text for each API call within the API sequence.
BERT is used to obtain the representation of the text, from which we derive the representation of the API sequence.
arXiv Detail & Related papers (2023-12-13T17:39:44Z) - GLAD: Content-aware Dynamic Graphs For Log Anomaly Detection [49.9884374409624]
GLAD is a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
arXiv Detail & Related papers (2023-09-12T04:21:30Z) - Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present code that successfully replicates results from six popular and recent graph recommendation models.
We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations.
By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z) - Evidential Temporal-aware Graph-based Social Event Detection via
Dempster-Shafer Theory [76.4580340399321]
We propose ETGNN, a novel Evidential Temporal-aware Graph Neural Network.
We construct view-specific graphs whose nodes are the texts and edges are determined by several types of shared elements respectively.
Considering the view-specific uncertainty, the representations of all views are converted into mass functions through evidential deep learning (EDL) neural networks.
arXiv Detail & Related papers (2022-05-24T16:22:40Z) - Multivariate Time Series Regression with Graph Neural Networks [0.6124773188525718]
Recent advances in adapting Deep Learning to graphs have shown promising potential in various graph-related tasks.
However, these methods have not been adapted for time series related tasks to a great extent.
In this work, we propose an architecture capable of processing these long sequences in a multivariate time series regression task.
arXiv Detail & Related papers (2022-01-03T16:11:46Z) - Software Vulnerability Detection via Deep Learning over Disaggregated
Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - Temporal Graph Network Embedding with Causal Anonymous Walks
Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
For evaluation, we provide a benchmark pipeline for the evaluation of temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z) - Learning Explainable Representations of Malware Behavior [3.718942345103135]
We develop a neural network that processes network-flow data into comprehensible network events.
We then use the integrated-gradients method to highlight events that jointly constitute the characteristic behavioral pattern of the threat.
We demonstrate how this system detects njRAT and other malware based on behavioral patterns.
arXiv Detail & Related papers (2021-06-23T11:50:57Z) - Time Series is a Special Sequence: Forecasting with Sample Convolution
and Interaction [9.449017120452675]
Time series is a special type of sequence data, a set of observations collected at even intervals of time and ordered chronologically.
Existing deep learning techniques use generic sequence models for time series analysis, which ignore some of its unique properties.
We propose a novel neural network architecture and apply it for the time series forecasting problem, wherein we conduct sample convolution and interaction at multiple resolutions for temporal modeling.
arXiv Detail & Related papers (2021-06-17T08:15:04Z) - Structural Temporal Graph Neural Networks for Anomaly Detection in
Dynamic Graphs [54.13919050090926]
We propose an end-to-end structural temporal Graph Neural Network model for detecting anomalous edges in dynamic graphs.
In particular, we first extract the $h$-hop enclosing subgraph centered on the target edge and propose the node labeling function to identify the role of each node in the subgraph.
Based on the extracted features, we utilize Gated recurrent units (GRUs) to capture the temporal information for anomaly detection.
arXiv Detail & Related papers (2020-05-15T09:17:08Z) - A Graph-Based Platform for Customer Behavior Analysis using
Applications' Clickstream Data [0.0]
Clickstream data can be considered as a sequence of log events collected at different levels of web/app usage.
We show how representing and saving the sequences with their underlying graph structures can induce a platform for customer behavior analysis.
arXiv Detail & Related papers (2020-02-20T13:57:29Z)