Flurry: a Fast Framework for Reproducible Multi-layered Provenance Graph
Representation Learning
- URL: http://arxiv.org/abs/2203.02744v1
- Date: Sat, 5 Mar 2022 13:52:11 GMT
- Title: Flurry: a Fast Framework for Reproducible Multi-layered Provenance Graph
Representation Learning
- Authors: Maya Kapoor, Joshua Melton, Michael Ridenhour, Mahalavanya Sriram,
Thomas Moyer, Siddharth Krishnan
- Abstract summary: Flurry is an end-to-end data pipeline which simulates cyberattacks.
It captures data from these attacks at multiple system and application layers, converts audit logs from these attacks into data provenance graphs, and incorporates this data with a framework for training deep neural models.
We showcase this pipeline by processing data from multiple system attacks and performing anomaly detection via graph classification.
- Score: 0.44040106718326594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex heterogeneous dynamic networks like knowledge graphs are powerful
constructs that can be used in modeling data provenance from computer systems.
From a security perspective, these attributed graphs enable causality analysis
and tracing for analyzing a myriad of cyberattacks. However, there is a paucity
in systematic development of pipelines that transform system executions and
provenance into usable graph representations for machine learning tasks. This
lack of instrumentation severely inhibits scientific advancement in provenance
graph machine learning by hindering reproducibility and limiting the
availability of data that are critical for techniques like graph neural
networks. To fulfill this need, we present Flurry, an end-to-end data pipeline
which simulates cyberattacks, captures provenance data from these attacks at
multiple system and application layers, converts audit logs from these attacks
into data provenance graphs, and incorporates this data with a framework for
training deep neural models that supports preconfigured or custom-designed
models for analysis in real-world resilient systems. We showcase this pipeline
by processing data from multiple system attacks and performing anomaly
detection via graph classification using current benchmark graph
representational learning frameworks. Flurry provides a fast, customizable,
extensible, and transparent solution for providing this much needed data to
cybersecurity professionals.
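As a rough illustration of the step the abstract describes (audit logs converted into provenance graphs, then passed to graph-level learning), the following minimal sketch builds a typed directed graph from a few records and derives simple graph-level counts. It is not Flurry's code or API: the record fields (`subject`, `object`, `action`, and their types), the `ProvenanceGraph` class, and the count-based features are all assumptions made for this example, and a real pipeline would feed the graph to a graph neural network rather than to hand-built counts.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ProvenanceGraph:
    """Directed graph with typed nodes and typed edges (hypothetical structure)."""
    node_types: dict = field(default_factory=dict)  # node id -> node type
    edges: list = field(default_factory=list)       # (source, target, edge type)

    def add_node(self, node_id, node_type):
        # Keep the first type seen for a node id.
        self.node_types.setdefault(node_id, node_type)

    def add_edge(self, source, target, edge_type):
        self.edges.append((source, target, edge_type))


def build_graph(records):
    """Turn audit records (assumed format) into a provenance graph."""
    graph = ProvenanceGraph()
    for rec in records:
        graph.add_node(rec["subject"], rec["subject_type"])
        graph.add_node(rec["object"], rec["object_type"])
        graph.add_edge(rec["subject"], rec["object"], rec["action"])
    return graph


def graph_features(graph):
    """Count node and edge types as a simple graph-level summary."""
    node_counts = Counter(graph.node_types.values())
    edge_counts = Counter(edge_type for _, _, edge_type in graph.edges)
    return {"nodes": dict(node_counts), "edges": dict(edge_counts)}


# Made-up records for illustration only; not output from any real audit tool.
records = [
    {"subject": "pid:101", "subject_type": "process",
     "object": "/etc/passwd", "object_type": "file", "action": "read"},
    {"subject": "pid:101", "subject_type": "process",
     "object": "10.0.0.5:443", "object_type": "socket", "action": "connect"},
    {"subject": "pid:101", "subject_type": "process",
     "object": "pid:102", "object_type": "process", "action": "fork"},
]

graph = build_graph(records)
features = graph_features(graph)
print(features)
```

One such summary per captured graph could serve as input to a baseline classifier that labels a graph as benign or anomalous, which is the graph classification framing the abstract mentions.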
Related papers
- Incremental Learning with Concept Drift Detection and Prototype-based Embeddings for Graph Stream Classification [11.811637154674939]
This work introduces a novel method for graph stream classification.
It operates under the general setting where a data generating process produces graphs with varying nodes and edges over time.
It incorporates a loss-based concept drift detection mechanism to recalculate graph prototypes when drift is detected.
arXiv Detail & Related papers (2024-04-03T08:47:32Z)
- GraphGuard: Detecting and Counteracting Training Data Misuse in Graph Neural Networks [69.97213941893351]
The emergence of Graph Neural Networks (GNNs) in graph data analysis has raised critical concerns about data misuse during model training.
Existing methodologies address either data misuse detection or mitigation, and are primarily designed for local GNN models.
This paper introduces a pioneering approach called GraphGuard to tackle these challenges.
arXiv Detail & Related papers (2023-12-13T02:59:37Z)
- Stepping out of Flatland: Discovering Behavior Patterns as Topological Structures in Cyber Hypergraphs [0.7835894511242797]
We present a novel framework based in the theory of hypergraphs and topology to understand data from cyber networks.
We will demonstrate concrete examples in a large-scale cyber network dataset.
arXiv Detail & Related papers (2023-11-08T00:00:33Z)
- InVAErt networks: a data-driven framework for model synthesis and identifiability analysis [0.0]
inVAErt is a framework for data-driven analysis and synthesis of physical systems.
It uses a deterministic decoder to represent the forward and inverse maps, a normalizing flow to capture the probabilistic distribution of system outputs, and a variational encoder to learn a compact latent representation for the lack of bijectivity between inputs and outputs.
arXiv Detail & Related papers (2023-07-24T07:58:18Z)
- Graph Neural Networks with Trainable Adjacency Matrices for Fault Diagnosis on Multivariate Sensor Data [69.25738064847175]
It is necessary to consider the behavior of the signals in each sensor separately, to take into account their correlation and hidden relationships with each other.
The graph nodes can be represented as data from the different sensors, and the edges can display the influence of these data on each other.
The paper proposes constructing the graph during the training of the graph neural network. This allows models to be trained on data where the dependencies between the sensors are not known in advance.
arXiv Detail & Related papers (2022-10-20T11:03:21Z)
- Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem.
In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN).
GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
arXiv Detail & Related papers (2022-05-19T14:08:15Z)
- A Computational Framework for Modeling Complex Sensor Network Data Using Graph Signal Processing and Graph Neural Networks in Structural Health Monitoring [0.7519872646378835]
We present a framework based on Complex Network Modeling, integrating Graph Signal Processing (GSP) and Graph Neural Network (GNN) approaches.
We focus on a prominent real-world structural health monitoring use case, i.e., modeling and analyzing sensor data (strain, vibration) of a large bridge in the Netherlands.
arXiv Detail & Related papers (2021-05-01T10:45:57Z)
- SpikE: spike-based embeddings for multi-relational graph data [0.0]
Spiking neural networks are still mostly applied to tasks stemming from sensory processing.
A rich data representation that finds wide application in industry and research is the so-called knowledge graph.
We propose a spike-based algorithm where nodes in a graph are represented by single spike times of neuron populations.
arXiv Detail & Related papers (2021-04-27T18:00:12Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z)
- GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems.
We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework.
We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.