Predicting Process Name from Network Data
- URL: http://arxiv.org/abs/2109.03328v1
- Date: Fri, 3 Sep 2021 20:15:34 GMT
- Title: Predicting Process Name from Network Data
- Authors: Justin Allen, David Knapp, Kristine Monteith
- Abstract summary: We report on a machine learning technique capable of using netflow-like features to predict the application that generated the traffic.
In our experiments, we used ground-truth labels obtained from host-based sensors deployed in a large enterprise environment.
We demonstrate how machine learning models can achieve high classification accuracy using only netflow-like features as the basis for classification.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to identify applications based on the network data they generate
could be a valuable tool for cyber defense. We report on a machine learning
technique capable of using netflow-like features to predict the application
that generated the traffic. In our experiments, we used ground-truth labels
obtained from host-based sensors deployed in a large enterprise environment; we
applied random forests and multilayer perceptrons to the tasks of browser vs.
non-browser identification, browser fingerprinting, and process name
prediction. For each of these tasks, we demonstrate how machine learning models
can achieve high classification accuracy using only netflow-like features as
the basis for classification.
Related papers
- Unveiling the Digital Fingerprints: Analysis of Internet attacks based on website fingerprints [0.0]
We show that, using the newest machine learning algorithms, an attacker can deanonymize Tor traffic by applying such techniques.
We capture network packets across 11 days while users navigate specific web pages, recording data in .pcapng format with the Wireshark network capture tool.
arXiv Detail & Related papers (2024-09-01T18:44:40Z)
- Locality Sensitive Hashing for Network Traffic Fingerprinting [5.062312533373298]
We use locality-sensitive hashing (LSH) for network traffic fingerprinting.
Our method improves on state-of-the-art accuracy by 12%, achieving around 94% accuracy in identifying devices in a network.
arXiv Detail & Related papers (2024-02-12T21:14:37Z)
- GROWN+UP: A Graph Representation Of a Webpage Network Utilizing Pre-training [0.2538209532048866]
We introduce an agnostic deep graph neural network feature extractor that can ingest webpage structures, pre-train self-supervised on massive unlabeled data, and fine-tune to arbitrary tasks on webpages effectually.
We show that our pre-trained model achieves state-of-the-art results using multiple datasets on two very different benchmarks: webpage boilerplate removal and genre classification.
arXiv Detail & Related papers (2022-08-03T13:37:27Z)
- AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning [69.47585818994959]
We evaluate a big data processing pipeline to auto-generate labels for remote sensing data.
We utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas.
arXiv Detail & Related papers (2022-01-31T20:02:22Z)
- Temporal Graph Network Embedding with Causal Anonymous Walks Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
We provide a benchmark pipeline for the evaluation of temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z)
- Deep Learning for Network Traffic Classification [0.0]
Monitoring network traffic to identify content, services, and applications is an active research topic in network traffic control systems.
Previous work has identified machine learning methods that may enable application and service identification.
We propose a classification technique using an ensemble of deep learning architectures on packet, payload, and inter-arrival time sequences.
arXiv Detail & Related papers (2021-06-02T04:11:32Z)
- Forensicability of Deep Neural Network Inference Pipelines [68.8204255655161]
We propose methods to infer properties of the execution environment of machine learning pipelines by tracing characteristic numerical deviations in observable outputs.
Results from a series of proof-of-concept experiments give rise to possible forensic applications, such as the identification of the hardware platform used to produce deep neural network predictions.
arXiv Detail & Related papers (2021-02-01T15:41:49Z)
- Feature Extraction for Novelty Detection in Network Traffic [18.687465197576415]
Data representation plays a critical role in the performance of novelty detection methods in machine learning.
We release an open-source tool, an accompanying Python library, and an end-to-end pipeline for novelty detection in network traffic.
arXiv Detail & Related papers (2020-06-30T17:53:59Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup that allows a neural network to learn both its size and topology during gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support.
Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space.
It also provides unified interfaces and visualizations for users with or without a data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)
- Key Points Estimation and Point Instance Segmentation Approach for Lane Detection [65.37887088194022]
We propose a traffic line detection method called Point Instance Network (PINet).
The PINet includes several stacked hourglass networks that are trained simultaneously.
The PINet achieves competitive accuracy and false-positive rates on the TuSimple and CULane datasets.
arXiv Detail & Related papers (2020-02-16T15:51:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.