A Machine Learning-based Approach to Detect Threats in Bio-Cyber DNA
Storage Systems
- URL: http://arxiv.org/abs/2009.13380v1
- Date: Mon, 28 Sep 2020 14:55:20 GMT
- Title: A Machine Learning-based Approach to Detect Threats in Bio-Cyber DNA
Storage Systems
- Authors: Federico Tavella, Alberto Giaretta, Mauro Conti, Sasitharan
Balasubramaniam
- Abstract summary: We have proposed an automated archival architecture which uses bioengineered bacteria to store and retrieve data, previously encoded into DNA.
The similarities between these biological media and classical ones can also be a drawback, as malicious parties might replicate traditional attacks on the former archival system.
In this paper, first we analyse the main characteristics of our storage system and the different types of attacks that could be executed on it.
Then, aiming at identifying on-going attacks, we propose and evaluate detection techniques, which rely on traditional metrics and machine learning algorithms.
- Score: 20.27498894606937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data storage is one of the main computing issues of this century. Not only
are storage devices converging to strict physical limits, but the amount
of data generated by users is also growing at an unbelievable rate. To face these
challenges, data centres grew constantly over the past decades. However, this
growth comes with a price, particularly from the environmental point of view.
Among various promising media, DNA is one of the most fascinating candidates. In
our previous work, we have proposed an automated archival architecture which
uses bioengineered bacteria to store and retrieve data, previously encoded into
DNA. This storage technique is one example of how biological media can deliver
power-efficient storing solutions. The similarities between these biological
media and classical ones can also be a drawback, as malicious parties might
replicate traditional attacks on the former archival system, using biological
instruments and techniques. In this paper, first we analyse the main
characteristics of our storage system and the different types of attacks that
could be executed on it. Then, aiming at identifying on-going attacks, we
propose and evaluate detection techniques, which rely on traditional metrics
and machine learning algorithms. We identify and adapt two suitable metrics for
this purpose, namely generalized entropy and information distance. Moreover,
our trained models achieve an AUROC over 0.99 and AUPRC over 0.91.
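
The abstract names generalized entropy and information distance but does not define them. The sketch below assumes the Rényi forms commonly used in the attack-detection literature: entropy of order alpha, and a symmetric sum of Rényi divergences as the distance. The function names, the four-letter nucleotide alphabet, and the choice of per-symbol frequencies as the input distribution are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def distribution(sequence, alphabet="ACGT"):
    """Relative frequency of each alphabet symbol in the sequence (assumed input)."""
    counts = Counter(sequence)
    total = sum(counts[s] for s in alphabet)
    if total == 0:
        raise ValueError("sequence contains no symbols from the alphabet")
    return [counts[s] / total for s in alphabet]


def generalized_entropy(p, alpha):
    """Renyi entropy of order alpha, in bits; alpha == 1 falls back to Shannon."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if math.isclose(alpha, 1.0):
        return -sum(x * math.log2(x) for x in p if x > 0)
    return math.log2(sum(x ** alpha for x in p if x > 0)) / (1 - alpha)


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q), in bits, for alpha > 0 and alpha != 1."""
    if alpha <= 0 or math.isclose(alpha, 1.0):
        raise ValueError("alpha must be positive and different from 1")
    total = 0.0
    for x, y in zip(p, q):
        if x == 0:
            continue  # the term vanishes when p has no mass here
        if y == 0:
            if alpha > 1:
                return math.inf  # p has mass where q has none
            continue  # the term vanishes for alpha < 1
        total += x ** alpha * y ** (1 - alpha)
    if total == 0:
        return math.inf  # disjoint supports
    return math.log2(total) / (alpha - 1)


def information_distance(p, q, alpha):
    """Symmetric distance: D_alpha(p || q) + D_alpha(q || p)."""
    return renyi_divergence(p, q, alpha) + renyi_divergence(q, p, alpha)


if __name__ == "__main__":
    baseline = distribution("ACGTACGT")   # balanced reference sample
    observed = distribution("AAAAACGT")   # skewed sample
    print(generalized_entropy(baseline, alpha=2))
    print(generalized_entropy(observed, alpha=2))
    print(information_distance(baseline, observed, alpha=2))
```

Under these assumptions, a detector would compare the entropy of an observed sample, or its distance from a known-good baseline, against a threshold. How the paper actually builds its distributions and sets its thresholds is not stated in the abstract.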
Related papers
- An advanced data fabric architecture leveraging homomorphic encryption
and federated learning [10.779491433438144]
This paper introduces a secure approach for medical image analysis using federated learning and partially homomorphic encryption within a distributed data fabric architecture.
The study demonstrates the method's effectiveness through a case study on pituitary tumor classification, achieving a significant level of accuracy.
(2024-02-15T08:50:36Z)
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
(2023-10-17T01:05:28Z)
- How Does Generative Retrieval Scale to Millions of Passages? [68.98628807288972]
We conduct the first empirical study of generative retrieval techniques across various corpus scales.
We scale generative retrieval to a corpus of 8.8M passages and evaluate model sizes up to 11B parameters.
While generative retrieval is competitive with state-of-the-art dual encoders on small corpora, scaling to millions of passages remains an important and unsolved challenge.
(2023-05-19T17:33:38Z)
- Deep metric learning improves lab of origin prediction of genetically
engineered plasmids [63.05016513788047]
Genetic engineering attribution (GEA) is the ability to make sequence-lab associations.
We propose a method, based on metric learning, that ranks the most likely labs-of-origin.
We are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.
(2021-11-24T16:29:03Z)
- Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or
Something Else? [93.91375268580806]
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms.
Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications.
By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system.
(2021-11-09T13:30:34Z)
- Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and
Deep Learning [49.3231734733112]
We show a modular and holistic approach that combines Deep Neural Networks (DNN) trained on simulated data, Tensor-Product (TP) based Error-Correcting Codes (ECC), and a safety margin into a single coherent pipeline.
Our work improves upon the current leading solutions with up to a 3200x increase in speed and a 40% improvement in accuracy, and offers a code rate of 1.6 bits per base in a high-noise regime.
(2021-08-31T18:21:20Z)
- TELESTO: A Graph Neural Network Model for Anomaly Classification in
Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance.
One direction aims at the recognition of re-occurring anomaly types to enable remediation automation.
We propose a method that is invariant to dimensionality changes of given data.
(2021-02-25T14:24:49Z)
- A Compact Deep Learning Model for Face Spoofing Detection [4.250231861415827]
Presentation attack detection (PAD) has received significant attention from research communities.
We address the problem via fusing both wide and deep features in a unified neural architecture.
The procedure is done on different spoofing datasets such as ROSE-Youtu, SiW and NUAA Imposter.
(2021-01-12T21:20:09Z)
- Deep Learning based Covert Attack Identification for Industrial Control
Systems [5.299113288020827]
We develop a data-driven framework that can be used to detect, diagnose, and localize a type of cyberattack called covert attacks on smart grids.
The framework has a hybrid design that combines an autoencoder, a recurrent neural network (RNN) with a Long-Short-Term-Memory layer, and a Deep Neural Network (DNN).
(2020-09-25T17:48:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.