Implicit Neural Multiple Description for DNA-based data storage
- URL: http://arxiv.org/abs/2309.06956v1
- Date: Wed, 13 Sep 2023 13:42:52 GMT
- Title: Implicit Neural Multiple Description for DNA-based data storage
- Authors: Trung Hieu Le, Xavier Pic, Jeremy Mateos and Marc Antonini
- Abstract summary: DNA exhibits remarkable potential as a data storage solution due to its impressive storage density and long-term stability.
However, developing this novel medium comes with its own set of challenges, particularly in addressing errors arising from storage and biological manipulations.
We have pioneered a novel compression scheme and a cutting-edge Multiple Description Coding (MDC) technique utilizing neural networks for DNA data storage.
- Score: 6.423239719448169
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: DNA exhibits remarkable potential as a data storage solution due to its
impressive storage density and long-term stability, stemming from its inherent
biomolecular structure. However, developing this novel medium comes with its
own set of challenges, particularly in addressing errors arising from storage
and biological manipulations. These challenges are further conditioned by the
structural constraints of DNA sequences and cost considerations. In response to
these limitations, we have pioneered a novel compression scheme and a
cutting-edge Multiple Description Coding (MDC) technique utilizing neural
networks for DNA data storage. Our MDC method introduces an innovative approach
to encoding data into DNA, specifically designed to withstand errors
effectively. Notably, our new compression scheme outperforms classic image
compression methods for DNA-data storage. Furthermore, our approach exhibits
superiority over conventional MDC methods reliant on auto-encoders. Its
distinctive strengths lie in its ability to bypass the need for extensive model
training and its enhanced adaptability for fine-tuning redundancy levels.
Experimental results demonstrate that our solution competes favorably with the
latest DNA data storage methods in the field, offering superior compression
rates and robust noise resilience.
Related papers
- Unlocking Potential Binders: Multimodal Pretraining DEL-Fusion for Denoising DNA-Encoded Libraries [51.72836644350993]
Multimodal Pretraining DEL-Fusion model (MPDF)
We develop pretraining tasks applying contrastive objectives between different compound representations and their text descriptions.
We propose a novel DEL-fusion framework that amalgamates compound information at the atomic, submolecular, and molecular levels.
arXiv Detail & Related papers (2024-09-07T17:32:21Z)
- ADRS-CNet: An adaptive dimensionality reduction selection and classification network for DNA storage clustering algorithms [8.295062627879938]
Methods like PCA, UMAP, and t-SNE are commonly employed to project high-dimensional features into low-dimensional space.
This paper proposes training a multilayer perceptron model to classify input DNA sequence features and adaptively select the most suitable dimensionality reduction method.
arXiv Detail & Related papers (2024-08-22T22:26:41Z)
- Efficient Automation of Neural Network Design: A Survey on Differentiable Neural Architecture Search [70.31239620427526]
Differentiable Neural Architecture Search (DNAS) rapidly imposed itself as the trending approach to automate the discovery of deep neural network architectures.
This rise is mainly due to the popularity of DARTS, one of the first major DNAS methods.
In this comprehensive survey, we focus specifically on DNAS and review recent approaches in this field.
arXiv Detail & Related papers (2023-04-11T13:15:29Z)
- Image Storage on Synthetic DNA Using Autoencoders [6.096779295981377]
This paper presents some results on lossy image compression methods based on convolutional autoencoders adapted to DNA data storage.
The model architectures presented here have been designed to efficiently compress images, encode them into a quaternary code, and finally store them into synthetic DNA molecules.
arXiv Detail & Related papers (2022-03-18T14:17:48Z)
- Single-Read Reconstruction for DNA Data Storage Using Transformers [0.0]
We propose a novel approach for single-read reconstruction using an encoder-decoder Transformer architecture for DNA based data storage.
Our model achieves lower error rates when reconstructing the original data from a single read of each DNA strand.
This is the first demonstration of using deep learning models for single-read reconstruction in DNA based storage.
arXiv Detail & Related papers (2021-09-12T10:01:59Z)
- Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning [49.3231734733112]
We show a modular and holistic approach that combines Deep Neural Networks (DNN) trained on simulated data, Tensor-Product (TP) based Error-Correcting Codes (ECC), and a safety margin into a single coherent pipeline.
Our work improves upon the current leading solutions with up to a 3200x increase in speed and a 40% improvement in accuracy, and offers a code rate of 1.6 bits per base in a high-noise regime.
arXiv Detail & Related papers (2021-08-31T18:21:20Z)
- Brain Image Synthesis with Unsupervised Multivariate Canonical CSC$\ell_4$Net [122.8907826672382]
We propose to learn dedicated features that cross both inter- and intra-modal variations using a novel CSC$\ell_4$Net.
arXiv Detail & Related papers (2021-03-22T05:19:40Z)
- Efficient approximation of DNA hybridisation using deep learning [0.0]
We present the first comprehensive study of machine learning methods applied to the task of predicting DNA hybridisation.
We introduce a synthetic hybridisation dataset of over 2.5 million data points, enabling the use of a wide range of machine learning algorithms.
arXiv Detail & Related papers (2021-02-19T19:23:49Z)
- Neural Network Compression for Noisy Storage Devices [71.4102472611862]
Conventionally, model compression and physical storage are decoupled.
This approach forces the storage to treat each bit of the compressed model equally, and to dedicate the same amount of resources to each bit.
We propose a radically different approach that: (i) employs analog memories to maximize the capacity of each memory cell, and (ii) jointly optimizes model compression and physical storage to maximize memory utility.
arXiv Detail & Related papers (2021-02-15T18:19:07Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)