Apply Distributed CNN on Genomics to accelerate Transcription-Factor TAL1 Motif Prediction
- URL: http://arxiv.org/abs/2405.16097v1
- Date: Sat, 25 May 2024 07:09:44 GMT
- Title: Apply Distributed CNN on Genomics to accelerate Transcription-Factor TAL1 Motif Prediction
- Authors: Tasnim Assali, Zayneb Trabelsi Ayoub, Sofiane Ouni
- Abstract summary: We highlight the potential of deep learning in the field of genomics and its challenges such as the training time that takes hours, weeks, and in some cases months.
We propose to apply a distributed deep learning implementation based on Convolutional Neural Networks (CNN) that showed good results in decreasing the training time.
We proved the efficiency of using a distributed strategy based on data-parallelism in predicting the transcription-factor TAL1 motif faster.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Big Data works perfectly along with Deep learning to extract knowledge from a huge amount of data. However, this processing could take a lot of training time. Genomics is a Big Data science with high dimensionality. It relies on deep learning to solve complicated problems in certain diseases like cancer by using different DNA information such as the transcription factor. TAL1 is a transcription factor that is essential for the development of hematopoiesis and of the vascular system. In this paper, we highlight the potential of deep learning in the field of genomics and its challenges such as the training time that takes hours, weeks, and in some cases months. Therefore, we propose to apply a distributed deep learning implementation based on Convolutional Neural Networks (CNN) that showed good results in decreasing the training time and in improving accuracy, reaching 95% by using multiple GPUs and TPUs as accelerators. We proved the efficiency of using a distributed strategy based on data-parallelism in predicting the transcription-factor TAL1 motif faster.
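The data-parallel strategy named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper trains a CNN on GPUs and TPUs, whereas the sketch below substitutes a logistic-regression model in pure Python so that it runs anywhere. The function names, the toy motif "CAG", and the sequence length are all hypothetical choices made for illustration. It shows the core idea only: each replica computes a gradient on its own shard of the batch, and the gradients are averaged before a single shared weight update.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Flatten a DNA string into a one-hot vector (4 values per base)."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]


def make_batch(n, length=6, rng=random):
    """Random toy sequences; label 1.0 if the sequence starts with the
    hypothetical motif "CAG" (an assumption, not the TAL1 data set)."""
    batch = []
    for _ in range(n):
        seq = "".join(rng.choice(BASES) for _ in range(length))
        batch.append((one_hot(seq), 1.0 if seq.startswith("CAG") else 0.0))
    return batch


def grad_logistic(weights, batch):
    """Mean gradient of the logistic loss over (features, label) pairs."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        z = sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))
        for i, xi in enumerate(x):
            grad[i] += (p - y) * xi
    return [g / len(batch) for g in grad]


def data_parallel_step(weights, batch, n_replicas, lr=0.5):
    """One synchronous data-parallel SGD step.

    The batch is split into equal shards, each simulated replica computes
    a local gradient, and the local gradients are averaged (the role an
    all-reduce plays on real hardware) before one shared update.
    """
    if len(batch) % n_replicas != 0:
        raise ValueError("batch size must be divisible by n_replicas")
    shard = len(batch) // n_replicas
    shards = [batch[r * shard:(r + 1) * shard] for r in range(n_replicas)]
    local = [grad_logistic(weights, s) for s in shards]
    avg = [sum(g[i] for g in local) / n_replicas for i in range(len(weights))]
    return [w - lr * g for w, g in zip(weights, avg)]
```

With equal-sized shards, the averaged gradient equals the full-batch gradient, so calling data_parallel_step with 4 replicas gives the same weights as with 1 replica, up to floating-point rounding. In this simulation the replicas run one after another; on real accelerators they run concurrently, which is where the reduction in training time comes from.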
Related papers
- Multi-Scale Convolutional LSTM with Transfer Learning for Anomaly Detection in Cellular Networks [1.1432909951914676]
This study introduces a novel approach, Multi-Scale Convolutional LSTM with Transfer Learning (TL), to detect anomalies in cellular networks.
The model is initially trained from scratch using a publicly available dataset to learn typical network behavior.
We compare the performance of the model trained from scratch with that of the fine-tuned model using TL.
arXiv Detail & Related papers (2024-09-30T17:51:54Z) - Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z) - Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules [1.8947048356389908]
This work focuses on building GCNN models on HPC systems to predict material properties of millions of molecules.
We use HydraGNN, our in-house library for large-scale GCNN training, leveraging distributed data parallelism in PyTorch.
We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap.
arXiv Detail & Related papers (2022-07-22T20:54:22Z) - Compare Where It Matters: Using Layer-Wise Regularization To Improve Federated Learning on Heterogeneous Data [0.0]
Federated Learning is a widely adopted method to train neural networks over distributed data.
One main limitation is the performance degradation that occurs when data is heterogeneously distributed.
We present FedCKA: a framework that out-performs previous state-of-the-art methods on various deep learning tasks.
arXiv Detail & Related papers (2021-12-01T10:46:13Z) - Reducing the Long Tail Losses in Scientific Emulations with Active Learning [0.0]
In this work, we leveraged an active learning approach called core-set selection to actively select data, per a pre-defined budget, to be labelled for training.
We tested on two case studies in different fields, namely galaxy halo occupation distribution modelling in astrophysics and x-ray emission spectroscopy in plasma physics.
arXiv Detail & Related papers (2021-11-15T09:02:00Z) - Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z) - An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation [53.425900196763756]
We propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks.
We employ the uncertainty levels of the convolutional network in a particular input volume to formulate a semi-supervised graph learning problem.
We show that our method outperforms the state-of-the-art CRF refinement method by improving the Dice score by 1% for the pancreas and 2% for the spleen.
arXiv Detail & Related papers (2020-12-06T18:55:07Z) - RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning a deep convolutional neural network (CNN) using a pre-trained model helps transfer knowledge learned from larger datasets to the target task.
We propose RIFLE - a strategy that deepens backpropagation in transfer learning settings.
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning.
arXiv Detail & Related papers (2020-07-07T11:27:43Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective to design of future DeepSNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z) - 3D medical image segmentation with labeled and unlabeled data using autoencoders at the example of liver segmentation in CT images [58.720142291102135]
This work investigates the potential of autoencoder-extracted features to improve segmentation with a convolutional neural network.
A convolutional autoencoder was used to extract features from unlabeled data and a multi-scale, fully convolutional CNN was used to perform the target task of 3D liver segmentation in CT images.
arXiv Detail & Related papers (2020-03-17T20:20:43Z)