Self-Distillation Improves DNA Sequence Inference
- URL: http://arxiv.org/abs/2405.08538v1
- Date: Tue, 14 May 2024 12:24:52 GMT
- Title: Self-Distillation Improves DNA Sequence Inference
- Authors: Tong Yu, Lei Cheng, Ruslan Khalitov, Erland Brandser Olsson, Zhirong Yang
- Abstract summary: Self-supervised pretraining (SSP) has been recognized as a method to enhance prediction accuracy in various downstream tasks.
However, its efficacy for DNA sequences remains somewhat constrained. This limitation stems primarily from the fact that most existing SSP approaches in genomics focus on masked language modeling of individual sequences.
We introduce an innovative deep neural network model, which incorporates collaborative learning between a 'student' and a 'teacher' subnetwork.
- Score: 15.497250990633047
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised pretraining (SSP) has been recognized as a method to enhance prediction accuracy in various downstream tasks. However, its efficacy for DNA sequences remains somewhat constrained. This limitation stems primarily from the fact that most existing SSP approaches in genomics focus on masked language modeling of individual sequences, neglecting the crucial aspect of encoding statistics across multiple sequences. To overcome this challenge, we introduce an innovative deep neural network model, which incorporates collaborative learning between a 'student' and a 'teacher' subnetwork. In this model, the student subnetwork employs masked learning on nucleotides and progressively adapts its parameters to the teacher subnetwork through an exponential moving average approach. Concurrently, both subnetworks engage in contrastive learning, deriving insights from two augmented representations of the input sequences. This self-distillation process enables our model to effectively assimilate both contextual information from individual sequences and distributional data across the sequence population. We validated our approach with preliminary pretraining using the human reference genome, followed by applying it to 20 downstream inference tasks. The empirical results from these experiments demonstrate that our novel method significantly boosts inference performance across the majority of these tasks. Our code is available at https://github.com/wiedersehne/FinDNA.
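The two mechanical ingredients named in the abstract, nucleotide masking and an exponential moving average (EMA) link between student and teacher, can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). It assumes the common self-distillation convention in which the teacher weights track an EMA of the student weights, and the function names, the `decay` and `mask_rate` values, and the use of "N" as a mask token are all hypothetical choices made for illustration.

```python
import random


def ema_update(teacher, student, decay=0.99):
    """Return teacher weights moved toward the student weights.

    Assumption: teacher <- decay * teacher + (1 - decay) * student,
    the usual EMA rule in self-distillation; the paper may differ in detail.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def mask_nucleotides(seq, mask_rate=0.15, mask_token="N", seed=0):
    """Replace a random subset of positions in a DNA string with a mask token.

    Returns the masked sequence and the sorted list of masked positions.
    The mask rate and mask token are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(seq) * mask_rate))
    positions = sorted(rng.sample(range(len(seq)), n_mask))
    chars = list(seq)
    for i in positions:
        chars[i] = mask_token
    return "".join(chars), positions


if __name__ == "__main__":
    # Toy weights stand in for the parameters of the two subnetworks.
    teacher = ema_update([0.0, 1.0], [1.0, 1.0], decay=0.9)
    masked, positions = mask_nucleotides("ACGTACGTACGTACGTACGT")
    print(teacher, masked, positions)
```

In a real training loop the student would be trained by gradient descent to recover the masked nucleotides, and the EMA step would run after each optimizer update; the contrastive objective over two augmented views is omitted here.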
Related papers
- Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z) - Finding the DeepDream for Time Series: Activation Maximization for Univariate Time Series [10.388704631887496]
We introduce Sequence Dreaming, a technique that adapts Activation Maximization to analyze sequential information.
We visualize the temporal dynamics and patterns most influential in model decision-making processes.
arXiv Detail & Related papers (2024-08-20T08:09:44Z) - Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z) - Toward Understanding BERT-Like Pre-Training for DNA Foundation Models [78.48760388079523]
Existing pre-training methods for DNA sequences rely on direct adoptions of BERT pre-training from NLP.
We introduce a novel approach called RandomMask, which gradually increases the task difficulty of BERT-like pre-training by continuously expanding its mask boundary.
RandomMask achieves a staggering 68.16% in Matthews correlation coefficient for Epigenetic Mark Prediction, a groundbreaking increase of 19.85% over the baseline.
arXiv Detail & Related papers (2023-10-11T16:40:57Z) - Enhancing Cross-Dataset Performance of Distracted Driving Detection With Score-Softmax Classifier [7.302402275736439]
Deep neural networks enable real-time monitoring of in-vehicle drivers, facilitating the timely prediction of distractions, fatigue, and potential hazards.
Recent research has exposed unreliable cross-dataset end-to-end driver behavior recognition due to overfitting.
We introduce the Score-Softmax classifier, which addresses this issue by enhancing inter-class independence and intra-class uncertainty.
arXiv Detail & Related papers (2023-10-08T15:28:01Z) - PIGNet2: A Versatile Deep Learning-based Protein-Ligand Interaction Prediction Model for Binding Affinity Scoring and Virtual Screening [0.0]
Prediction of protein-ligand interactions (PLI) plays a crucial role in drug discovery.
The development of a versatile model capable of accurately scoring binding affinity and conducting efficient virtual screening remains a challenge.
Here, we propose a viable solution by introducing a novel data augmentation strategy combined with a physics-informed graph neural network.
arXiv Detail & Related papers (2023-07-03T14:46:49Z) - The impact of memory on learning sequence-to-sequence tasks [6.603326895384289]
Recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks.
We propose a model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences.
arXiv Detail & Related papers (2022-05-29T14:57:33Z) - Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Compare Where It Matters: Using Layer-Wise Regularization To Improve Federated Learning on Heterogeneous Data [0.0]
Federated Learning is a widely adopted method to train neural networks over distributed data.
One main limitation is the performance degradation that occurs when data is heterogeneously distributed.
We present FedCKA: a framework that out-performs previous state-of-the-art methods on various deep learning tasks.
arXiv Detail & Related papers (2021-12-01T10:46:13Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z) - Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds in theory.
Experiments on several datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)