A Review of Deep Learning Techniques for Protein Function Prediction
- URL: http://arxiv.org/abs/2211.09705v1
- Date: Thu, 27 Oct 2022 20:30:25 GMT
- Title: A Review of Deep Learning Techniques for Protein Function Prediction
- Authors: Divyanshu Aggarwal and Yasha Hasija
- Abstract summary: This review paper analyzes the recent developments in approaches for the task of predicting protein function using deep learning.
We highlight the emergence of modern state-of-the-art (SOTA) deep learning models that have achieved groundbreaking results in the fields of computer vision, natural language processing, and multi-modal learning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning and big data have shown tremendous success in bioinformatics
and computational biology in recent years; artificial intelligence methods have
also contributed significantly to the task of protein function classification.
This review paper analyzes the recent developments in approaches for the task
of predicting protein function using deep learning. We explain the importance
of determining protein function and why automating this task is
crucial. Then, after reviewing the widely used deep learning techniques for
this task, we continue our review and highlight the emergence of the modern
state-of-the-art (SOTA) deep learning models that have achieved groundbreaking
results in the fields of computer vision, natural language processing, and
multi-modal learning in the last few years. We hope that this review will
provide a broad view of the current role and advances of deep learning in
biological sciences, especially in protein function prediction, and
encourage new researchers to contribute to this area.
Related papers
- Opportunities in deep learning methods development for computational biology [0.0]
Molecular technologies underlie an enormous growth in the size of data sets pertaining to biology and biomedicine.
These advances parallel those in the deep learning subfield of machine learning.
Many of these tools have not fully proliferated into the computational biology and bioinformatics fields.
Date: 2024-06-12T22:58:45Z
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
Date: 2024-02-21T11:27:31Z
- ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology.
We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective.
Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
Date: 2023-11-01T14:44:01Z
- Deep Learning in Computational Biology: Advancements, Challenges, and Future Outlook [0.0]
We examine the history, advantages, and challenges of deep learning in computational biology.
Our focus is on two primary applications: DNA sequence classification and prediction, as well as protein structure prediction from sequence data.
Date: 2023-10-02T07:53:05Z
- Integration of Pre-trained Protein Language Models into Geometric Deep Learning Networks [68.90692290665648]
We integrate knowledge learned by protein language models into several state-of-the-art geometric networks.
Our findings show an overall improvement of 20% over baselines.
Strong evidence indicates that the incorporation of protein language models' knowledge enhances geometric networks' capacity by a significant margin.
Date: 2022-12-07T04:04:04Z
- Deep Learning Methods for Protein Family Classification on PDB Sequencing Data [0.0]
We demonstrate and compare the performance of several deep learning frameworks, including novel bi-directional LSTM and convolutional models, on widely available sequencing data.
Our results show that our deep learning models deliver superior performance to classical machine learning methods, with the convolutional architecture providing the most impressive inference performance.
Date: 2022-07-14T06:11:32Z
- Learning multi-scale functional representations of proteins from single-cell microscopy data [77.34726150561087]
We show that simple convolutional networks trained on localization classification can learn protein representations that encapsulate diverse functional information.
We also propose a robust evaluation strategy to assess quality of protein representations across different scales of biological function.
Date: 2022-05-24T00:00:07Z
- Structure-aware Protein Self-supervised Learning [50.04673179816619]
We propose a novel structure-aware protein self-supervised learning method to capture structural information of proteins.
In particular, a well-designed graph neural network (GNN) model is pretrained to preserve the protein structural information.
We identify the relation between the sequential information in the protein language model and the structural information in the specially designed GNN model via a novel pseudo bi-level optimization scheme.
Date: 2022-04-06T02:18:41Z
- Ten Quick Tips for Deep Learning in Biology [116.78436313026478]
Machine learning is concerned with the development and applications of algorithms that can recognize patterns in data and use them for predictive modeling.
Deep learning has become its own subfield of machine learning.
In the context of biological research, deep learning has been increasingly used to derive novel insights from high-dimensional biological data.
Date: 2021-05-29T21:02:44Z
- Deep Learning in Protein Structural Modeling and Design [6.282267356230666]
Deep learning is catalyzing a scientific revolution fueled by big data, accessible toolkits, and powerful computational resources.
Protein structural modeling is critical to understand and engineer biological systems at the molecular level.
This review aims to help computational biologists gain familiarity with the deep learning methods applied in protein modeling, and to help computer scientists gain perspective on the biologically meaningful problems that may benefit from deep learning techniques.
Date: 2020-07-16T14:59:38Z
- Machine learning and AI-based approaches for bioactive ligand discovery and GPCR-ligand recognition [2.842794675894731]
Deep learning has been shown to outperform not only conventional machine learning but also highly specialized tools.
We highlight the latest AI-based research that has led to the successful discovery of GPCR bioactive molecules.
This review concludes with a brief outlook highlighting the recent research trends in deep learning.
Date: 2020-01-17T22:01:26Z