Understanding Biology in the Age of Artificial Intelligence
- URL: http://arxiv.org/abs/2403.04106v1
- Date: Wed, 6 Mar 2024 23:20:34 GMT
- Title: Understanding Biology in the Age of Artificial Intelligence
- Authors: Elsa Lawrence, Adham El-Shazly, Srijit Seal, Chaitanya K Joshi, Pietro Liò, Shantanu Singh, Andreas Bender, Pietro Sormanni, Matthew Greenig
- Abstract summary: Modern life sciences research is increasingly relying on artificial intelligence approaches to model biological systems.
Although machine learning (ML) models are useful for identifying patterns in large, complex data sets, their widespread application in biological sciences represents a significant deviation from traditional methods of scientific inquiry.
Here, we identify general principles that can guide the design and application of ML systems to model biological phenomena and advance scientific knowledge.
- Score: 4.299566787216408
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern life sciences research is increasingly relying on artificial
intelligence approaches to model biological systems, primarily centered around
the use of machine learning (ML) models. Although ML is undeniably useful for
identifying patterns in large, complex data sets, its widespread application in
biological sciences represents a significant deviation from traditional methods
of scientific inquiry. As such, the interplay between these models and
scientific understanding in biology is a topic with important implications for
the future of scientific research, yet it is a subject that has received little
attention. Here, we draw from an epistemological toolkit to contextualize
recent applications of ML in biological sciences under modern philosophical
theories of understanding, identifying general principles that can guide the
design and application of ML systems to model biological phenomena and advance
scientific knowledge. We propose that conceptions of scientific understanding
as information compression, qualitative intelligibility, and dependency
relation modelling provide a useful framework for interpreting ML-mediated
understanding of biological systems. Through a detailed analysis of two key
application areas of ML in modern biological research - protein structure
prediction and single cell RNA-sequencing - we explore how these features have
thus far enabled ML systems to advance scientific understanding of their target
phenomena, how they may guide the development of future ML models, and the key
obstacles that remain in preventing ML from achieving its potential as a tool
for biological discovery. Consideration of the epistemological features of ML
applications in biology will improve the prospects of these methods to solve
important problems and advance scientific understanding of living systems.
Related papers
- Scientific Machine Learning Seismology [0.0]
Scientific machine learning (SciML) is an interdisciplinary research field that integrates machine learning, particularly deep learning, with physics theory to understand and predict complex natural phenomena.
Physics-informed neural networks (PINNs) and neural operators (NOs) are two popular methods for SciML.
The use of PINNs is expanding into areas such as simultaneous solutions of differential equations, inference in underdetermined systems, and regularization based on physics.
arXiv Detail & Related papers (2024-09-27T02:27:42Z)
- LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery [141.39722070734737]
We propose to enhance the knowledge-driven, abstract reasoning abilities of Large Language Models with the computational strength of simulations.
We introduce Scientific Generative Agent (SGA), a bilevel optimization framework.
We conduct experiments to demonstrate our framework's efficacy in law discovery and molecular design.
arXiv Detail & Related papers (2024-05-16T03:04:10Z)
- Opportunities for machine learning in scientific discovery [16.526872562935463]
We review how the scientific community can increasingly leverage machine-learning techniques to achieve scientific discoveries.
Although challenges remain, principled use of ML is opening up new avenues for fundamental scientific discoveries.
arXiv Detail & Related papers (2024-05-07T09:58:02Z)
- Knowledge-guided Machine Learning: Current Trends and Future Prospects [14.783972088722193]
This paper provides an introduction to the current state of research in the emerging field of scientific knowledge-guided machine learning (KGML).
We discuss different facets of KGML research in terms of the type of scientific knowledge used, the form of knowledge-ML integration explored, and the method for incorporating scientific knowledge in ML.
arXiv Detail & Related papers (2024-03-24T02:54:46Z)
- Progress and Opportunities of Foundation Models in Bioinformatics [77.74411726471439]
Foundation models (FMs) have ushered in a new era in computational biology, especially in the realm of deep learning.
Central to our focus is the application of FMs to specific biological problems, aiming to guide the research community in choosing appropriate FMs for their research needs.
The review analyses challenges and limitations faced by FMs in biology, such as data noise, model explainability, and potential biases.
arXiv Detail & Related papers (2024-02-06T02:29:17Z)
- Diverse Explanations From Data-Driven and Domain-Driven Perspectives in the Physical Sciences [4.442043151145212]
This Perspective explores the sources and implications of diverse explanations in machine learning applications for physical sciences.
We examine how different models, explanation methods, levels of feature attribution, and stakeholder needs can result in varying interpretations of ML outputs.
Our analysis underscores the importance of considering multiple perspectives when interpreting ML models in scientific contexts.
arXiv Detail & Related papers (2024-02-01T05:28:28Z)
- Scientific Large Language Models: A Survey on Biological & Chemical Domains [47.97810890521825]
Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension.
The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines.
As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration.
arXiv Detail & Related papers (2024-01-26T05:33:34Z)
- ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology.
We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective.
Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
arXiv Detail & Related papers (2023-11-01T14:44:01Z)
- Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges.
We first present the model that underlies most current causal approaches to single-cell biology.
We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z)
- Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)