Opportunities for machine learning in scientific discovery
- URL: http://arxiv.org/abs/2405.04161v1
- Date: Tue, 7 May 2024 09:58:02 GMT
- Title: Opportunities for machine learning in scientific discovery
- Authors: Ricardo Vinuesa, Jean Rabault, Hossein Azizpour, Stefan Bauer, Bingni W. Brunton, Arne Elofsson, Elias Jarlebring, Hedvig Kjellstrom, Stefano Markidis, David Marlevi, Paola Cinnella, Steven L. Brunton
- Abstract summary: We review how the scientific community can increasingly leverage machine-learning techniques to achieve scientific discoveries.
Although challenges remain, principled use of ML is opening up new avenues for fundamental scientific discoveries.
- Score: 16.526872562935463
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Technological advancements have substantially increased computational power and data availability, enabling the application of powerful machine-learning (ML) techniques across various fields. However, our ability to leverage ML methods for scientific discovery, i.e. to obtain fundamental and formalized knowledge about natural processes, is still in its infancy. In this review, we explore how the scientific community can increasingly leverage ML techniques to achieve scientific discoveries. We observe that the applicability and opportunity of ML depends strongly on the nature of the problem domain, and whether we have full (e.g., turbulence), partial (e.g., computational biochemistry), or no (e.g., neuroscience) a priori knowledge about the governing equations and physical properties of the system. Although challenges remain, principled use of ML is opening up new avenues for fundamental scientific discoveries. Throughout these diverse fields, there is a theme that ML is enabling researchers to embrace complexity in observational data that was previously intractable to classic analysis and numerical investigations.
Related papers
- A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery [68.48094108571432]
We aim to provide a more holistic view of the research landscape by unveiling cross-field and cross-modal connections between scientific LLMs.
We comprehensively survey over 250 scientific LLMs, discuss their commonalities and differences, as well as summarize pre-training datasets and evaluation tasks for each field and modality.
arXiv Detail & Related papers (2024-06-16T08:03:24Z)
- Understanding Biology in the Age of Artificial Intelligence [4.299566787216408]
Modern life sciences research is increasingly relying on artificial intelligence approaches to model biological systems.
Although machine learning (ML) models are useful for identifying patterns in large, complex data sets, their widespread application in biological sciences represents a significant deviation from traditional methods of scientific inquiry.
Here, we identify general principles that can guide the design and application of ML systems to model biological phenomena and advance scientific knowledge.
arXiv Detail & Related papers (2024-03-06T23:20:34Z)
- Scientific Large Language Models: A Survey on Biological & Chemical Domains [47.97810890521825]
Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension.
The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines.
As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration.
arXiv Detail & Related papers (2024-01-26T05:33:34Z)
- Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems [245.1050780515017]
AI for science (AI4Science) is a new area of research.
Areas of AI4Science aim at understanding the physical world from subatomic (wavefunctions and electron density), atomic (molecules, proteins, materials, and interactions), to macro (fluids, climate, and subsurface) scales.
A key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods.
arXiv Detail & Related papers (2023-07-17T12:14:14Z)
- The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence [67.70415658080121]
Recent advances in machine learning and AI are disrupting technological innovation, product development, and society as a whole.
AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access.
Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery.
arXiv Detail & Related papers (2023-07-09T21:16:56Z)
- Scientific intuition inspired by machine learning generated hypotheses [2.294014185517203]
We shift the focus to the insights and the knowledge obtained by the machine learning models themselves.
We apply gradient boosting in decision trees to extract human-interpretable insights from big data sets from chemistry and physics.
The ability to go beyond numerics opens the door to using machine learning to accelerate the discovery of conceptual understanding.
arXiv Detail & Related papers (2020-10-27T12:12:12Z)
- Machine Learning Force Fields [54.48599172620472]
Machine Learning (ML) has enabled numerous advances in computational chemistry.
One of the most promising applications is the construction of ML-based force fields (FFs).
This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them.
arXiv Detail & Related papers (2020-10-14T13:14:14Z)
- Workflow Provenance in the Lifecycle of Scientific Machine Learning [1.6118907823528272]
We leverage workflow techniques to build a holistic view to support the lifecycle of scientific ML.
We contribute with (i) characterization of the lifecycle and taxonomy for data analyses; (ii) design principles to build this view, with a W3C PROV compliant data representation and a reference system architecture; and (iii) lessons learned after an evaluation in an Oil & Gas case using an HPC cluster with 393 nodes and 946 GPUs.
arXiv Detail & Related papers (2020-09-30T13:09:48Z)
- Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.