Related papers: Opportunities for machine learning in scientific discovery

Opportunities for machine learning in scientific discovery

URL: http://arxiv.org/abs/2405.04161v1
Date: Tue, 7 May 2024 09:58:02 GMT
Title: Opportunities for machine learning in scientific discovery
Authors: Ricardo Vinuesa, Jean Rabault, Hossein Azizpour, Stefan Bauer, Bingni W. Brunton, Arne Elofsson, Elias Jarlebring, Hedvig Kjellstrom, Stefano Markidis, David Marlevi, Paola Cinnella, Steven L. Brunton,
Abstract summary: We review how the scientific community can increasingly leverage machine-learning techniques to achieve scientific discoveries. Although challenges remain, principled use of ML is opening up new avenues for fundamental scientific discoveries.
Score: 16.526872562935463
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Technological advancements have substantially increased computational power and data availability, enabling the application of powerful machine-learning (ML) techniques across various fields. However, our ability to leverage ML methods for scientific discovery, {\it i.e.} to obtain fundamental and formalized knowledge about natural processes, is still in its infancy. In this review, we explore how the scientific community can increasingly leverage ML techniques to achieve scientific discoveries. We observe that the applicability and opportunity of ML depends strongly on the nature of the problem domain, and whether we have full ({\it e.g.}, turbulence), partial ({\it e.g.}, computational biochemistry), or no ({\it e.g.}, neuroscience) {\it a-priori} knowledge about the governing equations and physical properties of the system. Although challenges remain, principled use of ML is opening up new avenues for fundamental scientific discoveries. Throughout these diverse fields, there is a theme that ML is enabling researchers to embrace complexity in observational data that was previously intractable to classic analysis and numerical investigations.

Related papers

Scaling Laws in Scientific Discovery with AI and Robot Scientists [72.3420699173245]
An autonomous generalist scientist (AGS) concept combines agentic AI and embodied robotics to automate the entire research lifecycle. AGS aims to significantly reduce the time and resources needed for scientific discovery. As these autonomous systems become increasingly integrated into the research process, we hypothesize that scientific discovery might adhere to new scaling laws.
arXiv Detail & Related papers (2025-03-28T14:00:27Z)
Building Machine Learning Challenges for Anomaly Detection in Science [94.24422981343699]
We present three datasets aimed at developing machine learning-based anomaly detection for disparate scientific domains. We present a scheme to make machine learning challenges around the three datasets findable, accessible, interoperable, and reusable.
arXiv Detail & Related papers (2025-03-03T22:54:07Z)
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning [51.11965014462375]
Multimodal Large Language Models (MLLMs) integrate text, images, and other modalities. This paper argues that MLLMs can significantly advance scientific reasoning across disciplines such as mathematics, physics, chemistry, and biology.
arXiv Detail & Related papers (2025-02-05T04:05:27Z)
Scientific Machine Learning Seismology [0.0]
Scientific machine learning (SciML) is an interdisciplinary research field that integrates machine learning, particularly deep learning, with physics theory to understand and predict complex natural phenomena. PINNs and neural operators (NOs) are two popular methods for SciML. The use of PINNs is expanding into areas such as simultaneous solutions of differential equations, inference in underdetermined systems, and regularization based on physics.
arXiv Detail & Related papers (2024-09-27T02:27:42Z)
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery [68.48094108571432]
Large language models (LLMs) have revolutionized the way text and other modalities of data are handled. We aim to provide a more holistic view of the research landscape by unveiling cross-field and cross-modal connections between scientific LLMs.
arXiv Detail & Related papers (2024-06-16T08:03:24Z)
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer [22.90687836544612]
We present bfUni-text, an innovative model designed for in-depth understanding of scientific literature. Uni-text demonstrates superior performance over other text-focused LLMs. Our exploration extends to practical applications, including patent infringement detection and nuanced analysis of charts.
arXiv Detail & Related papers (2024-03-15T13:43:47Z)
Understanding Biology in the Age of Artificial Intelligence [4.299566787216408]
Modern life sciences research is increasingly relying on artificial intelligence approaches to model biological systems. Although machine learning (ML) models are useful for identifying patterns in large, complex data sets, its widespread application in biological sciences represents a significant deviation from traditional methods of scientific inquiry. Here, we identify general principles that can guide the design and application of ML systems to model biological phenomena and advance scientific knowledge.
arXiv Detail & Related papers (2024-03-06T23:20:34Z)
Diverse Explanations From Data-Driven and Domain-Driven Perspectives in the Physical Sciences [4.442043151145212]
This Perspective explores the sources and implications of diverse explanations in machine learning applications for physical sciences. We examine how different models, explanation methods, levels of feature attribution, and stakeholder needs can result in varying interpretations of ML outputs. Our analysis underscores the importance of considering multiple perspectives when interpreting ML models in scientific contexts.
arXiv Detail & Related papers (2024-02-01T05:28:28Z)
Scientific Large Language Models: A Survey on Biological & Chemical Domains [47.97810890521825]
Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines. As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration.
arXiv Detail & Related papers (2024-01-26T05:33:34Z)
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems [268.585904751315]
New area of research known as AI for science (AI4Science) Areas aim at understanding the physical world from subatomic (wavefunctions and electron density), atomic (molecules, proteins, materials, and interactions), to macro (fluids, climate, and subsurface) scales. Key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods.
arXiv Detail & Related papers (2023-07-17T12:14:14Z)
The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence [67.70415658080121]
Recent advances in machine learning and AI are disrupting technological innovation, product development, and society as a whole. AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access. Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery.
arXiv Detail & Related papers (2023-07-09T21:16:56Z)
REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research [19.71032778307425]
Transparency around limitations can improve the scientific rigor of research, help ensure appropriate interpretation of research findings, and make research claims more credible. Despite these benefits, the machine learning (ML) research community lacks well-developed norms around disclosing and discussing limitations. We conduct an iterative design process with 30 ML and ML-adjacent researchers to develop REAL ML, a set of guided activities to help ML researchers recognize, explore, and articulate the limitations of their research.
arXiv Detail & Related papers (2022-05-05T15:32:45Z)
Machine Learning Force Fields [54.48599172620472]
Machine Learning (ML) has enabled numerous advances in computational chemistry. One of the most promising applications is the construction of ML-based force fields (FFs) This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them.
arXiv Detail & Related papers (2020-10-14T13:14:14Z)
Workflow Provenance in the Lifecycle of Scientific Machine Learning [1.6118907823528272]
We leverage workflow techniques to build a holistic view to support the lifecycle of scientific ML. We contribute with (i) characterization of the lifecycle and taxonomy for data analyses; (ii) design principles to build this view, with a W3C PROV compliant data representation and a reference system architecture; and (iii) lessons learned after an evaluation in an Oil & Gas case using an HPC cluster with 393 nodes and 946 GPUs.
arXiv Detail & Related papers (2020-09-30T13:09:48Z)
Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering. The main challenges that can be formulated as ML problems are classified into the three main categories. For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.