Evolving Scientific Discovery by Unifying Data and Background Knowledge with AI Hilbert
- URL: http://arxiv.org/abs/2308.09474v3
- Date: Mon, 29 Apr 2024 13:46:56 GMT
- Title: Evolving Scientific Discovery by Unifying Data and Background Knowledge with AI Hilbert
- Authors: Ryan Cory-Wright, Cristina Cornelio, Sanjeeb Dash, Bachir El Khadir, Lior Horesh,
- Abstract summary: We show that some famous scientific laws, including Kepler's Third Law of Planetary Motion, can be derived in a principled manner from axioms and data.
We model complexity using binary variables and logical constraints, solve optimization problems via mixed-integer linear or semidefinite optimization, and prove the validity of our scientific discoveries.
- Score: 11.56572994087136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The discovery of scientific formulae that parsimoniously explain natural phenomena and align with existing background theory is a key goal in science. Historically, scientists have derived natural laws by manipulating equations based on existing knowledge, forming new equations, and verifying them experimentally. In recent years, data-driven scientific discovery has emerged as a viable competitor in settings with large amounts of experimental data. Unfortunately, data-driven methods often fail to discover valid laws when data is noisy or scarce. Accordingly, recent works combine regression and reasoning to eliminate formulae inconsistent with background theory. However, the problem of searching over the space of formulae consistent with background theory to find one that best fits the data is not well-solved. We propose a solution to this problem when all axioms and scientific laws are expressible via polynomial equalities and inequalities and argue that our approach is widely applicable. We model notions of minimal complexity using binary variables and logical constraints, solve polynomial optimization problems via mixed-integer linear or semidefinite optimization, and prove the validity of our scientific discoveries in a principled manner using Positivstellensatz certificates. The optimization techniques leveraged in this paper allow our approach to run in polynomial time with fully correct background theory (under an assumption that the complexity of our derivation is bounded), or non-deterministic polynomial (NP) time with partially correct background theory. We demonstrate that some famous scientific laws, including Kepler's Third Law of Planetary Motion, the Hagen-Poiseuille Equation, and the Radiated Gravitational Wave Power equation, can be derived in a principled manner from axioms and experimental data.
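As a rough illustration of what it means to certify a law from polynomial axioms, the sketch below checks that a candidate law is a polynomial combination of two axioms. Everything in it is an assumption made for illustration: the two toy axioms (Newton's second law and centripetal acceleration), the symbol names, and the hand-picked multipliers are not taken from the paper. It also covers only the equality-only special case and uses a Groebner basis from SymPy in place of the mixed-integer and semidefinite optimization the abstract describes, so it should be read as a simplified stand-in rather than the authors' method.

```python
# Illustrative sketch only: the axioms, symbols, and multipliers below are
# assumptions chosen for this example, not taken from the paper.
from sympy import symbols, expand, groebner

F, m, a, r, v = symbols("F m a r v")

# Hypothetical background theory, written as polynomials that must equal zero.
h1 = F - m * a        # Newton's second law: F = m a
h2 = a * r - v**2     # centripetal acceleration: a = v^2 / r

# Candidate law to certify: F r = m v^2.
q = F * r - m * v**2

# Explicit certificate: polynomial multipliers with q = alpha1*h1 + alpha2*h2.
alpha1, alpha2 = r, m
certificate_residual = expand(alpha1 * h1 + alpha2 * h2 - q)
law_is_certified = certificate_residual == 0

# Independent check: q belongs to the ideal generated by the axioms.
G = groebner([h1, h2], F, m, a, r, v, order="lex")
law_in_ideal = G.contains(q)

# A law with the wrong constant is not implied by the same axioms.
wrong_law_in_ideal = G.contains(F * r - 2 * m * v**2)

print(law_is_certified, law_in_ideal, wrong_law_in_ideal)
```

The multipliers `alpha1` and `alpha2` play the role of a certificate: anyone can expand the combination and confirm it reproduces the candidate law, without trusting the search that found them. Handling inequality axioms, as in a full Positivstellensatz certificate, would additionally require sum-of-squares multipliers, which this sketch does not attempt.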
Related papers
- Argumentative Causal Discovery [13.853426822028975]
Causal discovery amounts to unearthing causal relationships amongst features in data.
We deploy assumption-based argumentation (ABA) to learn graphs which reflect causal dependencies in the data.
We prove that our method exhibits desirable properties, notably that, under natural conditions, it can retrieve ground-truth causal graphs.
arXiv Detail & Related papers (2024-05-18T10:34:34Z)
- LLM-SR: Scientific Equation Discovery via Programming with Large Language Models [17.64574496035502]
Traditional methods of equation discovery, known as symbolic regression, largely focus on extracting equations from data alone.
We introduce LLM-SR, a novel approach that leverages the scientific knowledge and robust code generation capabilities of Large Language Models.
We demonstrate LLM-SR's effectiveness across three diverse scientific domains, where it discovers physically accurate equations.
arXiv Detail & Related papers (2024-04-29T03:30:06Z)
- Directed differential equation discovery using modified mutation and cross-over operators [77.34726150561087]
We introduce modifications to the evolutionary operators of the equation discovery algorithm.
The resulting approach, dubbed directed equation discovery, demonstrates a greater ability to converge towards accurate solutions.
arXiv Detail & Related papers (2023-08-09T14:50:02Z)
- Comparison of Single- and Multi- Objective Optimization Quality for Evolutionary Equation Discovery [77.34726150561087]
Evolutionary differential equation discovery has proved to be a tool for obtaining equations with fewer a priori assumptions.
The proposed comparison approach is shown on classical model examples: the Burgers equation, the wave equation, and the Korteweg-de Vries equation.
arXiv Detail & Related papers (2023-06-29T15:37:19Z)
- Well-definedness of Physical Law Learning: The Uniqueness Problem [63.9246169579248]
Physical law learning is the ambiguous attempt at automating the derivation of governing equations with the use of machine learning techniques.
This paper shall serve as a first step to build a comprehensive theoretical framework for learning physical laws.
arXiv Detail & Related papers (2022-10-15T17:32:49Z)
- AI Research Associate for Early-Stage Scientific Discovery [1.6861004263551447]
Artificial intelligence (AI) has been increasingly applied in scientific activities for decades.
We present an AI research associate for early-stage scientific discovery based on a novel minimally-biased physics-based modeling.
arXiv Detail & Related papers (2022-02-02T17:05:52Z)
- Partial Counterfactual Identification from Observational and Experimental Data [83.798237968683]
We develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data.
Our algorithms are validated extensively on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-12T02:21:30Z)
- Integration of Data and Theory for Accelerated Derivable Symbolic Discovery [3.7521856498259627]
We develop a methodology combining automated theorem proving with symbolic regression, enabling principled derivations of laws of nature.
We demonstrate this for Kepler's third law, Einstein's relativistic time dilation, and Langmuir's theory of adsorption.
The combination of logical reasoning with machine learning provides generalizable insights into key aspects of the natural phenomena.
arXiv Detail & Related papers (2021-09-03T17:19:17Z)
- Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z)
- The data-driven physical-based equations discovery using evolutionary approach [77.34726150561087]
We describe an algorithm for discovering mathematical equations from observational data.
The algorithm combines genetic programming with sparse regression.
It can be used to discover governing analytical equations as well as partial differential equations (PDEs).
arXiv Detail & Related papers (2020-04-03T17:21:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.