Accelerating Understanding of Scientific Experiments with End to End
Symbolic Regression
- URL: http://arxiv.org/abs/2112.04023v1
- Date: Tue, 7 Dec 2021 22:28:53 GMT
- Title: Accelerating Understanding of Scientific Experiments with End to End
Symbolic Regression
- Authors: Nikos Arechiga and Francine Chen and Yan-Ying Chen and Yanxia Zhang
and Rumen Iliev and Heishiro Toyoda and Kent Lyons
- Abstract summary: We develop a deep neural network to address the problem of learning free-form symbolic expressions from raw data.
We train our neural network on a synthetic dataset consisting of data tables of varying length and varying levels of noise.
We validate our technique by running it on a public dataset from behavioral science.
- Score: 12.008215939224382
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of learning free-form symbolic expressions from raw
data, such as that produced by an experiment in any scientific domain. Accurate
and interpretable models of scientific phenomena are the cornerstone of
scientific research. Simple yet interpretable models, such as linear or
logistic regression and decision trees, often lack predictive accuracy.
Alternatively, accurate black-box models such as deep neural networks provide
high predictive accuracy, but do not readily admit human understanding in a way
that would enrich the scientific theory of the phenomenon. Many great
breakthroughs in science revolve around the development of parsimonious
equational models with high predictive accuracy, such as Newton's laws,
universal gravitation, and Maxwell's equations. Previous work on automating the
search for equational models from data combines domain-specific heuristics as
well as computationally expensive techniques, such as genetic programming and
Monte-Carlo search. We develop a deep neural network (MACSYMA) to address the
symbolic regression problem as an end-to-end supervised learning problem.
MACSYMA can generate symbolic expressions that describe a dataset. The
computational complexity of the task is reduced to the feedforward computation
of a neural network. We train our neural network on a synthetic dataset
consisting of data tables of varying length and varying levels of noise, for
which the neural network must learn to produce the correct symbolic expression
token by token. Finally, we validate our technique by running it on a public
dataset from behavioral science.
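The paper does not include code here, but the end-to-end formulation is easy to sketch. Below is a minimal, hypothetical PyTorch sketch of the idea: a set encoder embeds the rows of a data table and an autoregressive decoder emits the expression token by token. The vocabulary, sizes, and vanilla Transformer architecture are illustrative assumptions, not MACSYMA's actual design.

```python
# Hypothetical sketch: symbolic regression as end-to-end supervised learning.
# An encoder embeds (x, y) table rows; a decoder predicts expression tokens.
import torch
import torch.nn as nn

VOCAB = ["<pad>", "<sos>", "<eos>", "x", "sin", "cos", "+", "*", "1", "2"]

class SymbolicRegressor(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=3):
        super().__init__()
        self.row_embed = nn.Linear(2, d_model)   # one (x, y) pair per table row
        enc = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, n_layers)
        self.tok_embed = nn.Embedding(len(VOCAB), d_model)
        dec = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec, n_layers)
        self.out = nn.Linear(d_model, len(VOCAB))

    def forward(self, table, tokens):
        # table: (batch, rows, 2); tokens: (batch, seq) expression token ids
        memory = self.encoder(self.row_embed(table))
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.decoder(self.tok_embed(tokens), memory, tgt_mask=mask)
        return self.out(h)                       # logits for the next token

# Supervised training reduces symbolic regression to next-token cross-entropy.
model = SymbolicRegressor()
table = torch.randn(8, 50, 2)                    # 8 noisy tables, 50 rows each
tokens = torch.randint(0, len(VOCAB), (8, 12))   # target expression token ids
logits = model(table, tokens)
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, len(VOCAB)), tokens[:, 1:].reshape(-1))
```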
Related papers
- Discovering symbolic expressions with parallelized tree search [59.92040079807524]
Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
For over a decade, existing algorithms have faced a critical bottleneck in accuracy and efficiency when handling complex problems.
We introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data.
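As a rough illustration only (the actual PTS model organizes a far more sophisticated search over expression trees), here is a sketch of distributing candidate-expression scoring across worker processes, with a hypothetical toy grammar of the form a*f(x) + b*x:

```python
# Toy sketch of parallelized candidate search: each worker scores a batch of
# random candidate expressions against the data, and the best candidate wins.
import multiprocessing as mp
import random
import numpy as np

X = np.linspace(-2, 2, 100)
Y = np.sin(X) + 0.5 * X                        # target relationship to recover

def random_expr():
    f = random.choice([np.sin, np.cos, np.tanh])
    return (f, random.uniform(-1, 1), random.uniform(-1, 1))

def score(expr):
    f, a, b = expr                             # expr encodes a*f(x) + b*x
    mse = float(np.mean((a * f(X) + b * X - Y) ** 2))
    return mse, expr

def worker(n_candidates):
    random.seed()                              # independent stream per process
    return min((score(random_expr()) for _ in range(n_candidates)),
               key=lambda pair: pair[0])

if __name__ == "__main__":
    with mp.Pool(4) as pool:
        best_mse, best = min(pool.map(worker, [2000] * 4),
                             key=lambda pair: pair[0])
    print("best:", best, "mse:", best_mse)
```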
arXiv Detail & Related papers (2024-07-05T10:41:15Z)
- Machine-Guided Discovery of a Real-World Rogue Wave Model [0.0]
We present a case study on discovering a new symbolic model for oceanic rogue waves from data using causal analysis, deep learning, parsimony-guided model selection, and symbolic regression.
We apply symbolic regression to distill the resulting black-box model into a mathematical equation that retains the neural network's predictive capabilities.
This showcases how machine learning can facilitate inductive scientific discovery, and paves the way for more accurate rogue wave forecasting.
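The two-step workflow (fit a black-box model, then regress its predictions onto symbolic terms) can be sketched as follows; the basis library and scikit-learn models are illustrative choices, not the paper's pipeline:

```python
# Sketch of distillation: (1) fit a black-box regressor, (2) regress its
# predictions onto a small library of symbolic terms, keeping large coefficients.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, (500, 1))
y = np.sin(x[:, 0]) + 0.3 * x[:, 0] ** 2 + rng.normal(0, 0.05, 500)

blackbox = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000).fit(x, y)

# Distill the model's predictions, not the raw data, into an equation.
terms = {"x": x[:, 0], "x^2": x[:, 0] ** 2,
         "sin(x)": np.sin(x[:, 0]), "cos(x)": np.cos(x[:, 0])}
A = np.column_stack(list(terms.values()))
coef, *_ = np.linalg.lstsq(A, blackbox.predict(x), rcond=None)
print(" + ".join(f"{c:.2f}*{name}" for name, c in zip(terms, coef)
                 if abs(c) > 0.05))
```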
arXiv Detail & Related papers (2023-11-21T12:50:24Z)
- Toward Physically Plausible Data-Driven Models: A Novel Neural Network Approach to Symbolic Regression [2.7071541526963805]
This paper proposes a novel neural network-based symbolic regression method.
It constructs physically plausible models from prior knowledge about the system, even when the training data set is very small.
We experimentally evaluate the approach on four test systems: the TurtleBot 2 mobile robot, the magnetic manipulation system, the equivalent resistance of two resistors in parallel, and the longitudinal force of the anti-lock braking system.
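For reference, the parallel-resistor test system has a known closed form that a physically plausible model should recover:

```latex
R_{\mathrm{eq}} = \left(\frac{1}{R_1} + \frac{1}{R_2}\right)^{-1} = \frac{R_1 R_2}{R_1 + R_2}
```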
arXiv Detail & Related papers (2023-02-01T22:05:04Z)
- Interpretable models for extrapolation in scientific machine learning [0.0]
Complex machine learning algorithms often outperform simple regressions in interpolative settings.
We examine the trade-off between model performance and interpretability across a broad range of science and engineering problems.
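A toy demonstration of this trade-off, using the common observation that tree ensembles predict near-constant values outside the training range (illustrative, not from the paper):

```python
# A flexible model matches a linear fit inside the training range but fails
# outside it, since tree ensembles cannot extend trends beyond their data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 5, (200, 1))
y_train = 2.0 * x_train[:, 0] + rng.normal(0, 0.1, 200)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(x_train, y_train)
linear = LinearRegression().fit(x_train, y_train)

x_new = np.array([[10.0]])                     # far outside [0, 5]
print("truth :", 2.0 * x_new[0, 0])            # 20.0
print("linear:", linear.predict(x_new)[0])     # ~20, extrapolates the trend
print("forest:", forest.predict(x_new)[0])     # ~10, clamped to training range
```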
arXiv Detail & Related papers (2022-12-16T19:33:28Z)
- An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws [24.356906682593532]
We study the compute-optimal trade-off between model and training data set sizes for large neural networks.
Our result suggests a linear relation similar to that supported by the empirical analysis of Chinchilla.
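A compact statement of the Chinchilla-style relation, using the standard approximation C ≈ 6ND for training compute (notation assumed here, not taken from the paper):

```latex
C \approx 6\,N D, \qquad N^{*} \propto C^{1/2}, \quad D^{*} \propto C^{1/2}
\quad\Longrightarrow\quad D^{*} \propto N^{*}
```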
arXiv Detail & Related papers (2022-12-02T18:46:41Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
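A simplified sketch of differentiating at the equilibrium rather than through the unrolled dynamics; this uses a crude one-step approximation of the implicit gradient and ordinary rate units, not the paper's spiking formulation:

```python
# Run feedback dynamics z = tanh(Wz + Ux) to a fixed point without tracking
# gradients, then backpropagate through a single application of the map at
# the equilibrium. This one-step scheme approximates the implicit gradient;
# the paper derives exact implicit differentiation for spiking networks.
import torch

torch.manual_seed(0)
W = (torch.randn(16, 16) * 0.05).requires_grad_(True)   # feedback weights
U = torch.randn(16, 8) * 0.1                            # input weights
x = torch.randn(8)

with torch.no_grad():
    z = torch.zeros(16)
    for _ in range(50):                                 # forward to equilibrium
        z = torch.tanh(W @ z + U @ x)

z_star = torch.tanh(W @ z + U @ x)                      # one tracked step at z*
loss = (z_star ** 2).sum()
loss.backward()                                         # gradient through the fixed point
print("grad norm:", W.grad.norm().item())
```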
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
- The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
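A toy linear predictive-coding loop illustrating this local learning rule (sizes and learning rates are arbitrary assumptions, not the paper's model):

```python
# The latent layer predicts the data layer; the prediction error drives both
# latent settling and a local, Hebbian-like weight update, with no global
# backpropagation.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (32, 8))   # generative weights: latent -> observation
z = rng.normal(0.0, 1.0, 8)         # latent state ("neighboring neurons")
x = rng.normal(0.0, 1.0, 32)        # observed data to be predicted

for _ in range(200):
    err = x - W @ z                 # how well the prediction matched reality
    z += 0.05 * (W.T @ err)         # settle latents down the error gradient
    W += 0.01 * np.outer(err, z)    # local weight update from the same error

print("residual error:", np.linalg.norm(x - W @ z))
```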
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
- The data-driven physical-based equations discovery using evolutionary approach [77.34726150561087]
We describe an algorithm for discovering mathematical equations from observational data.
The algorithm combines genetic programming with sparse regression.
It can be used to discover governing analytical equations as well as partial differential equations (PDEs).
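The sparse-regression half can be sketched as sequentially thresholded least squares over a library of candidate terms, which the genetic-programming loop would propose and evolve; the library below is a hypothetical example:

```python
# Sequentially thresholded least squares over candidate terms (SINDy-style):
# fit the derivative, zero out small coefficients, refit on the survivors.
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 5, 500)
u = np.exp(-0.5 * t) + rng.normal(0, 1e-4, 500)     # observations of u(t)
du = np.gradient(u, t)                              # numerical derivative

library = np.column_stack([np.ones_like(u), u, u ** 2, u ** 3])
names = ["1", "u", "u^2", "u^3"]

xi, *_ = np.linalg.lstsq(library, du, rcond=None)
for _ in range(5):                                  # threshold, then refit
    xi[np.abs(xi) < 0.05] = 0.0
    keep = xi != 0.0
    xi[keep], *_ = np.linalg.lstsq(library[:, keep], du, rcond=None)

# Expected to recover du/dt ≈ -0.5 * u
print("du/dt =", " + ".join(f"{c:.3f}*{n}" for c, n in zip(xi, names) if c))
```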
arXiv Detail & Related papers (2020-04-03T17:21:57Z)
- Mean-Field and Kinetic Descriptions of Neural Differential Equations [0.0]
In this work we focus on a particular class of neural networks, namely residual neural networks.
We analyze steady states and sensitivity with respect to the parameters of the network, namely the weights and the bias.
A modification of the microscopic dynamics, inspired by residual neural networks, leads to a Fokker-Planck formulation of the network.
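Schematically, the residual update, its continuum limit, and the resulting density equation read as follows (notation assumed here, not the paper's exact statement; the drift-diffusion form is the standard Fokker-Planck structure):

```latex
x_{n+1} = x_n + h\, f(x_n, w_n, b_n)
\;\xrightarrow{\,h \to 0\,}\;
\dot{x}(t) = f\big(x(t), w(t), b(t)\big),
\qquad
\partial_t \rho + \nabla_x \cdot \big(f\,\rho\big) = \tfrac{\sigma^2}{2}\,\Delta_x \rho .
```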
arXiv Detail & Related papers (2020-01-07T13:41:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.