When is a System Discoverable from Data? Discovery Requires Chaos
- URL: http://arxiv.org/abs/2511.08860v1
- Date: Thu, 13 Nov 2025 01:12:31 GMT
- Title: When is a System Discoverable from Data? Discovery Requires Chaos
- Authors: Zakhar Shumaylov, Peter Zaika, Philipp Scholl, Gitta Kutyniok, Lior Horesh, Carola-Bibiane Schönlieb,
- Abstract summary: We show that chaos is crucial for ensuring a system is discoverable in the space of continuous or analytic functions. We demonstrate for the first time that the classical Lorenz system is analytically discoverable. These findings help explain the success of data-driven methods in inherently chaotic domains like weather forecasting.
- Score: 36.78844761101327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The deep learning revolution has spurred rapid advances in the use of AI in the sciences. Within the physical sciences, the main focus has been on the discovery of dynamical systems from observational data. Yet the reliability of learned surrogates and symbolic models is often undermined by the fundamental problem of non-uniqueness: the resulting models may fit the available data perfectly, yet lack genuine predictive power. This raises the question: under what conditions can a system's governing equations be uniquely identified from a finite set of observations? We show, counter-intuitively, that chaos, typically associated with unpredictability, is crucial for ensuring a system is discoverable in the space of continuous or analytic functions. The prevalence of chaotic systems in benchmark datasets may have inadvertently obscured this fundamental limitation. More concretely, we show that systems chaotic on their entire domain are discoverable from a single trajectory within the space of continuous functions, and that systems chaotic on a strange attractor are analytically discoverable under a geometric condition on the attractor. As a consequence, we demonstrate for the first time that the classical Lorenz system is analytically discoverable. Moreover, we establish that analytic discoverability is impossible in the presence of first integrals, which are common in real-world systems. These findings help explain the success of data-driven methods in inherently chaotic domains like weather forecasting, while revealing a significant challenge for engineering applications such as digital twins, where stable, predictable behavior is desired. For these non-chaotic systems, we find that while trajectory data alone is insufficient, certain prior physical knowledge can help ensure discoverability. These findings warrant a critical re-evaluation of the fundamental assumptions underpinning purely data-driven discovery.
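The non-uniqueness problem described in the abstract can be made concrete with a small sketch. This is an illustrative toy example, not a construction from the paper: two different vector fields that agree at every point of an observed trajectory, and are therefore indistinguishable from any amount of data collected on that trajectory.

```python
import numpy as np

# Toy illustration of non-uniqueness (not from the paper):
# the observed trajectory is the unit circle, generated by f1(x, y) = (-y, x).
# Define f2 = f1 + (x^2 + y^2 - 1) * g for an arbitrary g; the correction
# term vanishes identically on the circle, so f1 and f2 produce the same
# data on the trajectory yet differ everywhere off it.

def f1(p):
    x, y = p
    return np.array([-y, x])

def f2(p):
    x, y = p
    # arbitrary perturbation g(x, y) = (x, y), switched off on the circle
    return f1(p) + (x**2 + y**2 - 1.0) * np.array([x, y])

# Sample the observed trajectory (points on the unit circle).
ts = np.linspace(0.0, 2 * np.pi, 200)
trajectory = np.stack([np.cos(ts), np.sin(ts)], axis=1)

# On the trajectory the two candidate models are indistinguishable ...
on_gap = max(np.linalg.norm(f1(p) - f2(p)) for p in trajectory)
# ... but off the trajectory they disagree.
off_gap = np.linalg.norm(f1([2.0, 0.0]) - f2([2.0, 0.0]))

print(on_gap)   # ~0 (floating-point noise)
print(off_gap)  # 6.0
```

Both models fit the circle data perfectly, yet predict entirely different behavior for initial conditions off the circle; this is exactly the failure of predictive power the abstract attributes to non-uniqueness.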
Related papers
- Physics as the Inductive Bias for Causal Discovery [7.9653270330458446]
Causal discovery is often a data-driven paradigm to analyze complex real-world systems. We develop a scalable sparsity-inducing MLE algorithm that exploits causal graph structure for efficient parameter estimation.
arXiv Detail & Related papers (2026-02-03T23:42:01Z) - Research Program: Theory of Learning in Dynamical Systems [29.121933501690805]
We argue that learnability in dynamical systems should be studied as a finite-sample question. We focus on guarantees that hold uniformly at every time step after a finite burn-in period. We show that accurate prediction can be achieved after finite observation without system identification.
arXiv Detail & Related papers (2025-12-22T14:05:31Z) - Identifiability Challenges in Sparse Linear Ordinary Differential Equations [4.895067344504143]
We show that sparse systems are unidentifiable with positive probability in practically relevant sparsity regimes. We further study empirically how this theoretical unidentifiability manifests in state-of-the-art methods for estimating linear ODEs from data.
arXiv Detail & Related papers (2025-06-11T14:55:36Z) - Building Machine Learning Challenges for Anomaly Detection in Science [94.24422981343699]
We present three datasets aimed at developing machine learning-based anomaly detection for disparate scientific domains. We present a scheme to make machine learning challenges around the three datasets findable, accessible, interoperable, and reusable.
arXiv Detail & Related papers (2025-03-03T22:54:07Z) - Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations [114.17826109037048]
Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning, yet theoretical aspects, e.g., identifiability and properties of statistical estimation, remain obscure.
This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.
arXiv Detail & Related papers (2022-10-12T06:46:38Z) - Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data.
We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism.
We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z) - Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z) - Beyond Predictions in Neural ODEs: Identification and Interventions [7.04645578771455]
Given large amounts of observational data about a system, can we uncover the rules that govern its evolution? We show that combining simple regularization schemes with flexible neural ODEs can robustly recover the dynamics and causal structures from time-series data. We conclude by showing that we can also make accurate predictions under interventions on variables or the system itself.
arXiv Detail & Related papers (2021-06-23T14:35:38Z) - Universal set of Observables for Forecasting Physical Systems through Causal Embedding [0.0]
We demonstrate when and how an entire left-infinite orbit of an underlying dynamical system or observations can be uniquely represented by a pair of elements in a different space.
The collection of such pairs is derived from a driven dynamical system and is used to learn a function which together with the driven system would: (i.) determine a system that is topologically conjugate to the underlying system.
arXiv Detail & Related papers (2021-05-22T16:28:57Z) - Consistency of mechanistic causal discovery in continuous-time using Neural ODEs [85.7910042199734]
We consider causal discovery in continuous-time for the study of dynamical systems.
We propose a causal discovery algorithm based on penalized Neural ODEs.
arXiv Detail & Related papers (2021-05-06T08:48:02Z)
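Several of the entries above concern identifiability of linear ODE systems from equally-spaced observations. A minimal sketch of one obstruction (a hypothetical illustration, not a construction from any of the listed papers): two rotation systems x' = A x whose angular frequencies differ by 2π/Δt generate identical samples at interval Δt, so equally-spaced data from a single trajectory cannot distinguish them without further conditions.

```python
import numpy as np

# Aliasing toy example (hypothetical, not from the cited papers):
# for A = w * J with J = [[0, -1], [1, 0]], the flow map over time t is
# exp(A t) = rotation by angle w * t, computable in closed form.

def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

dt = 0.5                      # sampling interval
w1 = 1.0                      # one candidate angular frequency
w2 = w1 + 2 * np.pi / dt      # aliased frequency: one extra full turn per step

# Equally-spaced samples of both systems from the same initial condition.
x0 = np.array([1.0, 0.0])
samples1 = [rotation(w1 * dt * k) @ x0 for k in range(20)]
samples2 = [rotation(w2 * dt * k) @ x0 for k in range(20)]

# The two distinct systems produce identical discrete observations.
gap = max(np.linalg.norm(a - b) for a, b in zip(samples1, samples2))
print(gap)  # ~0: indistinguishable at this sampling rate
```

Conditions such as bounds on the sampling interval relative to the spectrum of A are one way the identifiability results above rule out this kind of aliasing.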
This list is automatically generated from the titles and abstracts of the papers in this site.