The Observer Effect in World Models: Invasive Adaptation Corrupts Latent Physics
- URL: http://arxiv.org/abs/2602.12218v1
- Date: Thu, 12 Feb 2026 17:56:07 GMT
- Title: The Observer Effect in World Models: Invasive Adaptation Corrupts Latent Physics
- Authors: Christian Internò, Jumpei Yamaguchi, Loren Amdahl-Culleton, Markus Olhofer, David Klindt, Barbara Hammer
- Abstract summary: We propose a non-invasive evaluation protocol, PhyIP, to test whether physical quantities are linearly decodable from frozen representations. Across fluid dynamics and orbital mechanics, we find that when SSL achieves low error, latent structure becomes linearly accessible.
- Score: 5.20694680088186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Determining whether neural models internalize physical laws as world models, rather than exploiting statistical shortcuts, remains challenging, especially under out-of-distribution (OOD) shifts. Standard evaluations often test latent capability via downstream adaptation (e.g., fine-tuning or high-capacity probes), but such interventions can change the representations being measured and thus confound what was learned during self-supervised learning (SSL). We propose PhyIP, a non-invasive evaluation protocol: motivated by the linear representation hypothesis, we test whether physical quantities are linearly decodable from frozen representations. Across fluid dynamics and orbital mechanics, we find that when SSL achieves low error, latent structure becomes linearly accessible. PhyIP recovers internal energy and Newtonian inverse-square scaling on OOD tests (e.g., $\rho > 0.90$). In contrast, adaptation-based evaluations can collapse this structure ($\rho \approx 0.05$). These findings suggest that adaptation-based evaluation can obscure latent structure and that low-capacity probes offer a more accurate evaluation of physical world models.
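The abstract's core protocol can be illustrated with a minimal sketch: fit a low-capacity linear readout on *frozen* features for a physical quantity, then report rank correlation on an OOD split. All names and data here are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a non-invasive linear-probe evaluation in the
# spirit of PhyIP. The frozen representations, target quantity, and
# splits are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(0)

# Simulated frozen SSL representations with a physical quantity
# (e.g. internal energy) embedded linearly in the latent space.
d, n_train, n_ood = 32, 500, 200
w_true = rng.normal(size=d)
Z_train = rng.normal(size=(n_train, d))    # frozen features (in-distribution)
Z_ood = 2.0 * rng.normal(size=(n_ood, d))  # shifted features (OOD split)
y_train = Z_train @ w_true
y_ood = Z_ood @ w_true

# Low-capacity probe: closed-form ridge regression; representations untouched.
lam = 1e-3
w_hat = np.linalg.solve(Z_train.T @ Z_train + lam * np.eye(d),
                        Z_train.T @ y_train)
pred_ood = Z_ood @ w_hat

# Spearman rank correlation on the OOD split (Pearson correlation of ranks).
def spearman(a, b):
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

rho = spearman(y_ood, pred_ood)
print(f"OOD Spearman rho: {rho:.3f}")
```

Because the probe is a fixed linear map fit in closed form, it cannot reshape the representation it measures; a high OOD $\rho$ therefore reflects structure already present in the frozen features.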
Related papers
- Learning Complex Physical Regimes via Coverage-oriented Uncertainty Quantification: An application to the Critical Heat Flux [0.0]
Uncertainty quantification (UQ) should not be viewed as a safety assessment but as support for the learning task itself. We focus on the Critical Heat Flux benchmark and dataset presented by the OECD/NEA Expert Group on Reactor Systems Multi-Physics. We show that while post-hoc methods ensure statistical calibration, coverage-oriented learning effectively reshapes the model's representation to match the complex physical regimes.
arXiv Detail & Related papers (2026-02-25T09:04:15Z) - From Evaluation to Design: Using Potential Energy Surface Smoothness Metrics to Guide Machine Learning Interatomic Potential Architectures [12.68400434984463]
MLIPs fail to reproduce the physical smoothness of the quantum potential energy surface. Existing evaluations, such as microcanonical molecular dynamics, are computationally expensive and primarily probe near-equilibrium states. We introduce the Bond Smoothness Characterization Test (BSCT) to improve evaluation metrics for MLIPs.
arXiv Detail & Related papers (2026-02-04T18:50:10Z) - An Orthogonal Learner for Individualized Outcomes in Markov Decision Processes [55.93922317950527]
We develop a novel meta-learner called the DRQ-learner, which is applicable to settings with both discrete and continuous state spaces.
arXiv Detail & Related papers (2025-09-30T15:49:29Z) - From Physics to Machine Learning and Back: Part II - Learning and Observational Bias in PHM [52.64097278841485]
This review examines how incorporating learning and observational biases through physics-informed modeling and data strategies can guide models toward physically consistent and reliable predictions. Fast adaptation methods, including meta-learning and few-shot learning, are reviewed alongside domain generalization techniques.
arXiv Detail & Related papers (2025-09-25T14:15:43Z) - Potential failures of physics-informed machine learning in traffic flow modeling: theoretical and experimental analysis [5.055539099879598]
This study investigates why physics-informed machine learning (PIML) can fail in macroscopic traffic flow modeling. We define failure as cases where a PIML model underperforms both purely data-driven and purely physics-based baselines by a given threshold. This explains why LWR-based PIML can outperform ARZ-based PIML even with high-resolution data, with the gap shrinking as resolution increases.
arXiv Detail & Related papers (2025-05-16T17:55:06Z) - Automated Model Discovery for Tensional Homeostasis: Constitutive Machine Learning in Growth and Remodeling [0.0]
We extend our inelastic Constitutive Artificial Neural Networks (iCANNs) by incorporating kinematic growth and homeostatic surfaces.
We evaluate the ability of the proposed network to learn from experimentally obtained tissue equivalent data at the material point level.
arXiv Detail & Related papers (2024-10-17T15:12:55Z) - Physics-Informed Neural Networks with Hard Linear Equality Constraints [9.101849365688905]
This work proposes a novel physics-informed neural network, KKT-hPINN, which rigorously guarantees hard linear equality constraints.
Experiments on Aspen models of a stirred-tank reactor unit, an extractive distillation subsystem, and a chemical plant demonstrate that this model can further enhance the prediction accuracy.
arXiv Detail & Related papers (2024-02-11T17:40:26Z) - On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities for analyzing closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z) - Latent Traversals in Generative Models as Potential Flows [113.4232528843775]
We propose to model latent structures with a learned dynamic potential landscape.
Inspired by physics, optimal transport, and neuroscience, these potential landscapes are learned as physically realistic partial differential equations.
Our method achieves trajectories that are both qualitatively and quantitatively more disentangled than those of state-of-the-art baselines.
arXiv Detail & Related papers (2023-04-25T15:53:45Z) - Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to $17.0\%$ AUROC improvement over state-of-the-art methods and can serve as a simple yet strong baseline in this under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z) - Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs [0.0]
In RNNs, encoding information in a suboptimal way can impact the quality of representations based on later elements in the sequence.
I propose an augmentation to standard RNNs in the form of a gradient-based correction mechanism.
I conduct different experiments in the context of language modeling, where the impact of using such a mechanism is examined in detail.
arXiv Detail & Related papers (2021-01-03T17:54:17Z) - Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
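Among the related papers above, energy-based OOD detectors such as GNNSafe build on the standard energy score $E(x) = -T \log \sum_j \exp(f_j(x)/T)$ computed from classifier logits $f(x)$. The sketch below is a generic, assumed illustration of that score, not GNNSafe's actual graph-propagation variant.

```python
# Hedged sketch of the standard energy-based OOD score from logits.
# Lower energy suggests in-distribution; higher energy suggests OOD.
import numpy as np

def energy_score(logits, T=1.0):
    """E(x) = -T * logsumexp(f(x)/T), computed with a stable logsumexp."""
    z = np.asarray(logits, dtype=float) / T
    m = z.max(axis=-1, keepdims=True)
    return -T * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))

# Confident (peaked) logits yield lower energy than flat, uncertain logits.
peaked = energy_score([[10.0, 0.0, 0.0]])
flat = energy_score([[1.0, 1.0, 1.0]])
print(peaked[0], flat[0])
```

Thresholding this score gives a detector: inputs whose energy exceeds a validation-chosen cutoff are flagged as OOD.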
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.