LLM-Based Scientific Equation Discovery via Physics-Informed Token-Regularized Policy Optimization
- URL: http://arxiv.org/abs/2602.10576v1
- Date: Wed, 11 Feb 2026 07:02:23 GMT
- Title: LLM-Based Scientific Equation Discovery via Physics-Informed Token-Regularized Policy Optimization
- Authors: Boxiao Wang, Kai Li, Tianyi Liu, Chen Li, Junzhe Wang, Yifan Zhang, Jian Cheng
- Abstract summary: PiT-PO is a unified framework that evolves Large Language Models into adaptive generators via reinforcement learning. Central to PiT-PO is a dual-constraint mechanism that rigorously enforces hierarchical physical validity while simultaneously applying fine-grained, token-level penalties to suppress redundant structures. Empirically, PiT-PO achieves state-of-the-art performance on standard benchmarks and successfully discovers novel turbulence models for challenging fluid dynamics problems.
- Score: 32.24464649397858
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symbolic regression aims to distill mathematical equations from observational data. Recent approaches have successfully leveraged Large Language Models (LLMs) to generate equation hypotheses, capitalizing on their vast pre-trained scientific priors. However, existing frameworks predominantly treat the LLM as a static generator, relying on prompt-level guidance to steer exploration. This paradigm fails to update the model's internal representations based on search feedback, often yielding physically inconsistent or mathematically redundant expressions. In this work, we propose PiT-PO (Physics-informed Token-regularized Policy Optimization), a unified framework that evolves the LLM into an adaptive generator via reinforcement learning. Central to PiT-PO is a dual-constraint mechanism that rigorously enforces hierarchical physical validity while simultaneously applying fine-grained, token-level penalties to suppress redundant structures. Consequently, PiT-PO aligns the LLM to produce equations that are both scientifically consistent and structurally parsimonious. Empirically, PiT-PO achieves state-of-the-art performance on standard benchmarks and successfully discovers novel turbulence models for challenging fluid dynamics problems. We also demonstrate that PiT-PO empowers small-scale models to outperform closed-source giants, democratizing access to high-performance scientific discovery.
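The token-level penalty mechanism described in the abstract can be illustrated with a minimal REINFORCE-style sketch. The function names, shapes, and the way the penalty is subtracted from the sequence reward are assumptions for illustration, not the paper's exact objective:

```python
import numpy as np

def token_regularized_pg_loss(logprobs, reward, redundancy_mask, lam=0.1):
    """Policy-gradient loss with a fine-grained token-level penalty.

    Each token's advantage is the sequence-level reward minus a penalty
    that fires only on tokens flagged as structurally redundant (e.g.
    operators that algebraically simplify away). Hypothetical sketch of
    the dual-constraint idea, not PiT-PO's published loss.
    """
    logprobs = np.asarray(logprobs, dtype=float)          # (T,) per-token log pi
    penalty = lam * np.asarray(redundancy_mask, float)    # (T,) token penalties
    advantages = reward - penalty                         # (T,) token advantages
    return -np.sum(advantages * logprobs)                 # REINFORCE-style loss
```

Flagging redundant tokens lowers their advantage, so gradient updates suppress those structures without discarding the whole equation hypothesis.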
Related papers
- Grounding LLMs in Scientific Discovery via Embodied Actions [84.11877211907647]
Large Language Models (LLMs) have shown significant potential in scientific discovery but struggle to bridge the gap between theoretical reasoning and physical simulation. We propose EmbodiedAct, a framework that transforms established scientific software into active embodied agents by grounding LLMs in embodied actions with a tight perception-execution loop.
arXiv Detail & Related papers (2026-02-24T07:37:18Z)
- CoT-Evo: Evolutionary Distillation of Chain-of-Thought for Scientific Reasoning [63.44477226386808]
Chain-of-thought (CoT) distillation from advanced large language models (LLMs) has proven effective in general reasoning tasks, but it struggles in scientific domains, where even advanced models often produce incorrect or superficial reasoning. We propose CoT-Evo, an evolutionary CoT distillation framework, to overcome this problem.
arXiv Detail & Related papers (2025-10-15T05:29:56Z)
- Uncalibrated Reasoning: GRPO Induces Overconfidence for Stochastic Outcomes [55.2480439325792]
Reinforcement learning (RL) has proven remarkably effective at improving the accuracy of language models in verifiable and deterministic domains like mathematics. Here, we examine whether current RL methods are also effective at optimizing language models in verifiable domains with stochastic outcomes, like scientific experiments.
arXiv Detail & Related papers (2025-08-15T20:50:53Z)
- Reasoning through Exploration: A Reinforcement Learning Framework for Robust Function Calling [35.97270347306353]
We propose EGPO, a new RL framework built upon Group Relative Policy Optimization (GRPO). The core of EGPO is an entropy-enhanced advantage function that integrates the entropy of the model's Chain-of-Thought (CoT) into the policy gradient. On the challenging Berkeley Function Calling Leaderboard (BFCL), a 4B-parameter model trained with EGPO sets a new state of the art among models of comparable size.
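An entropy-enhanced, group-relative advantage of the kind this abstract describes can be sketched as follows. The additive mixing of reward and CoT entropy, and the coefficient `beta`, are assumptions for illustration, not EGPO's published formula:

```python
import numpy as np

def entropy_enhanced_advantages(rewards, cot_entropies, beta=0.1):
    """GRPO-style group-relative advantages with an entropy bonus.

    rewards: one scalar per sampled rollout in the group.
    cot_entropies: mean token entropy of each rollout's chain of thought.
    The entropy term nudges the policy toward rollouts that kept
    exploration alive; advantages are normalized within the group.
    """
    r = np.asarray(rewards, float) + beta * np.asarray(cot_entropies, float)
    return (r - r.mean()) / (r.std() + 1e-8)  # group-relative baseline
```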
arXiv Detail & Related papers (2025-08-07T07:51:38Z)
- Potential failures of physics-informed machine learning in traffic flow modeling: theoretical and experimental analysis [5.055539099879598]
This study investigates why physics-informed machine learning (PIML) can fail in macroscopic traffic flow modeling. We define failure as cases where a PIML model underperforms both purely data-driven and purely physics-based baselines by a given threshold. This explains why LWR-based PIML can outperform ARZ-based PIML even with high-resolution data, with the gap shrinking as resolution increases.
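The failure criterion defined in this abstract reduces to a one-line check. Treating the threshold as a relative margin `tau` is an assumption; the paper's exact threshold definition may differ:

```python
def piml_failure(piml_err, data_err, physics_err, tau=0.05):
    """PIML 'failure': worse than BOTH baselines by a margin.

    Assumed relative margin tau; returns True when the PIML model's
    error exceeds each baseline's error by more than tau.
    """
    return (piml_err > (1.0 + tau) * data_err
            and piml_err > (1.0 + tau) * physics_err)
```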
arXiv Detail & Related papers (2025-05-16T17:55:06Z)
- Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning [5.728698570173857]
High-dimensional partial differential equations (PDEs) pose significant computational challenges across fields ranging from quantum chemistry to economics and finance. Although scientific machine learning (SciML) techniques offer approximate solutions, they often suffer from bias and neglect crucial physical insights. We propose Simulation-Calibrated Scientific Machine Learning (SCaSML), a framework that dynamically refines and debiases SciML predictions during inference by enforcing the physical laws.
arXiv Detail & Related papers (2025-04-22T18:01:45Z)
- DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [86.76714527437383]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
- AutoTurb: Using Large Language Models for Automatic Algebraic Model Discovery of Turbulence Closure [15.905369652489505]
In this work, a novel framework using LLMs to automatically discover expressions for correcting the Reynolds stress model is proposed.
The proposed method is applied to separated flow over periodic hills at Re = 10,595.
It is demonstrated that the corrective RANS can improve the prediction for both the Reynolds stress and mean velocity fields.
arXiv Detail & Related papers (2024-10-14T16:06:35Z)
- Physics Informed Deep Learning for Strain Gradient Continuum Plasticity [0.0]
We use a space-time discretization based on physics informed deep learning to approximate solutions of rate-dependent strain gradient plasticity models.
Taking inspiration from physics informed neural networks, we modify the loss function of a PIDL model in several novel ways.
We show how PIDL methods could address the computational challenges posed by strain gradient plasticity models.
arXiv Detail & Related papers (2024-08-13T06:02:05Z)
- Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
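The token-level, entropy-augmented objective this entry describes can be sketched as below. Names, shapes, and the coefficient `beta` are assumptions for illustration, not ETPO's exact objective:

```python
import numpy as np

def entropy_regularized_token_loss(logprobs, token_entropies, advantage, beta=0.01):
    """Entropy-regularized token-level policy-gradient loss.

    Decomposes the sequence-level update into per-token terms and adds
    an entropy bonus at each token, so credit is assigned per decision
    while the policy keeps exploring. Illustrative sketch only.
    """
    logprobs = np.asarray(logprobs, float)        # (T,) per-token log pi
    entropies = np.asarray(token_entropies, float)  # (T,) per-token entropy
    pg_term = advantage * logprobs                # per-token policy gradient
    return -np.sum(pg_term + beta * entropies)
```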
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
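The symbolic-regression step in the last entry above can be illustrated as a toy search over a fixed candidate library fit to synthetic data. The candidate set and fitting procedure are stand-ins for the SR+DEC pipeline, not the paper's method:

```python
import numpy as np

def discover_model(x, y):
    """Toy symbolic regression: score each candidate form by a
    closed-form least-squares fit and return the best.

    Returns (form, coefficient, mse). The library is hypothetical.
    """
    library = {
        "c*x":      lambda x: x,
        "c*x**2":   lambda x: x**2,
        "c*sin(x)": lambda x: np.sin(x),
    }
    best = None
    for name, f in library.items():
        basis = f(x)
        c = basis @ y / (basis @ basis)          # 1-D least squares
        err = np.mean((y - c * basis) ** 2)
        if best is None or err < best[2]:
            best = (name, c, err)
    return best

# Re-discover y = 3*x**2 from synthetic "experimental" data.
x = np.linspace(0.1, 2.0, 50)
name, c, err = discover_model(x, 3.0 * x**2)
```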
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.