Related papers: Symbolic Regression with a Learned Concept Library

URL: http://arxiv.org/abs/2409.09359v2
Date: Thu, 31 Oct 2024 19:02:17 GMT
Title: Symbolic Regression with a Learned Concept Library
Authors: Arya Grayeli, Atharva Sehgal, Omar Costilla-Reyes, Miles Cranmer, Swarat Chaudhuri
Abstract summary: We present a novel method for searching for compact programmatic hypotheses that best explain a dataset. Our algorithm, called LaSR, uses zero-shot queries to a large language model to discover and evolve concepts. LaSR substantially outperforms a variety of state-of-the-art SR approaches based on deep learning and evolutionary algorithms.
Score: 9.395222766576342
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a novel method for symbolic regression (SR), the task of searching for compact programmatic hypotheses that best explain a dataset. The problem is commonly solved using genetic algorithms; we show that we can enhance such methods by inducing a library of abstract textual concepts. Our algorithm, called LaSR, uses zero-shot queries to a large language model (LLM) to discover and evolve concepts occurring in known high-performing hypotheses. We discover new hypotheses using a mix of standard evolutionary steps and LLM-guided steps (obtained through zero-shot LLM queries) conditioned on discovered concepts. Once discovered, hypotheses are used in a new round of concept abstraction and evolution. We validate LaSR on the Feynman equations, a popular SR benchmark, as well as a set of synthetic tasks. On these benchmarks, LaSR substantially outperforms a variety of state-of-the-art SR approaches based on deep learning and evolutionary algorithms. Moreover, we show that LaSR can be used to discover a novel and powerful scaling law for LLMs.

Related papers

Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey [69.45421620616486]
This work presents the first structured taxonomy and analysis of discrete tokenization methods designed for large language models (LLMs). We categorize 8 representative VQ variants that span classical and modern paradigms and analyze their algorithmic principles, training dynamics, and integration challenges with LLM pipelines. We identify key challenges including codebook collapse, unstable gradient estimation, and modality-specific encoding constraints.
arXiv Detail & Related papers (2025-07-21T10:52:14Z)
Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations. We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability. We introduce a new SAE training algorithm based on "bias adaptation", a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z)
LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models [20.800445482814958]
Large Language Models (LLMs) have gained interest for their potential to leverage embedded scientific knowledge for hypothesis generation. Existing benchmarks often rely on common equations that are susceptible to memorization by LLMs, leading to inflated performance metrics that do not reflect discovery. In this paper, we introduce LLM-SRBench, a comprehensive benchmark with 239 challenging problems across four scientific domains. Our benchmark comprises two main categories: LSR-Transform, which transforms common physical models into less common mathematical representations to test reasoning beyond memorized forms, and LSR-Synth, which introduces synthetic, discovery-driven problems requiring data-driven reasoning.
arXiv Detail & Related papers (2025-04-14T17:00:13Z)
LLM4ED: Large Language Models for Automatic Equation Discovery [0.8644909837301149]
We introduce a new framework that utilizes natural language-based prompts to guide large language models in automatically mining governing equations from data. Specifically, we first utilize the generation capability of LLMs to generate diverse equations in string form, and then evaluate the generated equations based on observations. Experiments are extensively conducted on both partial differential equations and ordinary differential equations.
arXiv Detail & Related papers (2024-05-13T14:03:49Z)
In-Context Symbolic Regression: Leveraging Large Language Models for Function Discovery [5.2387832710686695]
In this work, we introduce the first comprehensive framework that utilizes Large Language Models (LLMs) for the task of Symbolic Regression. We propose In-Context Symbolic Regression (ICSR), an SR method which iteratively refines a functional form with an LLM and determines its coefficients with an external optimizer. Our findings reveal that LLMs are able to successfully find symbolic equations that fit the given data, matching or outperforming the overall performance of the best SR baselines on four popular benchmarks.
arXiv Detail & Related papers (2024-04-29T20:19:25Z)
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models [17.64574496035502]
Traditional methods of equation discovery, known as symbolic regression, largely focus on extracting equations from data alone. We introduce LLM-SR, a novel approach that leverages the scientific knowledge and robust code generation capabilities of Large Language Models. We demonstrate LLM-SR's effectiveness across three diverse scientific domains, where it discovers physically accurate equations.
arXiv Detail & Related papers (2024-04-29T03:30:06Z)
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal [49.24054920683246]
Large language models (LLMs) suffer from catastrophic forgetting during continual learning. We propose a framework called Self-Synthesized Rehearsal (SSR) that uses the LLM to generate synthetic instances for rehearsal.
arXiv Detail & Related papers (2024-03-02T16:11:23Z)
Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control [66.78146440275093]
Learned sparse retrieval (LSR) is a family of neural methods that encode queries and documents into sparse lexical vectors. We explore the application of LSR to the multi-modal domain, with a focus on text-image retrieval. Current approaches like LexLIP and STAIR require complex multi-step training on massive datasets. Our proposed approach efficiently transforms dense vectors from a frozen dense model into sparse lexical vectors.
arXiv Detail & Related papers (2024-02-27T14:21:56Z)
How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback. Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities. We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows retrieval models (RMs) to expand knowledge in queries using LLM-generated knowledge collections. InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
Learning Neural Network Quantum States with the Linear Method [0.0]
We show that the linear method (LM) can be used successfully for the optimization of complex-valued neural network quantum states. We compare the LM to the state-of-the-art stochastic reconfiguration (SR) algorithm and find that the LM requires up to an order of magnitude fewer iterations for convergence.
arXiv Detail & Related papers (2021-04-22T12:18:33Z)
Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks. We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator. To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
Towards Understanding Label Smoothing [36.54164997035046]
Label smoothing regularization (LSR) has had great success in training deep neural networks. We show that an appropriate LSR can help to speed up convergence by reducing the variance. We propose a simple yet effective strategy, namely the Two-Stage LAbel smoothing algorithm (TSLA).
arXiv Detail & Related papers (2020-06-20T20:36:17Z)
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning [134.77207192945053]
Prior methods learn the neural-symbolic models using reinforcement learning approaches. We introduce the grammar model as a symbolic prior to bridge neural perception and symbolic reasoning. We propose a novel back-search algorithm which mimics the top-down human-like learning procedure to propagate the error.
arXiv Detail & Related papers (2020-06-11T17:42:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.