MLLM-SR: Conversational Symbolic Regression base Multi-Modal Large Language Models
- URL: http://arxiv.org/abs/2406.05410v1
- Date: Sat, 8 Jun 2024 09:17:54 GMT
- Title: MLLM-SR: Conversational Symbolic Regression base Multi-Modal Large Language Models
- Authors: Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Shu Wei, Yusong Deng,
- Abstract summary: MLLM-SR is a conversational symbolic regression method that can generate expressions that meet the requirements simply by describing the requirements with natural language instructions.
We experimentally demonstrate that MLLM-SR can well understand the prior knowledge we add to the natural language instructions.
- Score: 13.136507215114722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Formulas are the language of communication between humans and nature. It is an important research topic of artificial intelligence to find expressions from observed data to reflect the relationship between each variable in the data, which is called a symbolic regression problem. The existing symbolic regression methods directly generate expressions according to the given observation data, and we cannot require the algorithm to generate expressions that meet specific requirements according to the known prior knowledge. For example, the expression needs to contain $\sin$ or be symmetric, and so on. Even if it can, it often requires very complex operations, which is very inconvenient. In this paper, based on multi-modal large language models, we propose MLLM-SR, a conversational symbolic regression method that can generate expressions that meet the requirements simply by describing the requirements with natural language instructions. By experimenting on the Nguyen dataset, we can demonstrate that MLLM-SR leads the state-of-the-art baselines in fitting performance. More notably, we experimentally demonstrate that MLLM-SR can well understand the prior knowledge we add to the natural language instructions. Moreover, the addition of prior knowledge can effectively guide MLLM-SR to generate correct expressions.
Related papers
- Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML)
VML constrains the parameter space to be human-interpretable natural language.
We conduct several studies to empirically evaluate the effectiveness of VML.
arXiv Detail & Related papers (2024-06-06T17:59:56Z) - What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages [78.1866280652834]
Large language models (LM) are distributions over strings.
We investigate the learnability of regular LMs (RLMs) by RNN and Transformer LMs.
We find that the complexity of the RLM rank is strong and significant predictors of learnability for both RNNs and Transformers.
arXiv Detail & Related papers (2024-06-06T17:34:24Z) - In-Context Symbolic Regression: Leveraging Large Language Models for Function Discovery [5.2387832710686695]
In this work, we introduce the first comprehensive framework that utilizes Large Language Models (LLMs) for the task of Symbolic Regression.
We propose In-Context Symbolic Regression (ICSR), an SR method which iteratively refines a functional form with an external LLM and determines its coefficients with an external LLM.
Our findings reveal that LLMs are able to successfully find symbolic equations that fit the given data, matching or outperforming the overall performance of the best SR baselines on four popular benchmarks.
arXiv Detail & Related papers (2024-04-29T20:19:25Z) - LLM-SR: Scientific Equation Discovery via Programming with Large Language Models [17.64574496035502]
Traditional methods of equation discovery, known as symbolic regression, largely focus on extracting equations from data alone.
We introduce LLM-SR, a novel approach that leverages the scientific knowledge and robust code generation capabilities of Large Language Models.
We demonstrate LLM-SR's effectiveness across three diverse scientific domains, where it discovers physically accurate equations.
arXiv Detail & Related papers (2024-04-29T03:30:06Z) - Linearity of Relation Decoding in Transformer Language Models [82.47019600662874]
Much of the knowledge encoded in transformer language models (LMs) may be expressed in terms of relations.
We show that, for a subset of relations, this computation is well-approximated by a single linear transformation on the subject representation.
arXiv Detail & Related papers (2023-08-17T17:59:19Z) - Explaining Emergent In-Context Learning as Kernel Regression [61.57151500616111]
Large language models (LLMs) have initiated a paradigm shift in transfer learning.
In this paper, we investigate the reason why a transformer-based language model can accomplish in-context learning after pre-training.
We find that during ICL, the attention and hidden features in LLMs match the behaviors of a kernel regression.
arXiv Detail & Related papers (2023-05-22T06:45:02Z) - Augmented Language Models: a Survey [55.965967655575454]
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools.
We refer to them as Augmented Language Models (ALMs)
The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z) - Learning to Ask Conversational Questions by Optimizing Levenshtein
Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimize the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.