Symbol-LLM: Towards Foundational Symbol-centric Interface For Large
Language Models
- URL: http://arxiv.org/abs/2311.09278v2
- Date: Sun, 18 Feb 2024 06:24:12 GMT
- Title: Symbol-LLM: Towards Foundational Symbol-centric Interface For Large
Language Models
- Authors: Fangzhi Xu, Zhiyong Wu, Qiushi Sun, Siyu Ren, Fei Yuan, Shuai Yuan,
Qika Lin, Yu Qiao, Jun Liu
- Abstract summary: Injecting a collection of symbolic data directly into the training of Large Language Models can be problematic.
In this work, we tackle these challenges from both a data and framework perspective and introduce Symbol-LLM series models.
Extensive experiments on both symbol- and NL-centric tasks demonstrate the balanced and superior performances of Symbol-LLM series models.
- Score: 41.91490484827197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although Large Language Models (LLMs) demonstrate remarkable ability in
processing and generating human-like text, they do have limitations when it
comes to comprehending and expressing world knowledge that extends beyond the
boundaries of natural language (e.g., chemical molecular formulas). Injecting a
collection of symbolic data directly into the training of LLMs can be
problematic, as it disregards the synergies among different symbolic families
and overlooks the need for a balanced mixture of natural and symbolic data. In
this work, we tackle these challenges from both a data and framework
perspective and introduce Symbol-LLM series models. First, we curated a data
collection consisting of 34 tasks and incorporating approximately 20 distinct
symbolic families, intending to capture the interrelations and foster synergies
between symbols. Then, a two-stage tuning framework injects this symbolic
knowledge without sacrificing general language ability. Extensive
experiments on both symbol- and NL-centric tasks demonstrate the balanced and
superior performances of Symbol-LLM series models. The project page is
https://xufangzhi.github.io/symbol-llm-page/.
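As a rough illustration of the balanced-mixture idea the abstract describes, the sketch below assembles a two-stage training mixture of symbolic and natural-language data. The stage ratios, pool contents, and sampling scheme are illustrative assumptions, not values from the paper.

```python
# Illustrative sketch of a two-stage data mixture for symbol injection.
# The ratios and example tasks below are assumptions for illustration.
import random

def build_mixture(symbolic_pool, nl_pool, symbolic_ratio, size, seed=0):
    """Sample a training mixture with a given symbolic/NL ratio."""
    rng = random.Random(seed)
    n_sym = int(size * symbolic_ratio)
    mixture = rng.choices(symbolic_pool, k=n_sym) + rng.choices(nl_pool, k=size - n_sym)
    rng.shuffle(mixture)
    return mixture

symbolic_pool = ["SQL: SELECT ...", "FOL: forall x ...", "SMILES: C1=CC=CC=C1"]
nl_pool = ["Summarize the article ...", "Answer the question ..."]

# Stage 1: emphasize symbolic data to inject symbolic knowledge.
stage1 = build_mixture(symbolic_pool, nl_pool, symbolic_ratio=0.8, size=1000)
# Stage 2: rebalance toward natural language to preserve general ability.
stage2 = build_mixture(symbolic_pool, nl_pool, symbolic_ratio=0.3, size=1000)
```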
Related papers
- Investigating Symbolic Capabilities of Large Language Models [16.88906206735967]
This study aims to bridge the gap by rigorously evaluating Large Language Models (LLMs) on a series of symbolic tasks.
Our analysis encompasses eight LLMs, including four enterprise-grade and four open-source models, of which three have been pre-trained on mathematical tasks.
The findings reveal a significant decline in LLMs' performance on context-free and context-sensitive symbolic tasks as the complexity, represented by the number of symbols, increases.
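To make the complexity axis concrete, here is a hypothetical harness that scales a symbolic task by its number of symbols; the string-reversal task and sizes are stand-ins for illustration, not the study's actual benchmarks.

```python
# Hypothetical harness scaling symbolic-task difficulty by symbol count.
import random
import string

def make_reversal_item(n_symbols, rng):
    """Build one (prompt, gold answer) pair with n_symbols symbols."""
    s = "".join(rng.choices(string.ascii_lowercase, k=n_symbols))
    return f"Reverse the following string: {s}", s[::-1]

rng = random.Random(0)
for n in (4, 16, 64):  # difficulty grows with the number of symbols
    prompt, gold = make_reversal_item(n, rng)
    # accuracy = evaluate(llm, prompt, gold)  # model call omitted here
```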
arXiv Detail & Related papers (2024-05-21T21:24:34Z)
- MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection [3.049887057143419]
In Natural Language Generation (NLG), contemporary Large Language Models (LLMs) face several challenges.
These challenges often lead to the models producing "hallucinations": fluent but factually unsupported output.
The SHROOM challenge focuses on automatically identifying these hallucinations in the generated text.
arXiv Detail & Related papers (2024-03-01T20:31:10Z)
- The Role of Foundation Models in Neuro-Symbolic Learning and Reasoning [54.56905063752427]
Neuro-Symbolic AI (NeSy) holds promise to ensure the safe deployment of AI systems.
Existing pipelines that train the neural and symbolic components sequentially require extensive labelling.
A new architecture, NeSyGPT, fine-tunes a vision-language foundation model to extract symbolic features from raw data.
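A minimal sketch of this division of labor, assuming a hypothetical `vlm_extract` stand-in for the fine-tuned vision-language model:

```python
# Hedged sketch of a neuro-symbolic pipeline in the spirit of NeSyGPT:
# a vision-language model maps raw images to discrete symbols, and a
# separate symbolic program reasons over them.
def vlm_extract(image) -> list[str]:
    """Hypothetical fine-tuned VLM call; returns discrete symbols."""
    # In practice this would query the fine-tuned foundation model.
    return ["digit_3", "digit_5"]  # hard-coded for illustration

def symbolic_sum(symbols: list[str]) -> int:
    # The symbolic component reasons over extracted symbols,
    # needing no labels for the intermediate symbols themselves.
    return sum(int(s.removeprefix("digit_")) for s in symbols)

print(symbolic_sum(vlm_extract(image=None)))  # -> 8
```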
arXiv Detail & Related papers (2024-02-02T20:33:14Z)
- Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language Conversion for Language Models [16.265409100706584]
Symbols play important roles in various tasks such as abstract reasoning, chemical property prediction, and table question answering.
Despite impressive natural language comprehension capabilities, large language models' reasoning abilities for symbols remain inadequate.
We propose symbol-to-language (S2L), a tuning-free method that enables large language models to solve symbol-related problems with information expressed in natural language.
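A minimal sketch of the S2L idea under assumed inputs: verbalize the symbolic data into natural language, then prompt any LLM without tuning. The table-verbalization rule and example are illustrative, not the paper's implementation.

```python
# Sketch of symbol-to-language (S2L): verbalize symbols before prompting.
def verbalize_table(rows: list[dict]) -> str:
    """Turn a symbolic table into plain natural-language sentences."""
    return " ".join(
        f"Row {i + 1}: " + ", ".join(f"{k} is {v}" for k, v in row.items()) + "."
        for i, row in enumerate(rows)
    )

rows = [{"name": "H2O", "boiling_point_C": 100}]
question = "Which substance boils at 100 C?"
prompt = verbalize_table(rows) + " " + question  # pass to any LLM, no tuning
```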
arXiv Detail & Related papers (2024-01-22T07:07:06Z)
- Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning [58.5857133154749]
We propose a new symbolic system with broad-coverage symbols and rational rules.
We leverage recent advances in LLMs to approximate these two ideal properties.
Our method shows superiority in extensive activity understanding tasks.
arXiv Detail & Related papers (2023-11-29T05:27:14Z)
- Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition [57.60390958736775]
We propose a simple but efficient method to enhance semantic interaction learning (SIL).
We first construct a semantic graph based on the statistical symbol co-occurrence probabilities.
Then we design a semantic aware module (SAM), which projects the visual and classification feature into semantic space.
Our method achieves better recognition performance than prior arts on both CROHME and HME100K datasets.
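As a sketch of the first step, symbol co-occurrence probabilities can be estimated from label sequences like so; the toy corpus and edge threshold are assumptions, not the paper's settings.

```python
# Sketch: build a semantic graph from symbol co-occurrence statistics.
from collections import Counter
from itertools import combinations

labels = [["x", "^", "2", "+", "y"], ["x", "+", "y"], ["y", "^", "2"]]
pair_counts = Counter()
for seq in labels:
    # Count each unordered symbol pair once per expression.
    pair_counts.update(combinations(sorted(set(seq)), 2))

# Edge weight = empirical co-occurrence probability; keep frequent pairs.
n = len(labels)
edges = {pair: c / n for pair, c in pair_counts.items() if c / n >= 0.5}
```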
arXiv Detail & Related papers (2023-08-21T06:23:41Z)
- Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search [63.3745291252038]
We propose DiffSES, a novel symbolic learning approach that discovers discrete symbolic policies.
By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions.
Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods.
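A toy example of the kind of object-level symbolic policy such a search might produce; the expression itself is invented for illustration.

```python
# Toy object-level symbolic policy (e.g., for a Pong-like paddle task).
def symbolic_policy(ball_x: float, paddle_x: float) -> int:
    """Move toward the ball: -1 = left, 0 = stay, 1 = right."""
    diff = ball_x - paddle_x
    return 0 if abs(diff) < 0.05 else (1 if diff > 0 else -1)

action = symbolic_policy(ball_x=0.7, paddle_x=0.4)  # -> 1 (move right)
```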
arXiv Detail & Related papers (2022-12-30T17:50:54Z)
- Hidden Schema Networks [3.4123736336071864]
We introduce a novel neural language model that enforces, via inductive biases, explicit relational structures.
The model encodes sentences into sequences of symbols, which correspond to nodes visited by biased random walkers.
We show that the model is able to uncover ground-truth graphs from artificially generated datasets of random token sequences.
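A minimal sketch of the biased-random-walk encoding, with an assumed toy graph and bias weights (in the model these would come from a learned encoder):

```python
# Sketch: encode a sentence as a symbol sequence via a biased random walk.
import random

graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
bias = {"A": 0.2, "B": 0.5, "C": 0.3}  # assumed; learned in the model

def biased_walk(start: str, steps: int, rng: random.Random) -> list[str]:
    node, path = start, [start]
    for _ in range(steps):
        nbrs = graph[node]
        node = rng.choices(nbrs, weights=[bias[n] for n in nbrs], k=1)[0]
        path.append(node)
    return path

symbols = biased_walk("A", steps=4, rng=random.Random(0))
```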
arXiv Detail & Related papers (2022-07-08T09:26:19Z)
- Drawing out of Distribution with Neuro-Symbolic Generative Models [49.79371715591122]
Drawing out of Distribution is a neuro-symbolic generative model of stroke-based drawing.
DooD operates directly on images and requires neither supervision nor expensive test-time inference.
We evaluate DooD on its ability to generalise across both data and tasks.
arXiv Detail & Related papers (2022-06-03T21:40:22Z)