Symbol-LLM: Towards Foundational Symbol-centric Interface For Large
Language Models
- URL: http://arxiv.org/abs/2311.09278v2
- Date: Sun, 18 Feb 2024 06:24:12 GMT
- Title: Symbol-LLM: Towards Foundational Symbol-centric Interface For Large
Language Models
- Authors: Fangzhi Xu, Zhiyong Wu, Qiushi Sun, Siyu Ren, Fei Yuan, Shuai Yuan,
Qika Lin, Yu Qiao, Jun Liu
- Abstract summary: Injecting a collection of symbolic data directly into the training of Large Language Models can be problematic.
In this work, we tackle these challenges from both a data and framework perspective and introduce Symbol-LLM series models.
Extensive experiments on both symbol- and NL-centric tasks demonstrate the balanced and superior performances of Symbol-LLM series models.
- Score: 41.91490484827197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although Large Language Models (LLMs) demonstrate remarkable ability in
processing and generating human-like text, they do have limitations when it
comes to comprehending and expressing world knowledge that extends beyond the
boundaries of natural language (e.g., chemical molecular formulas). Injecting a
collection of symbolic data directly into the training of LLMs can be
problematic, as it disregards the synergies among different symbolic families
and overlooks the need for a balanced mixture of natural and symbolic data. In
this work, we tackle these challenges from both a data and framework
perspective and introduce Symbol-LLM series models. First, we curated a data
collection consisting of 34 tasks and incorporating approximately 20 distinct
symbolic families, intending to capture the interrelations and foster synergies
between symbols. Then, a two-stage tuning framework injects this symbolic
knowledge without sacrificing general language ability. Extensive
experiments on both symbol- and NL-centric tasks demonstrate the balanced and
superior performances of Symbol-LLM series models. The project page is
https://xufangzhi.github.io/symbol-llm-page/.
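As a rough illustration of the balanced-mixture idea the abstract describes, the sketch below assembles a two-stage training mixture of symbolic and natural-language data. The stage ratios, pool contents, and sampling scheme are illustrative assumptions, not values from the paper.

```python
# Illustrative sketch of a two-stage data mixture for symbol injection.
# The ratios and example tasks below are assumptions for illustration.
import random

def build_mixture(symbolic_pool, nl_pool, symbolic_ratio, size, seed=0):
    """Sample a training mixture with a given symbolic/NL ratio."""
    rng = random.Random(seed)
    n_sym = int(size * symbolic_ratio)
    mixture = rng.choices(symbolic_pool, k=n_sym) + rng.choices(nl_pool, k=size - n_sym)
    rng.shuffle(mixture)
    return mixture

symbolic_pool = ["SQL: SELECT ...", "FOL: forall x ...", "SMILES: C1=CC=CC=C1"]
nl_pool = ["Summarize the article ...", "Answer the question ..."]

# Stage 1: emphasize symbolic data to inject symbolic knowledge.
stage1 = build_mixture(symbolic_pool, nl_pool, symbolic_ratio=0.8, size=1000)
# Stage 2: rebalance toward natural language to preserve general ability.
stage2 = build_mixture(symbolic_pool, nl_pool, symbolic_ratio=0.3, size=1000)
```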
Related papers
- Investigating Symbolic Capabilities of Large Language Models [16.88906206735967]
This study aims to bridge the gap by rigorously evaluating Large Language Models (LLMs) on a series of symbolic tasks.
Our analysis encompasses eight LLMs, including four enterprise-grade and four open-source models, of which three have been pre-trained on mathematical tasks.
The findings reveal a significant decline in LLMs' performance on context-free and context-sensitive symbolic tasks as the complexity, represented by the number of symbols, increases.
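To make the complexity axis concrete, here is a hypothetical harness that scales a symbolic task by its number of symbols; the string-reversal task and sizes are stand-ins for illustration, not the study's actual benchmarks.

```python
# Hypothetical harness scaling symbolic-task difficulty by symbol count.
import random
import string

def make_reversal_item(n_symbols, rng):
    """Build one (prompt, gold answer) pair with n_symbols symbols."""
    s = "".join(rng.choices(string.ascii_lowercase, k=n_symbols))
    return f"Reverse the following string: {s}", s[::-1]

rng = random.Random(0)
for n in (4, 16, 64):  # difficulty grows with the number of symbols
    prompt, gold = make_reversal_item(n, rng)
    # accuracy = evaluate(llm, prompt, gold)  # model call omitted here
```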
arXiv Detail & Related papers (2024-05-21T21:24:34Z)
- MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection [3.049887057143419]
In Natural Language Generation (NLG), contemporary Large Language Models (LLMs) face several challenges.
These challenges often lead to the models producing "hallucinations": fluent but factually unsupported output.
The SHROOM challenge focuses on automatically identifying these hallucinations in the generated text.
arXiv Detail & Related papers (2024-03-01T20:31:10Z)
- The Role of Foundation Models in Neuro-Symbolic Learning and Reasoning [54.56905063752427]
Neuro-Symbolic AI (NeSy) holds promise to ensure the safe deployment of AI systems.
Existing pipelines that train the neural and symbolic components sequentially require extensive labelling.
A new architecture, NeSyGPT, fine-tunes a vision-language foundation model to extract symbolic features from raw data.
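A minimal sketch of this division of labor, assuming a hypothetical `vlm_extract` stand-in for the fine-tuned vision-language model:

```python
# Hedged sketch of a neuro-symbolic pipeline in the spirit of NeSyGPT:
# a vision-language model maps raw images to discrete symbols, and a
# separate symbolic program reasons over them.
def vlm_extract(image) -> list[str]:
    """Hypothetical fine-tuned VLM call; returns discrete symbols."""
    # In practice this would query the fine-tuned foundation model.
    return ["digit_3", "digit_5"]  # hard-coded for illustration

def symbolic_sum(symbols: list[str]) -> int:
    # The symbolic component reasons over extracted symbols,
    # needing no labels for the intermediate symbols themselves.
    return sum(int(s.removeprefix("digit_")) for s in symbols)

print(symbolic_sum(vlm_extract(image=None)))  # -> 8
```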
arXiv Detail & Related papers (2024-02-02T20:33:14Z)
- Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language Conversion for Language Models [16.265409100706584]
Symbols play important roles in various tasks such as abstract reasoning, chemical property prediction, and table question answering.
Despite impressive natural language comprehension capabilities, large language models' reasoning abilities for symbols remain inadequate.
We propose symbol-to-language (S2L), a tuning-free method that enables large language models to solve symbol-related problems with information expressed in natural language.
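A minimal sketch of the S2L idea under assumed inputs: verbalize the symbolic data into natural language, then prompt any LLM without tuning. The table-verbalization rule and example are illustrative, not the paper's implementation.

```python
# Sketch of symbol-to-language (S2L): verbalize symbols before prompting.
def verbalize_table(rows: list[dict]) -> str:
    """Turn a symbolic table into plain natural-language sentences."""
    return " ".join(
        f"Row {i + 1}: " + ", ".join(f"{k} is {v}" for k, v in row.items()) + "."
        for i, row in enumerate(rows)
    )

rows = [{"name": "H2O", "boiling_point_C": 100}]
question = "Which substance boils at 100 C?"
prompt = verbalize_table(rows) + " " + question  # pass to any LLM, no tuning
```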
arXiv Detail & Related papers (2024-01-22T07:07:06Z)
- Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning [58.5857133154749]
We propose a new symbolic system with broad-coverage symbols and rational rules.
We leverage recent advances in LLMs to approximate these two ideal properties.
Our method shows superiority in extensive activity understanding tasks.
arXiv Detail & Related papers (2023-11-29T05:27:14Z)
- Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition [57.60390958736775]
We propose a simple but efficient method to enhance semantic interaction learning (SIL).
We first construct a semantic graph based on the statistical symbol co-occurrence probabilities.
Then we design a semantic aware module (SAM), which projects the visual and classification feature into semantic space.
Our method achieves better recognition performance than prior arts on both CROHME and HME100K datasets.
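As a sketch of the first step, symbol co-occurrence probabilities can be estimated from label sequences like so; the toy corpus and edge threshold are assumptions, not the paper's settings.

```python
# Sketch: build a semantic graph from symbol co-occurrence statistics.
from collections import Counter
from itertools import combinations

labels = [["x", "^", "2", "+", "y"], ["x", "+", "y"], ["y", "^", "2"]]
pair_counts = Counter()
for seq in labels:
    # Count each unordered symbol pair once per expression.
    pair_counts.update(combinations(sorted(set(seq)), 2))

# Edge weight = empirical co-occurrence probability; keep frequent pairs.
n = len(labels)
edges = {pair: c / n for pair, c in pair_counts.items() if c / n >= 0.5}
```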
arXiv Detail & Related papers (2023-08-21T06:23:41Z)
- Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search [63.3745291252038]
We propose DiffSES, a novel symbolic learning approach that discovers discrete symbolic policies.
By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions.
Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods.
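A toy example of the kind of object-level symbolic policy such a search might produce; the expression itself is invented for illustration.

```python
# Toy object-level symbolic policy (e.g., for a Pong-like paddle task).
def symbolic_policy(ball_x: float, paddle_x: float) -> int:
    """Move toward the ball: -1 = left, 0 = stay, 1 = right."""
    diff = ball_x - paddle_x
    return 0 if abs(diff) < 0.05 else (1 if diff > 0 else -1)

action = symbolic_policy(ball_x=0.7, paddle_x=0.4)  # -> 1 (move right)
```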
arXiv Detail & Related papers (2022-12-30T17:50:54Z)
- Hidden Schema Networks [3.4123736336071864]
We introduce a novel neural language model that enforces, via inductive biases, explicit relational structures.
The model encodes sentences into sequences of symbols, which correspond to nodes visited by biased random walkers.
We show that the model is able to uncover ground-truth graphs from artificially generated datasets of random token sequences.
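A minimal sketch of the biased-random-walk encoding, with an assumed toy graph and bias weights (in the model these would come from a learned encoder):

```python
# Sketch: encode a sentence as a symbol sequence via a biased random walk.
import random

graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
bias = {"A": 0.2, "B": 0.5, "C": 0.3}  # assumed; learned in the model

def biased_walk(start: str, steps: int, rng: random.Random) -> list[str]:
    node, path = start, [start]
    for _ in range(steps):
        nbrs = graph[node]
        node = rng.choices(nbrs, weights=[bias[n] for n in nbrs], k=1)[0]
        path.append(node)
    return path

symbols = biased_walk("A", steps=4, rng=random.Random(0))
```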
arXiv Detail & Related papers (2022-07-08T09:26:19Z)
- Drawing out of Distribution with Neuro-Symbolic Generative Models [49.79371715591122]
Drawing out of Distribution is a neuro-symbolic generative model of stroke-based drawing.
DooD operates directly on images and requires neither supervision nor expensive test-time inference.
We evaluate DooD on its ability to generalise across both data and tasks.
arXiv Detail & Related papers (2022-06-03T21:40:22Z)