Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human
Activity Reasoning
- URL: http://arxiv.org/abs/2311.17365v1
- Date: Wed, 29 Nov 2023 05:27:14 GMT
- Title: Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human
Activity Reasoning
- Authors: Xiaoqian Wu, Yong-Lu Li, Jianhua Sun, Cewu Lu
- Abstract summary: We propose a new symbolic system with broad-coverage symbols and rational rules.
We leverage the recent advancement of LLMs as an approximation of the two ideal properties.
Our method shows superiority in extensive activity understanding tasks.
- Score: 58.5857133154749
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Human reasoning can be understood as a cooperation between the intuitive,
associative "System-1" and the deliberative, logical "System-2". For existing
System-1-like methods in visual activity understanding, it is crucial to
integrate System-2 processing to improve explainability, generalization, and
data efficiency. One possible path of activity reasoning is building a symbolic
system composed of symbols and rules, where one rule connects multiple symbols,
implying human knowledge and reasoning abilities. Previous methods have made
progress, but they are hampered by limited handcrafted symbols and limited
rules drawn from visual-based annotations, failing to cover the complex
patterns of activities and lacking compositional generalization. To overcome
these defects, we propose a new symbolic system with two ideal properties:
broad-coverage symbols and rational rules. Instantiating this symbolic system
by collecting massive human knowledge through manual annotation would be
expensive.
Instead, we leverage the recent advancement of LLMs (Large Language Models) as
an approximation of the two ideal properties, i.e., Symbols from Large Language
Models (Symbol-LLM). Then, given an image, visual contents are extracted and
checked as symbols, and activity semantics are reasoned out from the rules via
fuzzy logic calculation. Our method shows superiority in extensive
activity understanding tasks. Code and data are available at
https://mvig-rhos.com/symbol_llm.
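The pipeline described in the abstract — extract visual contents as symbols, then reason out activity semantics from rules via fuzzy logic — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the symbol names, confidence scores, rules, and the choice of the product t-norm are all assumptions.

```python
# Hypothetical sketch of rule-based activity reasoning with fuzzy logic.
# Symbols, scores, and rules below are illustrative, not from the paper.

def fuzzy_and(values):
    # Product t-norm: conjunction of fuzzy truth values in [0, 1].
    result = 1.0
    for v in values:
        result *= v
    return result

def fuzzy_or(values):
    # Probabilistic-sum t-conorm: disjunction of fuzzy truth values.
    result = 0.0
    for v in values:
        result = result + v - result * v
    return result

# Detected symbols with confidences from a (hypothetical) visual backend.
symbols = {"person": 0.95, "cup": 0.80, "hand_near_mouth": 0.70, "table": 0.60}

# Rules: each activity is implied by a conjunction of premise symbols.
rules = {
    "drinking": ["person", "cup", "hand_near_mouth"],
    "dining":   ["person", "table", "cup"],
}

def score_activities(symbols, rules):
    # Score each activity as the fuzzy conjunction of its premises;
    # a symbol that was not detected contributes confidence 0.
    return {
        activity: fuzzy_and([symbols.get(s, 0.0) for s in premises])
        for activity, premises in rules.items()
    }

scores = score_activities(symbols, rules)
```

With this t-norm, an activity's score degrades smoothly as any premise symbol becomes less certain, which is what makes the rule evaluation differentiable-friendly and tolerant of noisy detections.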
Related papers
- Can Large Language Models Understand Symbolic Graphics Programs? [136.5639211254501]
Symbolic graphics programs are popular in computer graphics.
We create a benchmark for the semantic visual understanding of symbolic graphics programs.
We find that LLMs considered stronger at reasoning generally perform better.
arXiv Detail & Related papers (2024-08-15T17:59:57Z)
- Take A Step Back: Rethinking the Two Stages in Visual Reasoning [57.16394309170051]
This paper revisits visual reasoning with a two-stage perspective.
It is more efficient to implement symbolization via separated encoders for different data domains while using a shared reasoner.
The proposed two-stage framework achieves impressive generalization ability on various visual reasoning tasks.
arXiv Detail & Related papers (2024-07-29T02:56:19Z)
- Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language Conversion for Language Models [16.265409100706584]
Symbols play important roles in various tasks such as abstract reasoning, chemical property prediction, and table question answering.
Despite impressive natural language comprehension capabilities, large language models' reasoning abilities for symbols remain inadequate.
We propose symbol-to-language (S2L), a tuning-free method that enables large language models to solve symbol-related problems with information expressed in natural language.
arXiv Detail & Related papers (2024-01-22T07:07:06Z)
- Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models [41.91490484827197]
Injecting a collection of symbolic data directly into the training of Large Language Models can be problematic.
In this work, we tackle these challenges from both a data and framework perspective and introduce Symbol-LLM series models.
Extensive experiments on both symbol- and NL-centric tasks demonstrate the balanced and superior performances of Symbol-LLM series models.
arXiv Detail & Related papers (2023-11-15T18:59:56Z)
- Generating by Understanding: Neural Visual Generation with Logical Symbol Groundings [26.134405924834525]
We propose a neurosymbolic learning approach, Abductive visual Generation (AbdGen), for integrating logic programming systems with neural visual generative models.
Results show that compared to the baseline approaches, AbdGen requires significantly less labeled data for symbol assignment.
AbdGen can effectively learn underlying logical generative rules from data, which is out of the capability of existing approaches.
arXiv Detail & Related papers (2023-10-26T15:00:21Z)
- Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search [63.3745291252038]
We propose DiffSES, a novel symbolic learning approach that discovers discrete symbolic policies.
By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions.
Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods.
arXiv Detail & Related papers (2022-12-30T17:50:54Z)
- Deep Symbolic Learning: Discovering Symbols and Rules from Perceptions [69.40242990198]
Neuro-Symbolic (NeSy) integration combines symbolic reasoning with Neural Networks (NNs) for tasks requiring perception and reasoning.
Most NeSy systems rely on continuous relaxation of logical knowledge, and no discrete decisions are made within the model pipeline.
We propose a NeSy system that learns NeSy-functions, i.e., the composition of a (set of) perception functions which map continuous data to discrete symbols, and a symbolic function over the set of symbols.
arXiv Detail & Related papers (2022-08-24T14:06:55Z)
- pix2rule: End-to-end Neuro-symbolic Rule Learning [84.76439511271711]
This paper presents a complete neuro-symbolic method for processing images into objects, learning relations and logical rules.
The main contribution is a differentiable layer in a deep learning architecture from which symbolic relations and rules can be extracted.
We demonstrate that our model scales beyond state-of-the-art symbolic learners and outperforms deep relational neural network architectures.
arXiv Detail & Related papers (2021-06-14T15:19:06Z)
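The differentiable rule layer described for pix2rule above — a layer from which symbolic relations and rules can be extracted — can be illustrated as a soft conjunction with learnable gates that decide which input predicates participate in a rule. This is a minimal sketch under assumed semantics (sigmoid gates and a product-style conjunction), not the paper's exact layer.

```python
# Hypothetical sketch of a differentiable rule (soft-AND) layer in the spirit
# of pix2rule; the gating formula is an illustrative choice, not the paper's.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def soft_and(predicates, weights):
    # Each learnable weight gates whether a predicate joins the conjunction.
    # With gate g in (0, 1), a gated-out predicate (g near 0) contributes a
    # neutral factor of 1.0:  term = 1 - g * (1 - p)
    result = 1.0
    for p, w in zip(predicates, weights):
        g = sigmoid(w)
        result *= 1.0 - g * (1.0 - p)
    return result

# Two input predicate truth values; a large positive weight means "part of
# the rule", a large negative weight means "ignored by the rule".
preds = [0.9, 0.2]
rule_out = soft_and(preds, [6.0, -6.0])  # effectively tracks the first predicate
```

Because the gates are continuous, the layer can be trained by gradient descent, and thresholding the learned weights afterwards recovers a discrete rule: the set of predicates whose gates saturated near 1.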
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.