LLM-Guided Semantic Relational Reasoning for Multimodal Intent Recognition
- URL: http://arxiv.org/abs/2509.01337v1
- Date: Mon, 01 Sep 2025 10:18:47 GMT
- Title: LLM-Guided Semantic Relational Reasoning for Multimodal Intent Recognition
- Authors: Qianrui Zhou, Hua Xu, Yifan Wang, Xinzhi Dong, Hanlei Zhang,
- Abstract summary: This paper proposes a novel method for understanding human intents from multimodal signals. The method harnesses the expansive knowledge of large language models (LLMs) to establish semantic foundations. Experiments on multimodal intent and dialogue act tasks demonstrate LGSRR's superiority over state-of-the-art methods.
- Score: 14.683883775425821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding human intents from multimodal signals is critical for analyzing human behaviors and enhancing human-machine interactions in real-world scenarios. However, existing methods exhibit limitations in their modality-level reliance, constraining relational reasoning over fine-grained semantics for complex intent understanding. This paper proposes a novel LLM-Guided Semantic Relational Reasoning (LGSRR) method, which harnesses the expansive knowledge of large language models (LLMs) to establish semantic foundations that boost smaller models' relational reasoning performance. Specifically, an LLM-based strategy is proposed to extract fine-grained semantics as guidance for subsequent reasoning, driven by a shallow-to-deep Chain-of-Thought (CoT) that autonomously uncovers, describes, and ranks semantic cues by their importance without relying on manually defined priors. Besides, we formally model three fundamental types of semantic relations grounded in logical principles and analyze their nuanced interplay to enable more effective relational reasoning. Extensive experiments on multimodal intent and dialogue act recognition tasks demonstrate LGSRR's superiority over state-of-the-art methods, with consistent performance gains across diverse semantic understanding scenarios. The complete data and code are available at https://github.com/thuiar/LGSRR.
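To make the extraction strategy concrete, here is a minimal sketch of the shallow-to-deep CoT step the abstract describes, in which an LLM uncovers, describes, and ranks semantic cues. The prompt wording and the `call_llm` client are illustrative assumptions, not the paper's actual template; see the linked repository for the real implementation.

```python
# Sketch of the shallow-to-deep CoT cue extraction described in the abstract.
# `call_llm` stands in for any chat-completion client; the prompt wording is
# an illustrative assumption, not the paper's template.

SHALLOW_TO_DEEP_COT = (
    "You are given one utterance described in three modalities.\n"
    "Step 1 (uncover): list every semantic cue present in each modality.\n"
    "Step 2 (describe): explain what each cue suggests about the speaker's intent.\n"
    "Step 3 (rank): order the cues from most to least important for the intent.\n"
    "Return only the ranked cues, one per line."
)

def extract_ranked_cues(call_llm, text, video_desc, audio_desc):
    """Ask the LLM to uncover, describe, and rank fine-grained semantic cues."""
    prompt = (f"{SHALLOW_TO_DEEP_COT}\n\n"
              f"Text: {text}\nVideo: {video_desc}\nAudio: {audio_desc}")
    reply = call_llm(prompt)
    return [line.strip("- ").strip() for line in reply.splitlines() if line.strip()]
```

The ranked cues would then serve as the semantic guidance that the smaller model's relational reasoning stage consumes.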
Related papers
- Fundamental Reasoning Paradigms Induce Out-of-Domain Generalization in Language Models [43.76842321707181]
In this study, we shed light on how the interplay between these core paradigms influences Large Language Model (LLM) reasoning. We first collect a new dataset of reasoning trajectories from symbolic tasks, each targeting one of the three fundamental paradigms. We then investigate effective ways of inducing these skills into LLMs.
arXiv Detail & Related papers (2026-02-09T13:51:48Z)
- Concept Component Analysis: A Principled Approach for Concept Extraction in LLMs [51.378834857406325]
Mechanistic interpretability seeks to mitigate these issues by extracting concepts from large language models. Sparse autoencoders (SAEs) have emerged as a popular approach for extracting interpretable and monosemantic concepts. We show that SAEs suffer from a fundamental theoretical ambiguity: a well-defined correspondence between LLM representations and human-interpretable concepts remains unclear.
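As background for the SAE approach this entry critiques, a minimal sparse autoencoder over model activations can be sketched as follows; the dictionary size, L1 coefficient, and ReLU encoder are standard but assumed choices, not details from the paper.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE over LLM activations: overcomplete dictionary + ReLU codes."""

    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model)

    def forward(self, x):
        codes = torch.relu(self.encoder(x))  # sparse, non-negative concept codes
        return self.decoder(codes), codes

def sae_loss(x, recon, codes, l1_coeff=1e-3):
    # reconstruction fidelity plus an L1 penalty that encourages sparse codes
    return ((recon - x) ** 2).mean() + l1_coeff * codes.abs().sum(-1).mean()
```

The paper's point is that many dictionaries minimize this loss equally well, so which code dimensions count as "concepts" is underdetermined.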
arXiv Detail & Related papers (2026-01-28T09:27:05Z)
- Multi-Path Collaborative Reasoning via Reinforcement Learning [54.8518809800168]
Chain-of-Thought (CoT) reasoning has significantly advanced the problem-solving capabilities of Large Language Models (LLMs). Recent methods attempt to address this by generating soft abstract tokens to enable reasoning in a continuous semantic space. We propose Multi-Path Perception Policy Optimization (M3PO), a novel reinforcement learning framework that explicitly injects collective insights into the reasoning process.
arXiv Detail & Related papers (2025-12-01T10:05:46Z)
- From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning [63.25540801694765]
Large Language Models (LLMs) demonstrate striking linguistic abilities, yet whether they strike the same balance between compression and meaning as humans remains unclear. We apply the Information Bottleneck principle to quantitatively compare how LLMs and humans navigate this compression-meaning trade-off.
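For reference, the Information Bottleneck objective this entry applies can be computed exactly for discrete toy distributions. The sketch below is a generic illustration: the Lagrangian form I(X;Z) - beta * I(Z;Y) is standard IB, and the variable names are our own, not taken from the paper.

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in nats from a joint distribution table p(a, b)."""
    joint = joint / joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())

def ib_objective(p_x, p_y_given_x, q_z_given_x, beta):
    """IB Lagrangian I(X;Z) - beta * I(Z;Y) for a stochastic encoder q(z|x)."""
    p_xz = p_x[:, None] * q_z_given_x                    # joint p(x, z)
    p_zy = q_z_given_x.T @ (p_x[:, None] * p_y_given_x)  # joint p(z, y)
    return mutual_information(p_xz) - beta * mutual_information(p_zy)
```

Lower I(X;Z) means stronger compression; higher I(Z;Y) means more preserved meaning, which is exactly the trade-off the paper measures.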
arXiv Detail & Related papers (2025-05-21T16:29:00Z)
- Boosting Neural Language Inference via Cascaded Interactive Reasoning [38.125341836302525]
Natural Language Inference (NLI) focuses on ascertaining the logical relationship between a given premise and hypothesis. This task presents significant challenges due to inherent linguistic features such as diverse phrasing, semantic complexity, and contextual nuances. We introduce the Cascaded Interactive Reasoning Network (CIRN), a novel architecture designed for deeper semantic comprehension in NLI.
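The exact CIRN layers are not given in this summary, but the interactive-reasoning pattern such NLI architectures build on is typically a cross-attention step in which each hypothesis token attends over the premise. The block below is a generic sketch of that pattern, with all dimensions assumed; it is not the CIRN layer itself.

```python
import torch
import torch.nn as nn

class InteractionBlock(nn.Module):
    """Generic premise-hypothesis cross-attention step (not the exact CIRN layer)."""

    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        self.cross = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, premise, hypothesis):
        # each hypothesis token gathers evidence from the premise tokens
        attended, _ = self.cross(hypothesis, premise, premise)
        return self.norm(hypothesis + attended)

# Cascading several such blocks deepens the interaction, in the spirit of
# CIRN's cascaded design.
```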
arXiv Detail & Related papers (2025-05-10T11:37:15Z)
- Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1) [66.51642638034822]
Reasoning is central to human intelligence, enabling structured problem-solving across diverse tasks. Recent advances in large language models (LLMs) have greatly enhanced their reasoning abilities in arithmetic, commonsense, and symbolic domains. This paper offers a concise yet insightful overview of reasoning techniques in both textual and multimodal LLMs.
arXiv Detail & Related papers (2025-04-04T04:04:56Z)
- Semantic Mastery: Enhancing LLMs with Advanced Natural Language Understanding [0.0]
The paper discusses state-of-the-art methodologies that equip large language models (LLMs) with more advanced NLU techniques. We analyze the use of structured knowledge graphs, retrieval-augmented generation (RAG), and fine-tuning strategies that align models with human-level understanding.
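Of the techniques listed, retrieval-augmented generation is the most mechanical to illustrate. Below is a minimal sketch, assuming documents have already been embedded and an LLM client consumes the final prompt; both are assumptions, not this paper's pipeline.

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=3):
    """Cosine-similarity retrieval over pre-embedded documents."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    top = np.argsort(d @ q)[::-1][:k]
    return [docs[i] for i in top]

def rag_prompt(question, passages):
    """Prepend retrieved passages so the model answers from evidence."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using the context.\nContext:\n{context}\nQ: {question}\nA:"
```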
arXiv Detail & Related papers (2025-04-01T04:12:04Z)
- EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration [60.47645731801866]
Large language models (LLMs) are increasingly leveraged as foundational backbones in advanced recommender systems. LLMs are pre-trained on linguistic semantics but must learn collaborative semantics from scratch via the LLM backbone. We propose EAGER-LLM, a decoder-only generative recommendation framework that integrates exogenous and endogenous behavioral and semantic information in a non-intrusive manner.
arXiv Detail & Related papers (2025-02-20T17:01:57Z)
- LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning [74.0242521818214]
This paper adopts an exploratory approach by introducing a controlled evaluation environment for analogical reasoning. We analyze the comparative dynamics of inductive, abductive, and deductive inference pipelines. We investigate advanced paradigms such as hypothesis selection, verification, and refinement, revealing their potential to scale up logical inference.
arXiv Detail & Related papers (2025-02-16T15:54:53Z)
- Do Large Language Models Advocate for Inferentialism? [0.0]
The emergence of large language models (LLMs) such as ChatGPT and Claude presents new challenges for philosophy of language. This paper explores Robert Brandom's inferential semantics as an alternative foundational framework for understanding these systems.
arXiv Detail & Related papers (2024-12-19T03:48:40Z)
- Prompt-based Logical Semantics Enhancement for Implicit Discourse Relation Recognition [4.7938839332508945]
We propose a Prompt-based Logical Semantics Enhancement (PLSE) method for Implicit Discourse Relation Recognition (IDRR).
Our method seamlessly injects knowledge relevant to discourse relation into pre-trained language models through prompt-based connective prediction.
Experimental results on PDTB 2.0 and CoNLL16 datasets demonstrate that our method achieves outstanding and consistent performance against the current state-of-the-art models.
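Prompt-based connective prediction is easy to reproduce in miniature: place a mask between the two arguments and score candidate connectives with a masked LM. The connective-to-relation mapping below is a simplified assumption; PDTB's real connective inventory is much larger, and PLSE additionally injects this signal during pre-training rather than using it zero-shot.

```python
from transformers import pipeline

# Simplified connective-to-relation mapping (real inventories are richer).
CONNECTIVES = {"because": "Contingency", "however": "Comparison",
               "then": "Temporal", "also": "Expansion"}

fill = pipeline("fill-mask", model="bert-base-uncased")

def predict_relation(arg1: str, arg2: str):
    """Score candidate connectives at a mask inserted between the arguments."""
    preds = fill(f"{arg1} [MASK] {arg2}", targets=list(CONNECTIVES))
    best = max(preds, key=lambda p: p["score"])
    return CONNECTIVES[best["token_str"]], best["score"]

print(predict_relation("It was raining hard.", "the match was cancelled."))
```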
arXiv Detail & Related papers (2023-11-01T08:38:08Z)
- Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems.
LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning.
We study single-task training, multi-task training, and a "chain-of-thought" knowledge distillation fine-tuning technique to assess model performance.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)
- Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple, yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs).
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process.
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
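Re2 is simple enough to show in full spirit: the input is presented twice before the model answers. The exact template wording below is an assumption; the paper's phrasing may differ.

```python
def re2_prompt(question: str) -> str:
    """Re-reading (Re2) prompt: the question is read twice before answering.

    Template wording is illustrative; it composes with thought-eliciting
    suffixes such as CoT's "Let's think step by step."
    """
    return (f"Q: {question}\n"
            f"Read the question again: {question}\n"
            "A: Let's think step by step.")
```

Because Re2 only rewrites the input, it stacks with other prompting methods rather than replacing them.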
arXiv Detail & Related papers (2023-09-12T14:36:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.