Related papers: RAM-SD: Retrieval-Augmented Multi-agent framework for Sarcasm Detection

RAM-SD: Retrieval-Augmented Multi-agent framework for Sarcasm Detection

URL: http://arxiv.org/abs/2601.17002v1
Date: Wed, 14 Jan 2026 03:19:40 GMT
Title: RAM-SD: Retrieval-Augmented Multi-agent framework for Sarcasm Detection
Authors: Ziyang Zhou, Ziqi Liu, Yan Wang, Yiming Lin, Yangbin Chen,
Abstract summary: RAM-SD is a Retrieval-Augmented Multi-Agent framework for Sarcasm Detection.<n>It operates through four stages: (1) contextual retrieval grounds the query in both sarcastic and non-sarcastic exemplars; (2) a meta-planner classifies the sarcasm type and selects an optimal reasoning plan from a predefined set; and (3) an ensemble of specialized agents performs complementary, multi-view analysis.<n> evaluated on four standard benchmarks, RAM-SD achieves a state-of-the-art Macro-F1 of 77.74%, outperforming the strong GPT-4o+CoC baseline by 7.01
Score: 17.814793753195723
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sarcasm detection remains a significant challenge due to its reliance on nuanced contextual understanding, world knowledge, and multi-faceted linguistic cues that vary substantially across different sarcastic expressions. Existing approaches, from fine-tuned transformers to large language models, apply a uniform reasoning strategy to all inputs, struggling to address the diverse analytical demands of sarcasm. These demands range from modeling contextual expectation violations to requiring external knowledge grounding or recognizing specific rhetorical patterns. To address this limitation, we introduce RAM-SD, a Retrieval-Augmented Multi-Agent framework for Sarcasm Detection. The framework operates through four stages: (1) contextual retrieval grounds the query in both sarcastic and non-sarcastic exemplars; (2) a meta-planner classifies the sarcasm type and selects an optimal reasoning plan from a predefined set; (3) an ensemble of specialized agents performs complementary, multi-view analysis; and (4) an integrator synthesizes these analyses into a final, interpretable judgment with a natural language explanation. Evaluated on four standard benchmarks, RAM-SD achieves a state-of-the-art Macro-F1 of 77.74%, outperforming the strong GPT-4o+CoC baseline by 7.01 points. Our framework not only sets a new performance benchmark but also provides transparent and interpretable reasoning traces, illuminating the cognitive processes behind sarcasm comprehension.

Related papers

Unifying Heterogeneous Multi-Modal Remote Sensing Detection Via Language-Pivoted Pretraining [59.2578488860426]
Heterogeneous multi-modal remote sensing object detection aims to accurately detect objects from diverse sensors.<n>Existing approaches largely adopt a late alignment paradigm, in which modality alignment and task-specific optimization are entangled during downstream fine-tuning.<n>We propose BabelRS, a unified language-pivoted pretraining framework that explicitly decouples modality alignment from downstream task learning.
arXiv Detail & Related papers (2026-03-02T11:38:12Z)
Connecting the Dots: Training-Free Visual Grounding via Agentic Reasoning [63.109585527799005]
GroundingAgent is a visual grounding framework that operates without task-specific fine-tuning.<n>It achieves an average zero-shot grounding accuracy of 65.1 % on widely-used benchmarks.<n>It also offers strong interpretability, transparently illustrating each reasoning step.
arXiv Detail & Related papers (2025-11-24T03:11:08Z)
SEVADE: Self-Evolving Multi-Agent Analysis with Decoupled Evaluation for Hallucination-Resistant Irony Detection [11.652782877761446]
We propose a novel **Self-**Ev**olving multi-agent **A**nalysis framework with **D**ecoupled **E**valuation for hallucination-resistant sarcasm detection.<n>Our framework achieves state-of-the-art performance, with average improvements of **6.75%** in Accuracy and **6.29%** in Macro-F1 score.
arXiv Detail & Related papers (2025-08-09T03:25:45Z)
PrismRAG: Boosting RAG Factuality with Distractor Resilience and Strategized Reasoning [57.89188317734747]
PrismRAG trains the model with distractor-aware QA pairs mixing gold evidence with subtle distractor passages.<n>It instills reasoning-centric habits that make the LLM plan, rationalize, and synthesize without relying on extensive human engineered instructions.
arXiv Detail & Related papers (2025-07-25T00:15:31Z)
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge.<n>It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts.<n>This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z)
IRONIC: Coherence-Aware Reasoning Chains for Multi-Modal Sarcasm Detection [7.739824004160998]
We present IRONIC, an in-context learning framework that leverages Multi-modal Coherence Relations to analyze referential, analogical and pragmatic image-text linkages.<n>Our experiments show that IRONIC achieves state-of-the-art performance on zero-shot Multi-modal Sarcasm Detection.
arXiv Detail & Related papers (2025-05-22T05:49:01Z)
Commander-GPT: Fully Unleashing the Sarcasm Detection Capability of Multi-Modal Large Language Models [10.47267683821842]
We propose an innovative multi-modal Commander-GPT framework for sarcasm detection.<n>Inspired by military strategy, we first decompose the sarcasm detection task into six distinct sub-tasks.<n>A central commander (decision-maker) then assigns the best-suited large language model to address each specific sub-task.<n>Our approach achieves state-of-the-art performance, with a 19.3% improvement in F1 score.
arXiv Detail & Related papers (2025-03-24T13:53:00Z)
Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation [69.01029651113386]
Embodied-RAG is a framework that enhances the model of an embodied agent with a non-parametric memory system.<n>At its core, Embodied-RAG's memory is structured as a semantic forest, storing language descriptions at varying levels of detail.<n>We demonstrate that Embodied-RAG effectively bridges RAG to the robotics domain, successfully handling over 250 explanation and navigation queries.
arXiv Detail & Related papers (2024-09-26T21:44:11Z)
SarcasmBench: Towards Evaluating Large Language Models on Sarcasm Understanding [19.412462224847086]
We present evaluations on six widely used benchmark datasets through different prompting approaches. GPT-4 consistently and significantly outperforms other LLMs across various prompting methods. Few-shot IO prompting method outperforms the other two methods: zero-shot IO and few-shot CoT.
arXiv Detail & Related papers (2024-08-21T03:59:51Z)
PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis [74.41260927676747]
This paper bridges the gaps by introducing a multimodal conversational Sentiment Analysis (ABSA) To benchmark the tasks, we construct PanoSent, a dataset annotated both manually and automatically, featuring high quality, large scale, multimodality, multilingualism, multi-scenarios, and covering both implicit and explicit sentiment elements. To effectively address the tasks, we devise a novel Chain-of-Sentiment reasoning framework, together with a novel multimodal large language model (namely Sentica) and a paraphrase-based verification mechanism.
arXiv Detail & Related papers (2024-08-18T13:51:01Z)
Multi-Modal Sarcasm Detection Based on Contrastive Attention Mechanism [7.194040730138362]
We construct a Contras-tive-Attention-based Sarcasm Detection (ConAttSD) model, which uses an inter-modality contrastive attention mechanism to extract contrastive features for an utterance. Our experiments on MUStARD, a benchmark multi-modal sarcasm dataset, demonstrate the effectiveness of the proposed ConAttSD model.
arXiv Detail & Related papers (2021-09-30T14:17:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.