Related papers: LLM-Consensus: Multi-Agent Debate for Visual Misinformation Detection

Related papers

CMIE: Combining MLLM Insights with External Evidence for Explainable Out-of-Context Misinformation Detection [4.506980868306549]
We propose CMIE, a novel framework for detecting out-of-context (OOC) misinformation.<n>CMIE identifies the underlying coexistence between images and text, and selectively utilizes relevant evidence to enhance misinformation detection.
arXiv Detail & Related papers (2025-05-29T13:56:21Z)
IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis [60.32962597618861]
IDA-Bench is a novel benchmark evaluating large language models in multi-round interactive scenarios.<n>Agent performance is judged by comparing its final numerical output to the human-derived baseline.<n>Even state-of-the-art coding agents (like Claude-3.7-thinking) succeed on 50% of the tasks, highlighting limitations not evident in single-turn tests.
arXiv Detail & Related papers (2025-05-23T09:37:52Z)
Bridging Expertise Gaps: The Role of LLMs in Human-AI Collaboration for Cybersecurity [17.780795900414716]
This study investigates whether large language models (LLMs) can function as intelligent collaborators to bridge expertise gaps in cybersecurity decision-making.<n>We find that human-AI collaboration improves task performance,reducing false positives in phishing detection and false negatives in intrusion detection.
arXiv Detail & Related papers (2025-05-06T04:47:52Z)
TAMO:Fine-Grained Root Cause Analysis via Tool-Assisted LLM Agent with Multi-Modality Observation Data [33.5606443790794]
Large language models (LLMs) have made breakthroughs in contextual inference and domain knowledge integration. We propose a tool-assisted LLM agent with multi-modality observation data, namely TAMO, for fine-grained root cause analysis.
arXiv Detail & Related papers (2025-04-29T06:50:48Z)
Knowledge-Aware Iterative Retrieval for Multi-Agent Systems [0.0]
We introduce a novel large language model (LLM)-driven agent framework. It iteratively refines queries and filters contextual evidence by leveraging dynamically evolving knowledge. The proposed system supports both competitive and collaborative sharing of updated context.
arXiv Detail & Related papers (2025-03-17T15:27:02Z)
CRAT: A Multi-Agent Framework for Causality-Enhanced Reflective and Retrieval-Augmented Translation with Large Language Models [59.8529196670565]
CRAT is a novel multi-agent translation framework that leverages RAG and causality-enhanced self-reflection to address translation challenges. Our results show that CRAT significantly improves translation accuracy, particularly in handling context-sensitive terms and emerging vocabulary.
arXiv Detail & Related papers (2024-10-28T14:29:11Z)
RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks. Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs. In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
Audit-LLM: Multi-Agent Collaboration for Log-based Insider Threat Detection [16.154903877808795]
Audit-LLM is a multi-agent log-based insider threat detection framework comprising three collaborative agents. We propose a pair-wise Evidence-based Multi-agent Debate (EMAD) mechanism, where two independent Executors iteratively refine their conclusions through reasoning exchange to reach a consensus.
arXiv Detail & Related papers (2024-08-12T11:33:45Z)
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains [54.117238759317004]
Massive Multitask Agent Understanding (MMAU) benchmark features comprehensive offline tasks that eliminate the need for complex environment setups. It evaluates models across five domains, including Tool-use, Directed Acyclic Graph (DAG) QA, Data Science and Machine Learning coding, Contest-level programming and Mathematics. With a total of 20 meticulously designed tasks encompassing over 3K distinct prompts, MMAU provides a comprehensive framework for evaluating the strengths and limitations of LLM agents.
arXiv Detail & Related papers (2024-07-18T00:58:41Z)
RAG-based Crowdsourcing Task Decomposition via Masked Contrastive Learning with Prompts [21.69333828191263]
We propose a retrieval-augmented generation-based crowdsourcing framework that reimagines task decomposition (TD) as event detection from the perspective of natural language understanding. We present a Prompt-Based Contrastive learning framework for TD (PBCT), which incorporates a prompt-based trigger detector to overcome dependence. Experiment results demonstrate the competitiveness of our method in both supervised and zero-shot detection.
arXiv Detail & Related papers (2024-06-04T08:34:19Z)
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM [83.6663322930814]
We argue that converging visual context acquisition and logical reasoning is pivotal for tackling visual reasoning tasks. We propose an innovative multimodal CoT framework, termed Cantor, characterized by a perception-decision architecture. Our experiments demonstrate the efficacy of the proposed framework, showing significant improvements in multimodal CoT performance.
arXiv Detail & Related papers (2024-04-24T17:59:48Z)
SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection [18.356648843815627]
Out-of-context (OOC) misinformation is one of the easiest and most effective ways to mislead audiences. Current methods focus on assessing image-text consistency but lack convincing explanations for their judgments. We introduce SNIFFER, a novel multimodal large language model specifically engineered for OOC misinformation detection and explanation.
arXiv Detail & Related papers (2024-03-05T18:04:59Z)
Beyond the Known: Investigating LLMs Performance on Out-of-Domain Intent Detection [34.135738700682055]
This paper conducts a comprehensive evaluation of large language models (LLMs) represented by ChatGPT. We find that LLMs exhibit strong zero-shot and few-shot capabilities, but is still at a disadvantage compared to models fine-tuned with full resource.
arXiv Detail & Related papers (2024-02-27T07:02:10Z)
Large Multimodal Agents: A Survey [78.81459893884737]
Large language models (LLMs) have achieved superior performance in powering text-based AI agents. There is an emerging research trend focused on extending these LLM-powered AI agents into the multimodal domain. This review aims to provide valuable insights and guidelines for future research in this rapidly evolving field.
arXiv Detail & Related papers (2024-02-23T06:04:23Z)
LEMMA: Towards LVLM-Enhanced Multimodal Misinformation Detection with External Knowledge Augmentation [58.524237916836164]
We propose LEMMA: LVLM-Enhanced Multimodal Misinformation Detection with External Knowledge Augmentation. Our method improves the accuracy over the top baseline LVLM by 7% and 13% on Twitter and Fakeddit datasets respectively.
arXiv Detail & Related papers (2024-02-19T08:32:27Z)
Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System [16.830182915504555]
Multi-agent debate system (MAD) imitates the process of human discussion in pursuit of truth. It is challenging to make various agents perform right and highly consistent cognition due to their limited and different knowledge backgrounds. We propose a novel underlineMulti-underlineAgent underlineDebate with underlineKnowledge-underlineEnhanced framework to promote the system to find the solution.
arXiv Detail & Related papers (2023-12-08T06:22:12Z)
From Chaos to Clarity: Claim Normalization to Empower Fact-Checking [57.024192702939736]
Claim Normalization (aka ClaimNorm) aims to decompose complex and noisy social media posts into more straightforward and understandable forms. We propose CACN, a pioneering approach that leverages chain-of-thought and claim check-worthiness estimation. Our experiments demonstrate that CACN outperforms several baselines across various evaluation measures.
arXiv Detail & Related papers (2023-10-22T16:07:06Z)
Stance Detection with Collaborative Role-Infused LLM-Based Agents [39.75103353173015]
Stance detection is vital for content analysis in web and social media research. However, stance detection requires advanced reasoning to infer authors' implicit viewpoints. We design a three-stage framework in which LLMs are designated distinct roles. We achieve state-of-the-art performance across multiple datasets.
arXiv Detail & Related papers (2023-10-16T14:46:52Z)
Concise and Organized Perception Facilitates Reasoning in Large Language Models [31.238220405009617]
Exploiting large language models (LLMs) to tackle reasoning has garnered growing attention. It still remains highly challenging to achieve satisfactory results in complex logical problems, characterized by plenty of premises within the context and requiring multi-hop reasoning. In this work, we first examine the mechanism from the perspective of information flow and reveal that LLMs confront difficulties akin to human-like cognitive biases when dealing with disordered and irrelevant content in reasoning tasks.
arXiv Detail & Related papers (2023-10-05T04:47:49Z)
Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks. Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment. We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many down-stream tasks such as open-domain question answering (QA) We propose an information retrieval pipeline that uses entity/event linking model and query decomposition model to focus more accurately on different information units of the query. We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z)
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate [85.3444184685235]
We propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution. Our framework encourages divergent thinking in LLMs which would be helpful for tasks that require deep levels of contemplation.
arXiv Detail & Related papers (2023-05-30T15:25:45Z)
Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections. InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.