MindScope: Exploring cognitive biases in large language models through Multi-Agent Systems
- URL: http://arxiv.org/abs/2410.04452v1
- Date: Sun, 6 Oct 2024 11:23:56 GMT
- Title: MindScope: Exploring cognitive biases in large language models through Multi-Agent Systems
- Authors: Zhentao Xie, Jiabao Zhao, Yilei Wang, Jinxin Shi, Yanhong Bai, Xingjiao Wu, Liang He
- Abstract summary: We introduce the 'MindScope' dataset, which distinctively integrates static and dynamic elements.
The static component comprises 5,170 open-ended questions spanning 72 cognitive bias categories.
The dynamic component leverages a rule-based, multi-agent communication framework to facilitate the generation of multi-round dialogues.
In addition, we introduce a multi-agent detection method applicable to a wide range of detection tasks, which integrates Retrieval-Augmented Generation (RAG), competitive debate, and a reinforcement learning-based decision module.
- Score: 12.245537894266803
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting cognitive biases in large language models (LLMs) is a fascinating task that aims to probe the existing cognitive biases within these models. Current methods for detecting cognitive biases in language models generally suffer from incomplete detection capabilities and a restricted range of detectable bias types. To address this issue, we introduce the 'MindScope' dataset, which distinctively integrates static and dynamic elements. The static component comprises 5,170 open-ended questions spanning 72 cognitive bias categories. The dynamic component leverages a rule-based, multi-agent communication framework to facilitate the generation of multi-round dialogues. This framework is flexible and readily adaptable for various psychological experiments involving LLMs. In addition, we introduce a multi-agent detection method applicable to a wide range of detection tasks, which integrates Retrieval-Augmented Generation (RAG), competitive debate, and a reinforcement learning-based decision module. This method has been shown to improve detection accuracy by as much as 35.10% compared to GPT-4. Code and appendix are available at https://github.com/2279072142/MindScope.
Related papers
- Unmasking Conversational Bias in AI Multiagent Systems [1.0705399532413618]
Biases that may arise in multi-agent systems involving generative models remain under-researched.
We present a framework designed to quantify biases within multi-agent systems of conversational Large Language Models.
The bias observed in the echo-chamber experiment remains undetected by current state-of-the-art bias detection methods.
arXiv Detail & Related papers (2025-01-24T09:10:02Z)
- Binary Code Similarity Detection via Graph Contrastive Learning on Intermediate Representations [52.34030226129628]
Binary Code Similarity Detection (BCSD) plays a crucial role in numerous fields, including vulnerability detection, malware analysis, and code reuse identification.
In this paper, we propose IRBinDiff, which mitigates compilation differences by leveraging LLVM-IR with higher-level semantic abstraction.
Our extensive experiments, conducted under varied compilation settings, demonstrate that IRBinDiff outperforms other leading BCSD methods in both One-to-one comparison and One-to-many search scenarios.
arXiv Detail & Related papers (2024-10-24T09:09:20Z)
- Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection [37.57355457749918]
We introduce a novel framework for zero-shot HOI detection using Conditional Multi-Modal Prompts, namely CMMP.
Unlike traditional prompt-learning methods, we propose learning decoupled vision and language prompts for interactiveness-aware visual feature extraction.
Experiments demonstrate the efficacy of our detector with conditional multi-modal prompts, outperforming previous state-of-the-art on unseen classes of various zero-shot settings.
arXiv Detail & Related papers (2024-08-05T14:05:25Z)
- FairMonitor: A Dual-framework for Detecting Stereotypes and Biases in Large Language Models [9.385390205833893]
We propose the FairMonitor framework and adopt a static-dynamic detection method for a comprehensive evaluation of stereotypes and biases in Large Language Models (LLMs).
The static component consists of a direct inquiry test, an implicit association test, and an unknown situation test, including 10,262 open-ended questions with 9 sensitive factors and 26 educational scenarios.
We utilize the multi-agent system to construct the dynamic scenarios for detecting subtle biases in more complex and realistic settings.
arXiv Detail & Related papers (2024-05-06T01:23:07Z)
- Toward Multi-class Anomaly Detection: Exploring Class-aware Unified Model against Inter-class Interference [67.36605226797887]
We introduce a Multi-class Implicit Neural representation Transformer for unified Anomaly Detection (MINT-AD).
By learning the multi-class distributions, the model generates class-aware query embeddings for the transformer decoder.
MINT-AD can project category and position information into a feature embedding space, further supervised by classification and prior probability loss functions.
arXiv Detail & Related papers (2024-03-21T08:08:31Z)
- HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs) [0.09208007322096533]
We present HuntGPT, a specialized intrusion detection dashboard applying a Random Forest classifier.
The paper delves into the system's architecture, components, and technical accuracy, assessed through Certified Information Security Manager (CISM) Practice Exams.
The results demonstrate that conversational agents, supported by LLM and integrated with XAI, provide robust, explainable, and actionable AI solutions in intrusion detection.
arXiv Detail & Related papers (2023-09-27T20:58:13Z)
- Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
It is challenging for existing methods to handle the scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
arXiv Detail & Related papers (2023-04-21T02:20:24Z)
- Semantic Novelty Detection via Relational Reasoning [17.660958043781154]
We propose a novel representation learning paradigm based on relational reasoning.
Our experiments show that this knowledge is directly transferable to a wide range of scenarios.
It can be exploited as a plug-and-play module to convert closed-set recognition models into reliable open-set ones.
arXiv Detail & Related papers (2022-07-18T15:49:27Z)
- Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting [77.69172089359606]
We study multi-modal few-shot object detection (FSOD) in this paper, using both few-shot visual examples and class semantic information for detection.
Our approach is motivated by the high-level conceptual similarity of (metric-based) meta-learning and prompt-based learning.
We comprehensively evaluate the proposed multi-modal FSOD models on multiple few-shot object detection benchmarks, achieving promising results.
arXiv Detail & Related papers (2022-04-16T16:45:06Z)
- Exploiting Multi-Object Relationships for Detecting Adversarial Attacks in Complex Scenes [51.65308857232767]
Vision systems that deploy Deep Neural Networks (DNNs) are known to be vulnerable to adversarial examples.
Recent research has shown that checking the intrinsic consistencies in the input data is a promising way to detect adversarial attacks.
We develop a novel approach to perform context consistency checks using language models.
arXiv Detail & Related papers (2021-08-19T00:52:10Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.