Related papers: Can We Trust AI Agents? An Experimental Study Towards Trustworthy LLM-Based Multi-Agent Systems for AI Ethics

Can We Trust AI Agents? An Experimental Study Towards Trustworthy LLM-Based Multi-Agent Systems for AI Ethics

URL: http://arxiv.org/abs/2411.08881v1
Date: Fri, 25 Oct 2024 20:17:59 GMT
Title: Can We Trust AI Agents? An Experimental Study Towards Trustworthy LLM-Based Multi-Agent Systems for AI Ethics
Authors: José Antonio Siqueira de Cerqueira, Mamia Agbese, Rebekah Rousi, Nannan Xi, Juho Hamari, Pekka Abrahamsson,
Abstract summary: This study examines how trustworthiness-enhancing techniques affect ethical AI output generation. We design the prototype LLM-BMAS, where agents engage in structured discussions on real-world ethical AI issues. Discussions reveal terms like bias detection, transparency, accountability, user consent, compliance, fairness evaluation, and EU AI Act compliance.
Score: 10.084913433923566
License: http://creativecommons.org/licenses/by/4.0/
Abstract: AI-based systems, including Large Language Models (LLMs), impact millions by supporting diverse tasks but face issues like misinformation, bias, and misuse. Ethical AI development is crucial as new technologies and concerns emerge, but objective, practical ethical guidance remains debated. This study examines LLMs in developing ethical AI systems, assessing how trustworthiness-enhancing techniques affect ethical AI output generation. Using the Design Science Research (DSR) method, we identify techniques for LLM trustworthiness: multi-agents, distinct roles, structured communication, and multiple rounds of debate. We design the multi-agent prototype LLM-BMAS, where agents engage in structured discussions on real-world ethical AI issues from the AI Incident Database. The prototype's performance is evaluated through thematic analysis, hierarchical clustering, ablation studies, and source code execution. Our system generates around 2,000 lines per run, compared to only 80 lines in the ablation study. Discussions reveal terms like bias detection, transparency, accountability, user consent, GDPR compliance, fairness evaluation, and EU AI Act compliance, showing LLM-BMAS's ability to generate thorough source code and documentation addressing often-overlooked ethical AI issues. However, practical challenges in source code integration and dependency management may limit smooth system adoption by practitioners. This study aims to shed light on enhancing trustworthiness in LLMs to support practitioners in developing ethical AI-based systems.

Related papers

Introspection of Thought Helps AI Agents [19.04968632268433]
Large Language Models (LLMs) and Multimodal-LLMs (MLLMs) play the most critical role and determine the initial ability and limitations of AI Agents.<n>We propose a novel AI Agent Reasoning Framework with Introspection of Thought (INoT) by designing a new LLM-Read code in prompt.<n>The effectiveness of INoT is verified, with an average improvement of 7.95% in performance, exceeding the baselines.
arXiv Detail & Related papers (2025-07-11T15:03:17Z)
The AI Imperative: Scaling High-Quality Peer Review in Machine Learning [49.87236114682497]
We argue that AI-assisted peer review must become an urgent research and infrastructure priority.<n>We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting ACs in decision-making.
arXiv Detail & Related papers (2025-06-09T18:37:14Z)
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios [51.46347732659174]
Large Language Models (LLMs) have demonstrated advanced capabilities in real-world agentic applications.<n>AgentIF is the first benchmark for systematically evaluating LLM instruction following ability in agentic scenarios.
arXiv Detail & Related papers (2025-05-22T17:31:10Z)
Evolution of AI in Education: Agentic Workflows [2.1681971652284857]
Artificial intelligence (AI) has transformed various aspects of education. Large language models (LLMs) are driving advancements in automated tutoring, assessment, and content generation. To address these limitations and foster more sustainable technological practices, AI agents have emerged as a promising new avenue for educational innovation.
arXiv Detail & Related papers (2025-04-25T13:44:57Z)
Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents [61.132523071109354]
This paper investigates the interplay between AI developers, regulators and users, modelling their strategic choices under different regulatory scenarios. Our research identifies emerging behaviours of strategic AI agents, which tend to adopt more "pessimistic" stances than pure game-theoretic agents.
arXiv Detail & Related papers (2025-04-11T15:41:21Z)
Leveraging LLMs for User Stories in AI Systems: UStAI Dataset [0.38233569758620056]
Large Language Models (LLMs) are emerging as a promising alternative to human-generated text.<n>This paper investigates the potential use of LLMs to generate user stories for AI systems based on abstracts from scholarly papers.<n>Our analysis demonstrates that the investigated LLMs can generate user stories inspired by the needs of various stakeholders.
arXiv Detail & Related papers (2025-04-01T08:03:40Z)
Media and responsible AI governance: a game-theoretic and LLM analysis [61.132523071109354]
This paper investigates the interplay between AI developers, regulators, users, and the media in fostering trustworthy AI systems. Using evolutionary game theory and large language models (LLMs), we model the strategic interactions among these actors under different regulatory regimes.
arXiv Detail & Related papers (2025-03-12T21:39:38Z)
Persuasion with Large Language Models: a Survey [49.86930318312291]
Large Language Models (LLMs) have created new disruptive possibilities for persuasive communication. In areas such as politics, marketing, public health, e-commerce, and charitable giving, such LLM Systems have already achieved human-level or even super-human persuasiveness. Our survey suggests that the current and future potential of LLM-based persuasion poses profound ethical and societal risks.
arXiv Detail & Related papers (2024-11-11T10:05:52Z)
Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks [55.2480439325792]
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act) Uses insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence. As we apply these concepts to the EU AI Act, we uncover potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z)
Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents [55.63497537202751]
Article explores the convergence of connectionist and symbolic artificial intelligence (AI) Traditionally, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic. Recent advancements in large language models (LLMs) highlight the potential of connectionist architectures in handling human language as a form of symbols.
arXiv Detail & Related papers (2024-07-11T14:00:53Z)
Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models [1.7466076090043157]
Large Language Models (LLMs) could transform many fields, but their fast development creates significant challenges for oversight, ethical creation, and building user trust. This comprehensive review looks at key trust issues in LLMs, such as unintended harms, lack of transparency, vulnerability to attacks, alignment with human values, and environmental impact. To tackle these issues, we suggest combining ethical oversight, industry accountability, regulation, and public involvement.
arXiv Detail & Related papers (2024-06-01T14:47:58Z)
Navigating LLM Ethics: Advancements, Challenges, and Future Directions [5.023563968303034]
This study addresses ethical issues surrounding Large Language Models (LLMs) within the field of artificial intelligence. It explores the common ethical challenges posed by both LLMs and other AI systems. It highlights challenges such as hallucination, verifiable accountability, and decoding censorship complexity.
arXiv Detail & Related papers (2024-05-14T15:03:05Z)
POLARIS: A framework to guide the development of Trustworthy AI systems [3.02243271391691]
There is a significant gap between high-level AI ethics principles and low-level concrete practices for AI professionals. We develop a novel holistic framework for Trustworthy AI - designed to bridge the gap between theory and practice. Our goal is to empower AI professionals to confidently navigate the ethical dimensions of Trustworthy AI.
arXiv Detail & Related papers (2024-02-08T01:05:16Z)
A Preliminary Study on Using Large Language Models in Software Pentesting [2.0551676463612636]
Large language models (LLM) are perceived to offer promising potentials for automating security tasks. We investigate the use of LLMs in software pentesting, where the main task is to automatically identify software security vulnerabilities in source code.
arXiv Detail & Related papers (2024-01-30T21:42:59Z)
Investigating Responsible AI for Scientific Research: An Empirical Study [4.597781832707524]
The push for Responsible AI (RAI) in such institutions underscores the increasing emphasis on integrating ethical considerations within AI design and development. This paper aims to assess the awareness and preparedness regarding the ethical risks inherent in AI design and development. Our results have revealed certain knowledge gaps concerning ethical, responsible, and inclusive AI, with limitations in awareness of the available AI ethics frameworks.
arXiv Detail & Related papers (2023-12-15T06:40:27Z)
Hybrid Approaches for Moral Value Alignment in AI Agents: a Manifesto [3.7414804164475983]
Increasing interest in ensuring the safety of next-generation Artificial Intelligence (AI) systems calls for novel approaches to embedding morality into autonomous agents. We provide a systematization of existing approaches to the problem of introducing morality in machines - modelled as a continuum. We argue that more hybrid solutions are needed to create adaptable and robust, yet controllable and interpretable agentic systems.
arXiv Detail & Related papers (2023-12-04T11:46:34Z)
The Rise and Potential of Large Language Model Based Agents: A Survey [91.71061158000953]
Large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI) We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents. We explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
arXiv Detail & Related papers (2023-09-14T17:12:03Z)
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision [84.31474052176343]
Recent AI-assistant agents, such as ChatGPT, rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback to align the output with human intentions. This dependence can significantly constrain the true potential of AI-assistant agents due to the high cost of obtaining human supervision. We propose a novel approach called SELF-ALIGN, which combines principle-driven reasoning and the generative power of LLMs for the self-alignment of AI agents with minimal human supervision.
arXiv Detail & Related papers (2023-05-04T17:59:28Z)
Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach. Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes. We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
An interdisciplinary conceptual study of Artificial Intelligence (AI) for helping benefit-risk assessment practices: Towards a comprehensive qualification matrix of AI programs and devices (pre-print 2020) [55.41644538483948]
This paper proposes a comprehensive analysis of existing concepts coming from different disciplines tackling the notion of intelligence. The aim is to identify shared notions or discrepancies to consider for qualifying AI systems.
arXiv Detail & Related papers (2021-05-07T12:01:31Z)
Multisource AI Scorecard Table for System Evaluation [3.74397577716445]
The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist. The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system.
arXiv Detail & Related papers (2021-02-08T03:37:40Z)
Trustworthy AI in the Age of Pervasive Computing and Big Data [22.92621391190282]
We formalise the requirements of trustworthy AI systems through an ethics perspective. After discussing the state of research and the remaining challenges, we show how a concrete use-case in smart cities can benefit from these methods.
arXiv Detail & Related papers (2020-01-30T08:09:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.