Unmasking the Shadows of AI: Investigating Deceptive Capabilities in Large Language Models
- URL: http://arxiv.org/abs/2403.09676v1
- Date: Wed, 7 Feb 2024 00:21:46 GMT
- Title: Unmasking the Shadows of AI: Investigating Deceptive Capabilities in Large Language Models
- Authors: Linge Guo
- Abstract summary: This research critically navigates the intricate landscape of AI deception, concentrating on the deceptive behaviours of Large Language Models (LLMs).
My objective is to elucidate this issue, examine the discourse surrounding it, and subsequently delve into its categorization and ramifications.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This research critically navigates the intricate landscape of AI deception, concentrating on the deceptive behaviours of Large Language Models (LLMs). My objective is to elucidate this issue, examine the discourse surrounding it, and subsequently delve into its categorisation and ramifications. The essay opens with an evaluation of the AI Safety Summit 2023 and an introduction to LLMs, emphasising the multidimensional biases that underlie their deceptive behaviours. The literature review covers four categories of deception: Strategic Deception, Imitation, Sycophancy, and Unfaithful Reasoning, along with the social implications and risks they entail. Lastly, I take an evaluative stance on various aspects of navigating the persistent challenges of deceptive AI, encompassing international collaborative governance, the reconfigured engagement of individuals with AI, proposals for practical adjustments, and specific elements of digital education.
Related papers
- OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI [73.75520820608232]
We introduce OlympicArena, which includes 11,163 bilingual problems across both text-only and interleaved text-image modalities.
These challenges encompass a wide range of disciplines spanning seven fields and 62 international Olympic competitions, rigorously examined for data leakage.
Our evaluations reveal that even advanced models like GPT-4o only achieve a 39.97% overall accuracy, illustrating current AI limitations in complex reasoning and multimodal integration.
arXiv Detail & Related papers (2024-06-18T16:20:53Z)
- Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions [101.67121669727354]
Recent advancements in AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment.
The lack of clear definitions and scope for human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment.
We present a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), and Machine Learning (ML).
arXiv Detail & Related papers (2024-06-13T16:03:25Z)
- Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making [47.33241893184721]
In AI-assisted decision-making, humans often passively review the AI's suggestion and decide whether to accept or reject it as a whole.
We propose Human-AI Deliberation, a novel framework to promote human reflection and discussion on conflicting human-AI opinions in decision-making.
Based on theories in human deliberation, this framework engages humans and AI in dimension-level opinion elicitation, deliberative discussion, and decision updates.
arXiv Detail & Related papers (2024-03-25T14:34:06Z)
- Responsible AI Considerations in Text Summarization Research: A Review of Current Practices [89.85174013619883]
We focus on text summarization, a common NLP task largely overlooked by the responsible AI community.
We conduct a multi-round qualitative analysis of 333 summarization papers from the ACL Anthology published between 2020 and 2022.
We focus on how, which, and when responsible AI issues are covered, which relevant stakeholders are considered, and mismatches between stated and realized research goals.
arXiv Detail & Related papers (2023-11-18T15:35:36Z)
- The Robust Semantic Segmentation UNCV2023 Challenge Results [99.97867942388486]
This paper outlines the winning solutions employed in addressing the MUAD uncertainty quantification challenge held at ICCV 2023.
The challenge was centered around semantic segmentation in urban environments, with a particular focus on natural adversarial scenarios.
The report presents the results of 19 submitted entries, with numerous techniques drawing inspiration from cutting-edge uncertainty quantification methodologies.
arXiv Detail & Related papers (2023-09-27T08:20:03Z)
- Ethical Framework for Harnessing the Power of AI in Healthcare and Beyond [0.0]
This comprehensive research article rigorously investigates the ethical dimensions intricately linked to the rapid evolution of AI technologies.
Central to this article is the proposition of a conscientious AI framework, meticulously crafted to accentuate values of transparency, equity, answerability, and a human-centric orientation.
The article unequivocally accentuates the pressing need for globally standardized AI ethics principles and frameworks.
arXiv Detail & Related papers (2023-08-31T18:12:12Z)
- Is AI Changing the Rules of Academic Misconduct? An In-depth Look at Students' Perceptions of 'AI-giarism' [0.0]
This study explores students' perceptions of AI-giarism, an emergent form of academic dishonesty involving AI and plagiarism.
The findings portray a complex landscape of understanding, with clear disapproval for direct AI content generation.
The study provides pivotal insights for academia, policy-making, and the broader integration of AI technology in education.
arXiv Detail & Related papers (2023-06-06T02:22:08Z)
- Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation [61.77881142275982]
This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them.
We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives.
We identify and propose the roles AI regulation should play to make the AI Act a success with respect to AI fairness concerns.
arXiv Detail & Related papers (2022-06-08T12:32:08Z)
- Artificial Intelligence Narratives: An Objective Perspective on Current Developments [0.0]
This work provides a starting point for researchers interested in gaining a deeper understanding of the big picture of artificial intelligence (AI).
An essential takeaway for the reader is that AI must be understood as an umbrella term encompassing a plethora of different methods, schools of thought, and their respective historical movements.
arXiv Detail & Related papers (2021-03-18T17:33:00Z)
- Transdisciplinary AI Observatory -- Retrospective Analyses and Future-Oriented Contradistinctions [22.968817032490996]
This paper motivates the need for an inherently transdisciplinary AI observatory approach.
Building on these AI observatory tools, we present near-term transdisciplinary guidelines for AI safety.
arXiv Detail & Related papers (2020-11-26T16:01:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.