Persuasion and Safety in the Era of Generative AI
- URL: http://arxiv.org/abs/2505.12248v1
- Date: Sun, 18 May 2025 06:04:46 GMT
- Title: Persuasion and Safety in the Era of Generative AI
- Authors: Haein Kong
- Abstract summary: The EU AI Act prohibits AI systems that use manipulative or deceptive techniques to undermine informed decision-making. My dissertation addresses the lack of empirical studies in this area by developing a taxonomy of persuasive techniques. It provides resources to mitigate the risks of persuasive AI and fosters discussions on ethical persuasion in the age of generative AI.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As large language models (LLMs) achieve advanced persuasive capabilities, concerns about their potential risks have grown. The EU AI Act prohibits AI systems that use manipulative or deceptive techniques to undermine informed decision-making, highlighting the need to distinguish between rational persuasion, which engages reason, and manipulation, which exploits cognitive biases. My dissertation addresses the lack of empirical studies in this area by developing a taxonomy of persuasive techniques, creating a human-annotated dataset, and evaluating LLMs' ability to distinguish between these methods. This work contributes to AI safety by providing resources to mitigate the risks of persuasive AI and fostering discussions on ethical persuasion in the age of generative AI.
Related papers
- Must Read: A Systematic Survey of Computational Persuasion [60.83151988635103]
AI-driven persuasion can be leveraged for beneficial applications, but also poses threats through manipulation and unethical influence. Our survey outlines future research directions to enhance the safety, fairness, and effectiveness of AI-powered persuasion.
arXiv Detail & Related papers (2025-05-12T17:26:31Z)
- How Performance Pressure Influences AI-Assisted Decision Making [57.53469908423318]
We show how pressure and explainable AI (XAI) techniques interact with AI advice-taking behavior. Our results show complex interaction effects, with different combinations of pressure and XAI techniques either improving or worsening AI advice-taking behavior.
arXiv Detail & Related papers (2024-10-21T22:39:52Z)
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI [19.675489660806942]
Generative AI presents a new risk profile of persuasion due to reciprocal exchange and prolonged interactions.
This has led to growing concerns about harms from AI persuasion and how they can be mitigated.
Existing harm mitigation approaches prioritise harms from the outcome of persuasion over harms from the process of persuasion.
arXiv Detail & Related papers (2024-04-23T14:07:20Z)
- Assigning AI: Seven Approaches for Students, with Prompts [0.0]
This paper examines the transformative role of Large Language Models (LLMs) in education and their potential as learning tools.
The authors propose seven approaches for utilizing AI in classrooms: AI-tutor, AI-coach, AI-mentor, AI-teammate, AI-tool, AI-simulator, and AI-student.
arXiv Detail & Related papers (2023-06-13T03:36:36Z)
- Artificial Influence: An Analysis Of AI-Driven Persuasion [0.0]
We warn that ubiquitous, highly persuasive AI systems could alter our information environment so significantly as to contribute to a loss of human control of our own future.
We conclude that none of these solutions will be airtight, and that individuals and governments will need to take active steps to guard against the most pernicious effects of persuasive AI.
arXiv Detail & Related papers (2023-03-15T16:05:11Z)
- The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
- Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z)
- Expose Uncertainty, Instill Distrust, Avoid Explanations: Towards Ethical Guidelines for AI [3.0534660670547864]
I argue that the best way to help humans using AI technology is to make them aware of the intrinsic limitations and problems of AI algorithms.
I suggest three ethical guidelines to be used in the presentation of results.
arXiv Detail & Related papers (2021-11-29T14:53:35Z)
- Building Bridges: Generative Artworks to Explore AI Ethics [56.058588908294446]
In recent years, there has been an increased emphasis on understanding and mitigating adverse impacts of artificial intelligence (AI) technologies on society.
A significant challenge in the design of ethical AI systems is that there are multiple stakeholders in the AI pipeline, each with their own set of constraints and interests.
This position paper outlines some potential ways in which generative artworks can play this role by serving as accessible and powerful educational tools.
arXiv Detail & Related papers (2021-06-25T22:31:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.