Towards interactive evaluations for interaction harms in human-AI systems
- URL: http://arxiv.org/abs/2405.10632v6
- Date: Sun, 27 Apr 2025 15:44:44 GMT
- Title: Towards interactive evaluations for interaction harms in human-AI systems
- Authors: Lujain Ibrahim, Saffron Huang, Umang Bhatt, Lama Ahmad, Markus Anderljung
- Abstract summary: We argue for a paradigm shift toward evaluation centered on *interactional ethics*. We propose principles for evaluating generative models through interaction scenarios and human impact metrics.
- Score: 8.989911701384788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current AI evaluation paradigms that rely on static, model-only tests fail to capture harms that emerge through sustained human-AI interaction. As interactive AI systems, such as AI companions, proliferate in daily life, this mismatch between evaluation methods and real-world use becomes increasingly consequential. We argue for a paradigm shift toward evaluation centered on *interactional ethics*, which addresses risks like inappropriate human-AI relationships, social manipulation, and cognitive overreliance that develop through repeated interaction rather than single outputs. Drawing on human-computer interaction, natural language processing, and the social sciences, we propose principles for evaluating generative models through interaction scenarios and human impact metrics. We conclude by examining implementation challenges and open research questions for researchers, practitioners, and regulators integrating these approaches into AI governance frameworks.
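The shift from static, single-output tests to interaction-level evaluation can be made concrete with a small harness. The sketch below is illustrative only, not the authors' protocol: a scripted user simulator plays a multi-turn scenario against a model under test, and a human-impact metric is computed over the whole transcript rather than any single response. All function names and the toy overreliance metric are hypothetical.

```python
# Minimal sketch of an interaction-level evaluation loop (illustrative only).
# `model_respond` and `simulate_user_turn` are hypothetical stand-ins for a
# model under test and a scripted user simulator; the metric is a toy proxy.
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)

def run_scenario(model_respond: Callable[[List[Turn]], str],
                 simulate_user_turn: Callable[[List[Turn]], str],
                 n_turns: int = 10) -> List[Turn]:
    """Play a multi-turn scenario and return the full transcript."""
    transcript: List[Turn] = []
    for _ in range(n_turns):
        transcript.append(("user", simulate_user_turn(transcript)))
        transcript.append(("model", model_respond(transcript)))
    return transcript

def overreliance_proxy(transcript: List[Turn]) -> float:
    """Toy human-impact metric: fraction of model turns that issue a direct
    directive instead of deferring to the user's judgment. Real metrics
    would come from validated instruments, not keyword matching."""
    model_turns = [u for s, u in transcript if s == "model"]
    directive = [u for u in model_turns if "you should" in u.lower()]
    return len(directive) / max(len(model_turns), 1)

if __name__ == "__main__":
    # Trivial stand-ins so the sketch runs end to end.
    canned_user = lambda t: "What should I do about my problem?"
    canned_model = lambda t: "You should follow my advice without question."
    transcript = run_scenario(canned_model, canned_user, n_turns=3)
    print(f"overreliance proxy: {overreliance_proxy(transcript):.2f}")
```

The point of the pattern is that the unit of evaluation is the transcript, not the output: the same model response can be benign in isolation and harmful as the tenth repetition in a dependency-forming exchange.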
Related papers
- Confirmation Bias in Generative AI Chatbots: Mechanisms, Risks, Mitigation Strategies, and Future Research Directions [0.0]
The article analyzes the mechanisms by which confirmation bias may manifest in generative AI chatbots.
It assesses the ethical and practical risks associated with such bias, and proposes a range of mitigation strategies.
The article concludes by emphasizing the need for interdisciplinary collaboration and empirical evaluation to better understand and address confirmation bias in generative AI systems.
arXiv Detail & Related papers (2025-04-12T21:08:36Z) - The Human Robot Social Interaction (HSRI) Dataset: Benchmarking Foundational Models' Social Reasoning [49.32390524168273]
Our work aims to advance the social reasoning of embodied artificial intelligence (AI) agents in real-world social interactions.
We introduce a large-scale real-world Human Robot Social Interaction (HSRI) dataset to benchmark the capabilities of language models (LMs) and foundational models (FMs).
Our dataset consists of 400 real-world human social robot interaction videos and over 10K annotations, detailing the robot's social errors, competencies, rationale, and corrective actions.
arXiv Detail & Related papers (2025-04-07T06:27:02Z) - Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions [101.67121669727354]
Recent advancements in AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment.
The lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment.
We introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), and Machine Learning (ML).
arXiv Detail & Related papers (2024-06-13T16:03:25Z) - ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models [53.00812898384698]
We argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking.
We highlight how cognitive biases can lead evaluators to conflate fluency with truthfulness, and how cognitive uncertainty affects the reliability of rating scores such as Likert ratings.
We propose the ConSiDERS-The-Human evaluation framework consisting of 6 pillars -- Consistency, Scoring Criteria, Differentiating, User Experience, Responsible, and Scalability.
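One way to make such a framework actionable is to encode the pillars as a structured scoring record for each evaluation study. The sketch below is a hypothetical encoding, not code from the paper; the per-pillar scores and the unweighted mean are illustrative choices.

```python
# Hypothetical encoding of the six ConSiDERS-The-Human pillars as a rubric
# record; the scores and aggregation below are illustrative, not the paper's.
from dataclasses import dataclass, fields

@dataclass
class ConsidersRubric:
    consistency: float        # inter/intra-annotator agreement achieved
    scoring_criteria: float   # clarity and coverage of the rating guidelines
    differentiating: float    # ability of the test set to separate systems
    user_experience: float    # burden and usability of the annotation task
    responsible: float        # ethical safeguards for annotators and subjects
    scalability: float        # cost of extending the study to more systems

    def mean_score(self) -> float:
        vals = [getattr(self, f.name) for f in fields(self)]
        return sum(vals) / len(vals)

print(ConsidersRubric(0.8, 0.9, 0.6, 0.7, 1.0, 0.5).mean_score())
```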
arXiv Detail & Related papers (2024-05-28T22:45:28Z) - AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios.
However, their capability in handling complex, multi-character social interactions has yet to be fully explored.
We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z) - Human-AI collaboration is not very collaborative yet: A taxonomy of interaction patterns in AI-assisted decision making from a systematic review [6.013543974938446]
Research on leveraging Artificial Intelligence in decision support systems has disproportionately focused on technological advancements.
A human-centered perspective attempts to alleviate this concern by designing AI solutions for seamless integration with existing processes.
arXiv Detail & Related papers (2023-10-30T17:46:38Z) - Sociotechnical Safety Evaluation of Generative AI Systems [13.546708226350963]
Generative AI systems produce a range of risks that must be evaluated to ensure these systems are safe.
We propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks.
arXiv Detail & Related papers (2023-10-18T14:13:58Z) - It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation [15.8765167340819]
Human annotator simulation (HAS) serves as a cost-effective substitute for human evaluation such as data annotation and system assessment.
Human perception and behaviour during human evaluation exhibit inherent variability due to diverse cognitive processes and subjective interpretations.
This paper introduces a novel meta-learning framework that treats HAS as a zero-shot density estimation problem.
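The underlying idea, modeling the distribution of human ratings rather than a single gold label, can be illustrated with a simple density fit. The sketch below uses a Gaussian KDE over made-up Likert ratings as a stand-in; the paper's actual method is a meta-learned zero-shot density estimator, which this does not reproduce.

```python
# Illustration of treating annotator simulation as density estimation:
# fit a density over observed human ratings, then sample synthetic
# "annotators" from it. A KDE stand-in, not the paper's meta-learning model.
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical Likert-scale ratings (1-5) from real annotators for one item.
human_ratings = np.array([3, 4, 4, 5, 3, 4, 2, 4, 5, 3], dtype=float)

kde = gaussian_kde(human_ratings, bw_method=0.5)
simulated = kde.resample(1000, seed=0).ravel().clip(1, 5)

print(f"human mean/std:     {human_ratings.mean():.2f} / {human_ratings.std():.2f}")
print(f"simulated mean/std: {simulated.mean():.2f} / {simulated.std():.2f}")
```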
arXiv Detail & Related papers (2023-09-30T20:54:59Z) - Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors [26.003947740875482]
This paper focuses on assessing the human-likeness of the robot as the primary evaluation metric.
Our approach aims to evaluate the robot's human-likeness indirectly, through observable user behaviors, thus enhancing objectivity.
arXiv Detail & Related papers (2023-08-21T20:21:07Z) - Human-AI Coevolution [48.74579595505374]
Human-AI coevolution is a process in which humans and AI algorithms continuously influence each other.
This paper introduces human-AI coevolution as the cornerstone of a new field of study at the intersection of AI and complexity science.
arXiv Detail & Related papers (2023-06-23T18:10:54Z) - Evaluating Human-Language Model Interaction [79.33022878034627]
We develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems.
We design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation.
We find that better non-interactive performance does not always translate to better human-LM interaction.
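That finding is straightforward to test once both kinds of scores exist: rank systems by each and check the rank correlation. The sketch below uses fabricated scores purely to show the analysis pattern.

```python
# Illustrative check of how well non-interactive benchmark scores predict
# interactive quality. All scores below are fabricated for the example.
from scipy.stats import spearmanr

systems = ["lm_a", "lm_b", "lm_c", "lm_d", "lm_e"]
benchmark = [0.71, 0.68, 0.80, 0.62, 0.75]   # static, model-only accuracy
interactive = [3.9, 4.2, 3.1, 4.0, 3.6]      # mean user rating after use

rho, p = spearmanr(benchmark, interactive)
print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")
# A weak or negative rho means a static leaderboard can invert the ordering
# users actually experience, which is HALIE's motivating observation.
```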
arXiv Detail & Related papers (2022-12-19T18:59:45Z) - Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation [136.16507050034755]
Existing human evaluation studies for summarization either exhibit a low inter-annotator agreement or have insufficient scale.
We propose a modified summarization salience protocol, Atomic Content Units (ACUs), which is based on fine-grained semantic units.
We curate the Robust Summarization Evaluation (RoSE) benchmark, a large human evaluation dataset consisting of 22,000 summary-level annotations over 28 top-performing systems.
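The protocol reduces summary judgment to checking small facts: annotators mark which atomic content units from the reference summaries a system summary conveys, and the summary's basic ACU score is the matched fraction. A minimal sketch of that scoring step follows; the units themselves are hypothetical (RoSE's real ACUs are human-written).

```python
# Minimal sketch of ACU-style scoring: given the atomic content units
# (ACUs) extracted from reference summaries and annotator judgments of
# which units a system summary conveys, score = matched / total.
# The ACUs below are hypothetical examples.
reference_acus = [
    "a storm hit the coast",
    "two people were injured",
    "power was out for a day",
    "schools closed on Monday",
]
# Annotator judgment: which ACUs this system summary conveys.
matched_acus = {"a storm hit the coast", "schools closed on Monday"}

acu_score = sum(u in matched_acus for u in reference_acus) / len(reference_acus)
print(f"ACU score: {acu_score:.2f}")  # 0.50
```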
arXiv Detail & Related papers (2022-12-15T17:26:05Z) - Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
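The automated side of such comparisons is typically a coherence metric such as NPMI over each topic's top words, computed from co-occurrence statistics in a reference corpus; the paper's point is that these scores can disagree with human judgments. A self-contained sketch over a toy corpus (all data made up):

```python
# Self-contained NPMI topic-coherence sketch over a toy corpus. NPMI is a
# standard automated coherence metric; the corpus and topic are made up.
import math
from itertools import combinations

docs = [
    {"model", "topic", "word", "corpus"},
    {"model", "training", "loss", "word"},
    {"topic", "word", "coherence", "corpus"},
    {"loss", "training", "gradient"},
]
topic_top_words = ["model", "topic", "word"]

def npmi(w1: str, w2: str, docs, eps: float = 1e-12) -> float:
    """Normalized pointwise mutual information from document co-occurrence."""
    n = len(docs)
    p1 = sum(w1 in d for d in docs) / n
    p2 = sum(w2 in d for d in docs) / n
    p12 = sum(w1 in d and w2 in d for d in docs) / n
    if p12 == 0:
        return -1.0  # never co-occur: minimum NPMI
    return math.log(p12 / (p1 * p2)) / -math.log(p12 + eps)

pairs = list(combinations(topic_top_words, 2))
coherence = sum(npmi(a, b, docs) for a, b in pairs) / len(pairs)
print(f"topic NPMI coherence: {coherence:.3f}")
```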
arXiv Detail & Related papers (2022-10-28T14:38:50Z) - A Mental-Model Centric Landscape of Human-AI Symbiosis [31.14516396625931]
We introduce a significantly more general version of the human-aware AI interaction scheme, called generalized human-aware interaction (GHAI).
We show how this new framework allows us to capture the various works done in the space of human-AI interaction and identify the fundamental behavioral patterns supported by these works.
arXiv Detail & Related papers (2022-02-18T22:08:08Z) - Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach [84.02388020258141]
We propose a new framework named ENIGMA for estimating human evaluation scores based on off-policy evaluation in reinforcement learning.
ENIGMA only requires a handful of pre-collected experience data, and therefore does not involve human interaction with the target policy during the evaluation.
Our experiments show that ENIGMA significantly outperforms existing methods in terms of correlation with human evaluation scores.
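For context, the textbook baseline in this setting is trajectory-level importance sampling over logged interactions. The sketch below shows only that baseline; it is not ENIGMA's estimator, which is model-free and avoids cumulative importance weights. The logged probabilities and rewards are hypothetical.

```python
# Textbook per-trajectory importance-sampling OPE estimator, shown only to
# illustrate the off-policy setting; this is NOT ENIGMA's actual method.
from typing import List, Tuple

# A logged trajectory: [(prob_under_behavior, prob_under_target, reward), ...]
Trajectory = List[Tuple[float, float, float]]

def is_estimate(trajectories: List[Trajectory]) -> float:
    """Estimate the target policy's value from behavior-policy logs."""
    total = 0.0
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for p_behavior, p_target, reward in traj:
            weight *= p_target / p_behavior  # cumulative importance ratio
            ret += reward
        total += weight * ret
    return total / len(trajectories)

# Hypothetical logs: two short dialogues with final human-style scores.
logs = [
    [(0.5, 0.7, 0.0), (0.4, 0.6, 0.8)],
    [(0.6, 0.3, 0.0), (0.5, 0.5, 0.4)],
]
print(f"estimated target-policy value: {is_estimate(logs):.3f}")
```

These cumulative ratios blow up in variance on long dialogues, which is precisely why estimators that sidestep them, like ENIGMA, are attractive for evaluating conversational policies offline.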
arXiv Detail & Related papers (2021-02-20T03:29:20Z) - Adversarial Interaction Attack: Fooling AI to Misinterpret Human Intentions [46.87576410532481]
We show that, despite their current huge success, deep learning based AI systems can be easily fooled by subtle adversarial noise.
Based on a case study of skeleton-based human interactions, we propose a novel adversarial attack on interactions.
Our study highlights potential risks in the interaction loop with AI and humans, which need to be carefully addressed when deploying AI systems in safety-critical applications.
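The core mechanism behind such attacks, a small gradient-sign perturbation of the input, fits in a few lines. The sketch below is plain FGSM on a toy classifier, not the paper's skeleton-specific interaction attack; the model and labels are placeholders.

```python
# Plain FGSM sketch of the gradient-sign perturbation idea behind such
# attacks; the paper's actual attack targets skeleton-based interaction
# recognition, which this toy linear classifier only stands in for.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(8, 3)          # toy stand-in for a recognizer
x = torch.randn(1, 8, requires_grad=True)
y = torch.tensor([2])                  # "true" interaction label

loss = F.cross_entropy(model(x), y)
loss.backward()

eps = 0.1
x_adv = x + eps * x.grad.sign()        # subtle, bounded perturbation

print("clean prediction:", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```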
arXiv Detail & Related papers (2021-01-17T16:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.