Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance
- URL: http://arxiv.org/abs/2407.07950v1
- Date: Wed, 10 Jul 2024 18:00:05 GMT
- Title: Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance
- Authors: Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Nouha Dziri, Dan Jurafsky, Maarten Sap
- Abstract summary: Reliance is influenced by numerous factors within the interactional context of a generation.
We introduce Rel-A.I., an in situ, system-level evaluation approach to measure reliance.
- Score: 73.19687314438133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The reconfiguration of human-LM interactions from simple sentence completions to complex, multi-domain, humanlike engagements necessitates new methodologies to understand how humans choose to rely on LMs. In our work, we contend that reliance is influenced by numerous factors within the interactional context of a generation, a departure from prior work that used verbalized confidence (e.g., "I'm certain the answer is...") as the key determinant of reliance. Here, we introduce Rel-A.I., an in situ, system-level evaluation approach to measure human reliance on LM-generated epistemic markers (e.g., "I think it's..", "Undoubtedly it's..."). Using this methodology, we measure reliance rates in three emergent human-LM interaction settings: long-term interactions, anthropomorphic generations, and variable subject matter. Our findings reveal that reliance is not solely based on verbalized confidence but is significantly affected by other features of the interaction context. Prior interactions, anthropomorphic cues, and subject domain all contribute to reliance variability. An expression such as, "I'm pretty sure it's...", can vary up to 20% in reliance frequency depending on its interactional context. Our work underscores the importance of context in understanding human reliance and offers future designers and researchers a methodology to conduct such measurements.
Related papers
- AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios.
However, their capability in handling complex, multi-character social interactions has yet to be fully explored.
We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z)
- LEMON: Learning 3D Human-Object Interaction Relation from 2D Images [56.6123961391372]
Learning 3D human-object interaction relation is pivotal to embodied AI and interaction modeling.
Most existing methods approach the goal by learning to predict isolated interaction elements.
We present LEMON, a unified model that mines interaction intentions of the counterparts and employs curvatures to guide the extraction of geometric correlations.
arXiv Detail & Related papers (2023-12-14T14:10:57Z)
- Common (good) practices measuring trust in HRI [55.2480439325792]
Trust in robots is widely believed to be imperative for the adoption of robots into people's daily lives.
Researchers have been exploring how people trust robots in different ways.
Most roboticists agree that insufficient levels of trust lead to a risk of disengagement.
arXiv Detail & Related papers (2023-11-20T20:52:10Z)
- Evaluating Human-Language Model Interaction [79.33022878034627]
We develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems.
We design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation.
We find that better non-interactive performance does not always translate to better human-LM interaction.
arXiv Detail & Related papers (2022-12-19T18:59:45Z)
- BOSS: A Benchmark for Human Belief Prediction in Object-context Scenarios [14.23697277904244]
This paper uses the combined knowledge of Theory of Mind (ToM) and Object-Context Relations to investigate methods for enhancing collaboration between humans and autonomous systems.
We propose a novel and challenging multimodal video dataset for assessing the capability of artificial intelligence (AI) systems in predicting human belief states in an object-context scenario.
arXiv Detail & Related papers (2022-06-21T18:29:17Z)
- Self-Selective Context for Interaction Recognition [27.866495303658404]
We propose Self-Selective Context (SSC) for human-object interaction recognition.
SSC operates on the joint appearance of human-objects and context to bring the most discriminative context into play for recognition.
Our experiments show that SSC leads to an important increase in interaction recognition performance, while using far fewer parameters.
arXiv Detail & Related papers (2020-10-17T09:06:12Z)
- Interactions in information spread: quantification and interpretation using stochastic block models [3.5450828190071655]
In social networks, users' behavior results from the people they interact with, news in their feed, or trending topics.
Here, we propose a new model, the Interactive Mixed Membership Stochastic Block Model (IMMSBM), which investigates the role of interactions between entities.
In inference tasks, taking them into account leads to average relative changes with respect to non-interactive models of up to 150% in the probability of an outcome.
arXiv Detail & Related papers (2020-04-09T14:22:10Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)