Human Control Is the Anchor, Not the Answer: Early Divergence of Oversight in Agentic AI Communities
- URL: http://arxiv.org/abs/2602.09286v1
- Date: Tue, 10 Feb 2026 00:10:20 GMT
- Title: Human Control Is the Anchor, Not the Answer: Early Divergence of Oversight in Agentic AI Communities
- Authors: Hanjing Shi, Dominic DiFranzo
- Abstract summary: Oversight for agentic AI is often discussed as a single goal ("human control"), yet early adoption may produce role-specific expectations. We present a comparative analysis of two newly active Reddit communities that reflect different socio-technical roles: r/OpenClaw (deployment and operations) and r/Moltbook (agent-centered social interaction). Across both communities, "human control" is an anchor term, but its operational meaning diverges: r/OpenClaw emphasizes execution guardrails and recovery (action-risk), while r/Moltbook emphasizes identity, legitimacy, and accountability in public interaction (meaning-risk).
- Score: 2.5424331328233207
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Oversight for agentic AI is often discussed as a single goal ("human control"), yet early adoption may produce role-specific expectations. We present a comparative analysis of two newly active Reddit communities in Jan--Feb 2026 that reflect different socio-technical roles: r/OpenClaw (deployment and operations) and r/Moltbook (agent-centered social interaction). We conceptualize this period as an early-stage crystallization phase, where oversight expectations form before norms reach equilibrium. Using topic modeling in a shared comparison space, a coarse-grained oversight-theme abstraction, engagement-weighted salience, and divergence tests, we show the communities are strongly separable (JSD = 0.418, cosine = 0.372, permutation $p=0.0005$). Across both communities, "human control" is an anchor term, but its operational meaning diverges: r/OpenClaw emphasizes execution guardrails and recovery (action-risk), while r/Moltbook emphasizes identity, legitimacy, and accountability in public interaction (meaning-risk). The resulting distinction offers a portable lens for designing and evaluating oversight mechanisms that match agent role, rather than applying one-size-fits-all control policies.
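The separability statistics quoted above (Jensen-Shannon divergence plus a label-permutation test) can be sketched generically. This is an illustrative reconstruction, not the paper's actual pipeline: the per-document topic distributions, the group-mean comparison, and the permutation count are all assumptions.

```python
import numpy as np

def jsd(p, q, eps=1e-12):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log2(a / b))  # KL divergence in bits
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def permutation_test(docs_a, docs_b, n_perm=1999, seed=0):
    """Shuffle community labels and recompute the divergence between
    group-mean topic distributions; the p-value is the fraction of
    permutations at least as extreme as the observed statistic."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([docs_a, docs_b])
    n_a = len(docs_a)
    observed = jsd(docs_a.mean(0), docs_b.mean(0))
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        a, b = pooled[perm[:n_a]], pooled[perm[n_a:]]
        if jsd(a.mean(0), b.mean(0)) >= observed:
            count += 1
    # add-one smoothing keeps the estimate strictly positive
    return observed, (count + 1) / (n_perm + 1)
```

A base-2 JSD is bounded in [0, 1], so the reported 0.418 sits well away from both identical and fully disjoint topic profiles.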
Related papers
- When Visibility Outpaces Verification: Delayed Verification and Narrative Lock-in in Agentic AI Discourse [2.5424331328233207]
Agentic AI systems, autonomous entities capable of independent planning and execution, reshape the landscape of human-AI trust. This paper investigates the interplay between social proof and verification timing in online discussions of agentic AI.
arXiv Detail & Related papers (2026-02-11T22:30:12Z) - When Agents See Humans as the Outgroup: Belief-Dependent Bias in LLM-Powered Agents [30.859825973762018]
This paper reveals that LLM-powered agents exhibit not only demographic bias (e.g., gender, religion) but also intergroup bias under minimal "us" versus "them" cues. When such group boundaries align with the agent-human divide, a new bias risk emerges: agents may treat other AI agents as the ingroup and humans as the outgroup.
arXiv Detail & Related papers (2026-01-01T07:18:36Z) - The Oversight Game: Learning to Cooperatively Balance an AI Agent's Safety and Autonomy [9.553819152637493]
We study a minimal control interface where an agent chooses whether to act autonomously (play) or defer (ask). If the agent defers, the human's choice determines the outcome, potentially leading to a corrective action or a system shutdown. Our analysis focuses on cases where this game qualifies as a Markov Potential Game (MPG), a class of games for which we can provide an alignment guarantee.
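The play-or-ask interface described above can be sketched as a single decision round. This is a hypothetical illustration, not the paper's learned MPG strategies: the confidence threshold and the policy it encodes are assumptions for the example only.

```python
def oversight_round(agent_confidence, human_decision=None, threshold=0.8):
    """One round of a hypothetical play-or-ask oversight interface.
    If the agent is confident enough it 'plays' (acts autonomously);
    otherwise it 'asks', and the human's decision determines the
    outcome: allow the action, correct it, or shut the system down."""
    if agent_confidence >= threshold:
        return "agent_acts"            # autonomous "play"
    # agent defers ("ask"): the human resolves the round
    if human_decision == "shutdown":
        return "system_shutdown"
    if human_decision == "correct":
        return "corrective_action"
    return "human_allows"
```

The key property the paper studies is when both sides, each optimizing their own payoff, converge on deferral behavior that is safe without being needlessly conservative.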
arXiv Detail & Related papers (2025-10-30T17:46:49Z) - Dark Patterns Meet GUI Agents: LLM Agent Susceptibility to Manipulative Interfaces and the Role of Human Oversight [51.53020962098759]
This study examines how agents, human participants, and human-AI teams respond to 16 types of dark patterns across diverse scenarios. Phase 1 highlights that agents often fail to recognize dark patterns and, even when aware, prioritize task completion over protective action. Phase 2 reveals divergent failure modes: humans succumb due to cognitive shortcuts and habitual compliance, while agents falter from procedural blind spots.
arXiv Detail & Related papers (2025-09-12T22:26:31Z) - EgoNormia: Benchmarking Physical Social Norm Understanding [52.87904722234434]
EGONORMIA spans seven norm categories: safety, privacy, proxemics, politeness, cooperation, coordination/proactivity, and communication/legibility. Our work demonstrates that current state-of-the-art vision-language models (VLMs) lack robust grounded norm understanding, scoring a maximum of 54% on EGONORMIA and 65% on EGONORMIA-verified.
arXiv Detail & Related papers (2025-02-27T19:54:16Z) - When Disagreements Elicit Robustness: Investigating Self-Repair Capabilities under LLM Multi-Agent Disagreements [56.29265568399648]
We argue that disagreements prevent premature consensus and expand the explored solution space. However, disagreements on task-critical steps can derail collaboration, depending on the topology of solution paths.
arXiv Detail & Related papers (2025-02-21T02:24:43Z) - EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds [119.02266432167085]
We propose EgoAgent, a unified agent model that simultaneously learns to represent, predict, and act within a single transformer. EgoAgent explicitly models the causal and temporal dependencies among these abilities by formulating the task as an interleaved sequence of states and actions. Comprehensive evaluations of EgoAgent on representative tasks such as image classification, egocentric future state prediction, and 3D human motion prediction demonstrate the superiority of our method.
arXiv Detail & Related papers (2025-02-09T11:28:57Z) - REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world. Recent methods aim to mitigate misalignment by learning reward functions from human preferences. We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z) - Learning Vision-based Pursuit-Evasion Robot Policies [54.52536214251999]
We develop a fully-observable robot policy that generates supervision for a partially-observable one.
We deploy our policy on a physical quadruped robot with an RGB-D camera on pursuit-evasion interactions in the wild.
arXiv Detail & Related papers (2023-08-30T17:59:05Z) - Bandit Social Learning: Exploration under Myopic Behavior [54.767961587919075]
We study social learning dynamics motivated by reviews on online platforms. Agents collectively follow a simple multi-armed bandit protocol, but each agent acts myopically, without regard to exploration. We derive stark learning failures for any such behavior, and provide matching positive results.
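The myopic protocol can be illustrated with a toy greedy bandit. This is a hypothetical sketch, not the paper's exact model: agents arrive sequentially, each greedily pulls the arm with the best empirical mean so far, and no one ever explores, which is exactly the regime where herding on a suboptimal arm can lock in.

```python
import random

def myopic_social_learning(true_means, n_agents=1000, seed=0):
    """Sequential agents each pick the arm with the highest empirical
    mean reward observed so far (unpulled arms count as infinitely
    promising, so every arm is tried once; ties break at random).
    Returns the pull counts per arm; with greedy agents, an unlucky
    early draw on the better arm can steer the whole population to
    the worse one."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    sums = [0.0] * len(true_means)
    for _ in range(n_agents):
        means = [s / c if c else float("inf") for s, c in zip(sums, counts)]
        best = max(means)
        arm = rng.choice([i for i, m in enumerate(means) if m == best])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

Running this across many seeds shows a nonvanishing fraction of populations concentrating on the inferior arm, the kind of learning failure the paper formalizes.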
arXiv Detail & Related papers (2023-02-15T01:57:57Z) - Self-Explaining Deviations for Coordination [31.94421561348329]
We focus on a specific subclass of coordination problems in which humans are able to discover self-explaining deviations (SEDs).
SEDs are actions that deviate from the common understanding of what reasonable behavior would be in normal circumstances.
We introduce a novel algorithm, improvement maximizing self-explaining deviations (IMPROVISED), to perform SEDs.
arXiv Detail & Related papers (2022-07-13T20:56:59Z) - Unbiased Self-Play [2.2463154358632473]
We present a general optimization framework for emergent belief-state representation without any supervision.
We employ the common configuration of multiagent reinforcement learning and communication to improve exploration coverage over an environment by leveraging the knowledge of each agent.
Numerical analyses, including StarCraft exploration tasks with up to 20 agents and off-the-shelf RNNs, demonstrate state-of-the-art performance.
arXiv Detail & Related papers (2021-06-06T02:16:45Z) - Disentangled Sequence Clustering for Human Intention Inference [40.46123013107865]
Disentangled Sequence Clustering Variational Autoencoder (DiSCVAE)
arXiv Detail & Related papers (2021-01-23T13:39:34Z) - End-to-End Learning and Intervention in Games [60.41921763076017]
We provide a unified framework for learning and intervention in games.
We propose two approaches, respectively based on explicit and implicit differentiation.
The analytical results are validated using several real-world problems.
arXiv Detail & Related papers (2020-10-26T18:39:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.