Read the Room: Inferring Social Context Through Dyadic Interaction Recognition in Cyber-physical-social Infrastructure Systems
- URL: http://arxiv.org/abs/2510.04854v1
- Date: Mon, 06 Oct 2025 14:40:22 GMT
- Title: Read the Room: Inferring Social Context Through Dyadic Interaction Recognition in Cyber-physical-social Infrastructure Systems
- Authors: Cheyu Lin, John Martins, Katherine A. Flanigan, Ph.D.
- Abstract summary: Cyber-physical-social infrastructure systems aim to align CPS with social objectives. This paper delves into recognizing dyadic human interactions using real-world data.
- Score: 1.032461766065764
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Cyber-physical systems (CPS) integrate sensing, computing, and control to improve infrastructure performance, focusing on economic goals like performance and safety. However, they often neglect potential human-centered (or "social") benefits. Cyber-physical-social infrastructure systems (CPSIS) aim to address this by aligning CPS with social objectives. This involves defining social benefits, understanding human interactions with each other and infrastructure, developing privacy-preserving measurement methods, modeling these interactions for prediction, linking them to social benefits, and actuating the physical environment to foster positive social outcomes. This paper delves into recognizing dyadic human interactions using real-world data, which is the backbone to measuring social behavior. This lays a foundation to address the need to enhance understanding of the deeper meanings and mutual responses inherent in human interactions. While RGB cameras are informative for interaction recognition, privacy concerns arise. Depth sensors offer a privacy-conscious alternative by analyzing skeletal movements. This study compares five skeleton-based interaction recognition algorithms on a dataset of 12 dyadic interactions. Unlike single-person datasets, these interactions, categorized into communication types like emblems and affect displays, offer insights into the cultural and emotional aspects of human interactions.
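The skeleton-based pipeline described in the abstract can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in, not one of the five algorithms the paper compares: the synthetic joint data, the class-dependent offsets, the inter-person distance features, and the nearest-centroid classifier are all illustrative assumptions chosen to show the general shape of dyadic, privacy-preserving recognition from depth-sensor skeletons.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for depth-sensor skeleton data: each sample is a
# (frames, persons=2, joints=25, xyz=3) array, mimicking Kinect-style output.
def make_sample(label, frames=30, joints=25):
    base = rng.normal(size=(frames, 2, joints, 3))
    base[:, 1] += label * 0.5  # shift person B by a class-dependent offset
    return base

def features(sample):
    # Inter-person joint distances per frame, averaged over time: a crude
    # descriptor of how the two bodies are positioned relative to each other.
    a, b = sample[:, 0], sample[:, 1]
    dists = np.linalg.norm(a - b, axis=-1)   # (frames, joints)
    return dists.mean(axis=0)                # (joints,)

# Tiny nearest-centroid classifier over 3 hypothetical interaction classes.
labels = [0, 1, 2]
train = {c: np.stack([features(make_sample(c)) for _ in range(10)]) for c in labels}
centroids = {c: X.mean(axis=0) for c, X in train.items()}

def predict(sample):
    f = features(sample)
    return min(labels, key=lambda c: np.linalg.norm(f - centroids[c]))

test_acc = np.mean([predict(make_sample(c)) == c for c in labels for _ in range(20)])
print(f"nearest-centroid accuracy on synthetic data: {test_acc:.2f}")
```

Real skeleton-based recognizers replace the hand-rolled distance features and centroid matching with learned spatio-temporal models (e.g. graph convolutions over the joint topology), but the input representation, joints over time for both interactants, is the same privacy-conscious signal the abstract motivates.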
Related papers
- The Rise of AI Agent Communities: Large-Scale Analysis of Discourse and Interaction on Moltbook [62.2627874717318]
Moltbook is a Reddit-like social platform where AI agents create posts and interact with other agents through comments and replies. Using a public API snapshot collected about five days after launch, we address three research questions: what AI agents discuss, how they post, and how they interact. We show that agents' writing is predominantly neutral, with positivity appearing in community engagement and assistance-oriented content.
arXiv Detail & Related papers (2026-02-13T05:28:31Z)
- Decoding Psychological States Through Movement: Inferring Human Kinesic Functions with Application to Built Environments [1.433758865948252]
We introduce the Dyadic User Engagement DataseT (DUET) and an embedded kinesics recognition framework. DUET captures 12 dyadic interactions spanning all five kinesic functions (emblems, illustrators, affect displays, adaptors, and regulators) across four sensing modalities and three built-environment contexts. Our recognition framework infers communicative function directly from privacy-preserving skeletal motion without handcrafted action-to-function dictionaries.
arXiv Detail & Related papers (2026-01-23T21:50:06Z)
- Towards Affect-Adaptive Human-Robot Interaction: A Protocol for Multimodal Dataset Collection on Social Anxiety [0.127561562669417]
Social anxiety is a prevalent condition that affects interpersonal interactions and social functioning. Recent advances in artificial intelligence and social robotics offer new opportunities to examine social anxiety in the human-robot interaction context. Accurate detection of affective states and behaviours associated with social anxiety requires multimodal datasets. This paper presents a protocol for multimodal dataset collection designed to reflect social anxiety in a human-robot interaction context.
arXiv Detail & Related papers (2025-11-17T16:03:33Z) - Part-Aware Bottom-Up Group Reasoning for Fine-Grained Social Interaction Detection [82.70752567211251]
We propose a part-aware bottom-up group reasoning framework for fine-grained social interaction detection. The proposed method infers social groups and their interactions using body part features and their interpersonal relations. Our model first detects individuals and enhances their features using part-aware cues, and then infers group configuration by associating individuals via similarity-based reasoning.
arXiv Detail & Related papers (2025-11-05T17:33:03Z) - From Actions to Kinesics: Extracting Human Psychological States through Bodily Movements [1.2676356746752893]
We present a kinesics recognition framework that infers the communicative functions of human activity from 3D skeleton joint data. Our results on the Dyadic User EngagemenT dataset demonstrate that this method enables scalable, accurate, and human-centered modeling of behavior.
arXiv Detail & Related papers (2025-10-06T14:31:53Z) - Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation [70.52558242336988]
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.
In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation.
We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a "multimodal transcript."
arXiv Detail & Related papers (2024-09-13T18:28:12Z) - When LLM Meets Hypergraph: A Sociological Analysis on Personality via Online Social Networks [7.309233340654514]
This paper proposes a sociological analysis framework for one's personality in an environment-based view instead of individual-level data mining.
We design an effective hypergraph neural network where the hypergraph nodes are users and the hyperedges in the hypergraph are social environments.
We offer a useful dataset with user profile data, personality traits, and several detected environments from the real-world social platform.
arXiv Detail & Related papers (2024-07-04T01:43:52Z) - AntEval: Evaluation of Social Interaction Competencies in LLM-Driven
Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios.
However, their capability in handling complex, multi-character social interactions has yet to be fully explored.
We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z) - Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A
Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most often used nonverbal cue is speaking activity, the most common computational method is support vector machines, and the typical interaction environment and sensing approach is meetings composed of 3-4 persons equipped with microphones and cameras.
arXiv Detail & Related papers (2022-07-20T13:37:57Z) - BOSS: A Benchmark for Human Belief Prediction in Object-context
Scenarios [14.23697277904244]
This paper uses the combined knowledge of Theory of Mind (ToM) and Object-Context Relations to investigate methods for enhancing collaboration between humans and autonomous systems.
We propose a novel and challenging multimodal video dataset for assessing the capability of artificial intelligence (AI) systems in predicting human belief states in an object-context scenario.
arXiv Detail & Related papers (2022-06-21T18:29:17Z) - The world seems different in a social context: a neural network analysis
of human experimental data [57.729312306803955]
We show that it is possible to replicate human behavioral data in both individual and social task settings by modifying the precision of prior and sensory signals.
An analysis of the neural activation traces of the trained networks provides evidence that information is coded in fundamentally different ways in the network in the individual and in the social conditions.
arXiv Detail & Related papers (2022-03-03T17:19:12Z) - PHASE: PHysically-grounded Abstract Social Events for Machine Social
Perception [50.551003004553806]
We create a dataset of physically-grounded abstract social events, PHASE, that resemble a wide range of real-life social interactions.
PHASE is validated with human experiments demonstrating that humans perceive rich interactions in the social events.
As a baseline model, we introduce a Bayesian inverse planning approach, SIMPLE, which outperforms state-of-the-art feed-forward neural networks.
arXiv Detail & Related papers (2021-03-02T18:44:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.