Related papers: Human-Robot Commensality: Bite Timing Prediction for Robot-Assisted Feeding in Groups

URL: http://arxiv.org/abs/2207.03348v1
Date: Thu, 7 Jul 2022 14:52:58 GMT
Title: Human-Robot Commensality: Bite Timing Prediction for Robot-Assisted Feeding in Groups
Authors: Jan Ondras, Abrar Anwar, Tong Wu, Fanjun Bu, Malte Jung, Jorge Jose Ortiz, Tapomayukh Bhattacharjee
Abstract summary: We develop data-driven models to predict when a robot should feed during social dining scenarios. We use a multimodal Human-Human Commensality dataset to analyze human-human commensality behaviors.
Score: 18.367472953664016
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We develop data-driven models to predict when a robot should feed during social dining scenarios. Being able to eat independently with friends and family is considered one of the most memorable and important activities for people with mobility limitations. Robots can potentially help with this activity but robot-assisted feeding is a multi-faceted problem with challenges in bite acquisition, bite timing, and bite transfer. Bite timing in particular becomes uniquely challenging in social dining scenarios due to the possibility of interrupting a social human-robot group interaction during commensality. Our key insight is that bite timing strategies that take into account the delicate balance of social cues can lead to seamless interactions during robot-assisted feeding in a social dining scenario. We approach this problem by collecting a multimodal Human-Human Commensality Dataset (HHCD) containing 30 groups of three people eating together. We use this dataset to analyze human-human commensality behaviors and develop bite timing prediction models in social dining scenarios. We also transfer these models to human-robot commensality scenarios. Our user studies show that prediction improves when our algorithm uses multimodal social signaling cues between diners to model bite timing. The HHCD dataset, videos of user studies, and code will be publicly released after acceptance.

Related papers

Robot-Assisted Social Dining as a White Glove Service [0.0]
Existing systems have only been tested in-lab or in-home, leaving in-the-wild social dining contexts largely unexplored. Our work has implications for in-the-wild and group contexts of robot-assisted feeding.
arXiv Detail & Related papers (2026-02-17T17:58:25Z)
Towards Affect-Adaptive Human-Robot Interaction: A Protocol for Multimodal Dataset Collection on Social Anxiety [0.127561562669417]
Social anxiety is a prevalent condition that affects interpersonal interactions and social functioning. Recent advances in artificial intelligence and social robotics offer new opportunities to examine social anxiety in the human-robot interaction context. Accurate detection of affective states and behaviours associated with social anxiety requires multimodal datasets. This paper presents a protocol for multimodal dataset collection designed to reflect social anxiety in a human-robot interaction context.
arXiv Detail & Related papers (2025-11-17T16:03:33Z)
HHI-Assist: A Dataset and Benchmark of Human-Human Interaction in Physical Assistance Scenario [63.77482302352545]
HHI-Assist is a dataset comprising motion capture clips of human-human interactions in assistive tasks. Our work has the potential to significantly enhance robotic assistance policies.
arXiv Detail & Related papers (2025-09-12T09:38:17Z)
M3PT: A Transformer for Multimodal, Multi-Party Social Signal Prediction with Person-aware Blockwise Attention [13.798471960450323]
Social signals include body pose, head pose, speech, and context-specific activities like acquiring and taking bites of food when dining. We introduce M3PT, a causal transformer architecture with modality and temporal blockwise attention masking to simultaneously process multiple social cues. We demonstrate that using multiple modalities improves bite timing and speaking status prediction.
arXiv Detail & Related papers (2025-01-23T06:42:28Z)
Robot Interaction Behavior Generation based on Social Motion Forecasting for Human-Robot Interaction [9.806227900768926]
We propose to model social motion forecasting in a shared human-robot representation space. ECHO operates in the aforementioned shared space to predict the future motions of the agents encountered in social scenarios. We evaluate our model in multi-person and human-robot motion forecasting tasks and obtain state-of-the-art performance by a large margin.
arXiv Detail & Related papers (2024-02-07T11:37:14Z)
IA-LSTM: Interaction-Aware LSTM for Pedestrian Trajectory Prediction [1.3597551064547502]
Predicting the trajectory of pedestrians in crowd scenarios is indispensable in the self-driving and autonomous mobile robot fields. Previous researchers focused on how to model human-human interactions but neglected the relative importance of interactions. A new mechanism based on correntropy is introduced to measure the relative importance of human-human interactions.
arXiv Detail & Related papers (2023-11-26T05:17:11Z)
InteRACT: Transformer Models for Human Intent Prediction Conditioned on Robot Actions [7.574421886354134]
InteRACT architecture pre-trains a conditional intent prediction model on large human-human datasets and fine-tunes on a small human-robot dataset. We evaluate on a set of real-world collaborative human-robot manipulation tasks and show that our conditional model improves over various marginal baselines.
arXiv Detail & Related papers (2023-11-21T19:15:17Z)
Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots [119.55240471433302]
Habitat 3.0 is a simulation platform for studying collaborative human-robot tasks in home environments. It addresses challenges in modeling complex deformable bodies and diversity in appearance and motion. Human-in-the-loop infrastructure enables real human interaction with simulated robots via mouse/keyboard or a VR interface.
arXiv Detail & Related papers (2023-10-19T17:29:17Z)
SACSoN: Scalable Autonomous Control for Social Navigation [62.59274275261392]
We develop methods for training policies for socially unobtrusive navigation. By minimizing this counterfactual perturbation, we can induce robots to behave in ways that do not alter the natural behavior of humans in the shared space. We collect a large dataset where an indoor mobile robot interacts with human bystanders.
arXiv Detail & Related papers (2023-06-02T19:07:52Z)
It Takes Two: Learning to Plan for Human-Robot Cooperative Carrying [0.6981715773998527]
We present a method for predicting realistic motion plans for cooperative human-robot teams on a table-carrying task. We use a Variational Recurrent Neural Network, VRNN, to model the variation in the trajectory of a human-robot team over time. We show that the model generates more human-like motion compared to a baseline, centralized sampling-based planner.
arXiv Detail & Related papers (2022-09-26T17:59:23Z)
BEHAVE: Dataset and Method for Tracking Human Object Interactions [105.77368488612704]
We present the first full-body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits, along with the annotated contacts between them. We use this data to learn a model that can jointly track humans and objects in natural environments with an easy-to-use portable multi-camera setup.
arXiv Detail & Related papers (2022-04-14T13:21:19Z)
Motron: Multimodal Probabilistic Human Motion Forecasting [30.154996245556532]
Motron is a graph-structured model that captures the multimodality of human motion. It outputs deterministic motions and corresponding confidence values for each mode. We demonstrate the performance of our model on several challenging real-world motion forecasting datasets.
arXiv Detail & Related papers (2022-03-08T14:58:41Z)
PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception [50.551003004553806]
We create a dataset of physically-grounded abstract social events, PHASE, that resemble a wide range of real-life social interactions. PHASE is validated with human experiments demonstrating that humans perceive rich interactions in the social events. As a baseline model, we introduce a Bayesian inverse planning approach, SIMPLE, which outperforms state-of-the-art feed-forward neural networks.
arXiv Detail & Related papers (2021-03-02T18:44:57Z)
Human Grasp Classification for Reactive Human-to-Robot Handovers [50.91803283297065]
We propose an approach for human-to-robot handovers in which the robot meets the human halfway. We collect a human grasp dataset which covers typical ways of holding objects with various hand shapes and poses. We present a planning and execution approach that takes the object from the human hand according to the detected grasp and hand position.
arXiv Detail & Related papers (2020-03-12T19:58:03Z)
Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works. However, learning a model that captures the dynamics of complex skills represents a major challenge. We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
