SocialEyes: Scaling mobile eye-tracking to multi-person social settings
- URL: http://arxiv.org/abs/2407.06345v3
- Date: Fri, 13 Dec 2024 21:15:29 GMT
- Title: SocialEyes: Scaling mobile eye-tracking to multi-person social settings
- Authors: Shreshth Saxena, Areez Visram, Neil Lobo, Zahid Mirza, Mehak Rafi Khan, Biranugan Pirabaharan, Alexander Nguyen, Lauren K. Fink
- Abstract summary: We developed a system to stream, record, and analyse synchronised data from multiple mobile eye-tracking devices during collective viewing experiences.
We tested the system in a live concert and a film screening with 30 simultaneous viewers during each of two public events (N=60).
Our novel analysis metrics and visualizations illustrate the potential of collective eye-tracking data for understanding collaborative behaviour and social interaction.
- Score: 34.82692226532414
- Abstract: Eye movements provide a window into human behaviour, attention, and interaction dynamics. Challenges in real-world, multi-person environments have, however, restrained eye-tracking research predominantly to single-person, in-lab settings. We developed a system to stream, record, and analyse synchronised data from multiple mobile eye-tracking devices during collective viewing experiences (e.g., concerts, films, lectures). We implemented lightweight operator interfaces for real-time-monitoring, remote-troubleshooting, and gaze-projection from individual egocentric perspectives to a common coordinate space for shared gaze analysis. We tested the system in a live concert and a film screening with 30 simultaneous viewers during each of two public events (N=60). We observe precise time-synchronisation between devices measured through recorded clock-offsets, and accurate gaze-projection in challenging dynamic scenes. Our novel analysis metrics and visualizations illustrate the potential of collective eye-tracking data for understanding collaborative behaviour and social interaction. This advancement promotes ecological validity in eye-tracking research and paves the way for innovative interactive tools.
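As an illustration of the gaze-projection step, the sketch below assumes a feature-matching homography between each wearer's egocentric scene frame and a shared reference view (e.g., a frontal photo of the stage or screen). The paper does not publish this exact routine; the function and parameter names are hypothetical.

```python
# Hedged sketch: map one wearer's gaze point from their egocentric scene-camera
# frame into a common reference view via ORB feature matching and a RANSAC
# homography. Illustrative only; not the SocialEyes implementation.
import cv2
import numpy as np

def project_gaze_to_reference(ego_frame, ref_frame, gaze_xy):
    """Map an (x, y) gaze point from egocentric coordinates to the reference view."""
    gray_e = cv2.cvtColor(ego_frame, cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(ref_frame, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(2000)
    kp_e, des_e = orb.detectAndCompute(gray_e, None)
    kp_r, des_r = orb.detectAndCompute(gray_r, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_e, des_r), key=lambda m: m.distance)[:200]
    src = np.float32([kp_e[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_r[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)  # RANSAC tolerates moving scene content
    pt = np.float32([[gaze_xy]])                          # shape (1, 1, 2) for perspectiveTransform
    return cv2.perspectiveTransform(pt, H)[0, 0]          # gaze in shared coordinates
```

Once every viewer's gaze is mapped into the same plane, shared-gaze metrics such as pairwise gaze distance or heatmap overlap become straightforward to compute.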
Related papers
- Temporally Consistent Dynamic Scene Graphs: An End-to-End Approach for Action Tracklet Generation [1.6584112749108326]
TCDSG, Temporally Consistent Dynamic Scene Graphs, is an end-to-end framework that detects, tracks, and links subject-object relationships across time.
Our work sets a new standard in multi-frame video analysis, opening new avenues for high-impact applications in surveillance, autonomous navigation, and beyond.
arXiv Detail & Related papers (2024-12-03T20:19:20Z)
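The track-and-link step in pipelines like this is commonly implemented as frame-to-frame association. The sketch below uses Hungarian matching on IoU as a generic stand-in; it is not the TCDSG method itself.

```python
# Generic tracklet linking: associate detections in consecutive frames by
# maximising IoU overlap with the Hungarian algorithm (illustrative only).
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def link_frames(prev_boxes, cur_boxes, min_iou=0.3):
    """Return (prev_idx, cur_idx) matches between consecutive frames."""
    cost = np.array([[1.0 - iou(p, c) for c in cur_boxes] for p in prev_boxes])
    rows, cols = linear_sum_assignment(cost)  # minimise total (1 - IoU)
    return [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] <= 1.0 - min_iou]

prev = [(0, 0, 10, 10), (20, 20, 30, 30)]
cur = [(21, 19, 31, 29), (1, 1, 11, 11)]
print(link_frames(prev, cur))  # -> [(0, 1), (1, 0)]
```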
- 3D Gaze Tracking for Studying Collaborative Interactions in Mixed-Reality Environments [3.8075244788223044]
This study presents a novel framework for 3D gaze tracking tailored for mixed-reality settings.
Our proposed framework leverages state-of-the-art computer vision and machine learning techniques to overcome the obstacles of gaze tracking in mixed-reality settings.
arXiv Detail & Related papers (2024-06-16T16:30:56Z)
- I-MPN: Inductive Message Passing Network for Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data [4.487146086221174]
We present a novel human-centered learning algorithm designed for automated object recognition within mobile eye-tracking settings.
Our approach seamlessly integrates an object detector with a spatial relation-aware inductive message-passing network (I-MPN), harnessing node profile information and capturing object correlations.
arXiv Detail & Related papers (2024-06-10T13:08:31Z)
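A minimal sketch of one inductive message-passing step in the spirit described above; the mean-aggregation rule, layer sizes, and names are illustrative assumptions, not the I-MPN implementation.

```python
# One message-passing step over detected objects: each node aggregates its
# neighbours' features, so the learned update transfers to unseen graphs
# (the "inductive" property). Illustrative shapes and weights only.
import numpy as np

def message_passing_step(node_feats, adjacency, w_self, w_neigh):
    """node_feats: (N, D); adjacency: (N, N) binary; w_self, w_neigh: (D, D)."""
    deg = adjacency.sum(axis=1, keepdims=True).clip(min=1)  # guard divide-by-zero
    neigh_mean = adjacency @ node_feats / deg               # mean over neighbours
    updated = node_feats @ w_self + neigh_mean @ w_neigh    # combine self + messages
    return np.maximum(updated, 0.0)                         # ReLU nonlinearity

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 16))                 # 5 detected objects, 16-d profiles
adj = (rng.random((5, 5)) < 0.4).astype(float)   # random scene graph
np.fill_diagonal(adj, 0.0)
out = message_passing_step(feats, adj,
                           rng.normal(size=(16, 16)) * 0.1,
                           rng.normal(size=(16, 16)) * 0.1)
print(out.shape)  # (5, 16)
```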
- Realtime Dynamic Gaze Target Tracking and Depth-Level Estimation [6.435984242701043]
The use of transparent displays (TDs) in applications such as heads-up displays (HUDs) in vehicles is a burgeoning field, poised to revolutionize user experiences.
This innovation brings forth significant challenges in realtime human-device interaction, particularly in accurately identifying and tracking a user's gaze on dynamically changing TDs.
We present a robust and efficient two-part solution for realtime gaze monitoring, comprising: (1) a tree-based algorithm for identifying and dynamically tracking gaze targets; and (2) a multi-stream self-attention architecture that estimates the depth level of human gaze from eye-tracking data.
arXiv Detail & Related papers (2024-06-09T20:52:47Z)
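As a hedged illustration of the tree-based lookup idea, a gaze point can be resolved to the nearest indexed target with a KD-tree; this generic stand-in is not the paper's specific algorithm.

```python
# Resolve the current gaze point to its nearest on-screen target in O(log n)
# by indexing target centroids in a KD-tree (illustrative names and threshold).
import numpy as np
from scipy.spatial import cKDTree

def identify_gaze_target(target_centroids, gaze_xy, max_dist=50.0):
    """Return the index of the target nearest to gaze, or None if too far (pixels)."""
    tree = cKDTree(target_centroids)  # rebuild per frame as targets move
    dist, idx = tree.query(gaze_xy)
    return int(idx) if dist <= max_dist else None

targets = np.array([[100.0, 200.0], [400.0, 150.0], [250.0, 300.0]])
print(identify_gaze_target(targets, (390.0, 160.0)))  # -> 1
```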
- Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation [64.85974098314344]
Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer their relationships for a given video.
Inherently, object pairs and their relationships enjoy spatial co-occurrence correlations within each image and temporal consistency/transition correlations across different images.
We propose a spatial-temporal knowledge-embedded transformer (STKET) that incorporates the prior spatial-temporal knowledge into the multi-head cross-attention mechanism.
arXiv Detail & Related papers (2023-09-23T02:40:28Z)
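The knowledge-embedded cross-attention idea can be sketched as object-pair queries attending over a learned bank of prior embeddings. The PyTorch illustration below uses made-up dimensions and is not the STKET architecture.

```python
# Inject a learned prior-knowledge embedding as the keys/values of multi-head
# cross-attention, so object-pair queries attend over spatial-temporal priors.
import torch
import torch.nn as nn

class KnowledgeCrossAttention(nn.Module):
    def __init__(self, dim=64, heads=4, num_priors=10):
        super().__init__()
        # learned embeddings standing in for prior spatial-temporal knowledge
        self.knowledge = nn.Parameter(torch.randn(num_priors, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, pair_queries):              # (batch, n_pairs, dim)
        b = pair_queries.shape[0]
        kv = self.knowledge.unsqueeze(0).expand(b, -1, -1)
        out, _ = self.attn(pair_queries, kv, kv)  # queries attend to priors
        return out

x = torch.randn(2, 7, 64)                  # 2 clips, 7 object pairs each
print(KnowledgeCrossAttention()(x).shape)  # torch.Size([2, 7, 64])
```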
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most frequently used nonverbal cue, computational method, interaction environment, and sensing approach are, respectively, speaking activity, support vector machines, meetings of 3-4 persons, and microphones and cameras.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
- Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions [81.88294320397826]
In this weakly supervised setting, a system does not know which human-object interactions are present in a video, nor the actual locations of the human and object.
We introduce a dataset comprising over 6.5k videos with human-object interactions, curated from sentence captions.
We demonstrate improved performance over weakly supervised baselines adapted to our annotations on this video dataset.
arXiv Detail & Related papers (2021-10-07T15:30:18Z)
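The contrastive idea above can be sketched as an InfoNCE-style objective that pairs each video region embedding with its caption embedding; this is a generic illustration of weak supervision from captions, not the paper's exact formulation.

```python
# InfoNCE-style loss: pull each spatiotemporal region embedding towards its
# paired caption embedding, away from other captions in the batch.
import torch
import torch.nn.functional as F

def region_caption_infonce(region_emb, caption_emb, temperature=0.07):
    """region_emb, caption_emb: (B, D); row i of each forms a positive pair."""
    r = F.normalize(region_emb, dim=-1)
    c = F.normalize(caption_emb, dim=-1)
    logits = r @ c.t() / temperature   # (B, B) cosine-similarity matrix
    labels = torch.arange(r.shape[0])  # diagonal entries are the positives
    return F.cross_entropy(logits, labels)

region, caption = torch.randn(8, 128), torch.randn(8, 128)
print(region_caption_infonce(region, caption).item())
```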
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work, including state-of-the-art methods designed specifically for either the trajectory or the pose forecasting task.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
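One common way to use such a visibility indicator is to mask invisible joints out of the forecasting loss; the sketch below is hedged and generic, not TRiPOD's exact objective.

```python
# Score pose forecasts only on visible joints, so the model is not penalised
# against unobservable ground truth (illustrative names and shapes).
import torch

def masked_pose_loss(pred, target, visibility):
    """pred/target: (T, J, 2) joint coordinates; visibility: (T, J) in {0, 1}."""
    err = ((pred - target) ** 2).sum(dim=-1)  # per-joint squared error
    mask = visibility.float()
    return (err * mask).sum() / mask.sum().clamp(min=1.0)

T, J = 12, 17                          # frames, joints (e.g., COCO layout)
pred, target = torch.randn(T, J, 2), torch.randn(T, J, 2)
vis = (torch.rand(T, J) > 0.2).long()  # ~80% of joints visible
print(masked_pose_loss(pred, target, vis).item())
```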
- iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes [54.04456391489063]
iGibson is a novel simulation environment to develop robotic solutions for interactive tasks in large-scale realistic scenes.
Our environment contains fifteen fully interactive home-sized scenes populated with rigid and articulated objects.
We show that iGibson's features enable the generalization of navigation agents, and that the human-iGibson interface and integrated motion planners facilitate efficient imitation learning of simple human-demonstrated behaviors.
arXiv Detail & Related papers (2020-12-05T02:14:17Z)
- Automated analysis of eye-tracker-based human-human interaction studies [2.433293618209319]
We investigate which state-of-the-art computer vision algorithms may be used to automate the post-analysis of mobile eye-tracking data.
For the case study in this paper, we focus on mobile eye-tracker recordings made during human-human face-to-face interactions.
We show that the use of this single-pipeline framework provides robust results, which are both more accurate and faster than previous work in the field.
arXiv Detail & Related papers (2020-07-09T10:00:03Z)