MAC-Gaze: Motion-Aware Continual Calibration for Mobile Gaze Tracking
- URL: http://arxiv.org/abs/2505.22769v3
- Date: Thu, 05 Jun 2025 16:49:14 GMT
- Title: MAC-Gaze: Motion-Aware Continual Calibration for Mobile Gaze Tracking
- Authors: Yaxiong Lei, Mingyue Zhao, Yuheng Wang, Shijing He, Yusuke Sugano, Mohamed Khamis, Juan Ye
- Abstract summary: Mobile gaze tracking faces a fundamental challenge: maintaining accuracy as users naturally change postures and device orientations. We present MAC-Gaze, a Motion-Aware continual Calibration approach that leverages smartphone inertial measurement unit (IMU) sensors and continual learning techniques. Our system integrates a pre-trained visual gaze estimator and an IMU-based activity recognition model with a clustering-based hybrid decision-making mechanism.
- Score: 19.082459585817716
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Mobile gaze tracking faces a fundamental challenge: maintaining accuracy as users naturally change their postures and device orientations. Traditional one-off calibration approaches fail to adapt to these dynamic conditions, leading to degraded performance over time. We present MAC-Gaze, a Motion-Aware continual Calibration approach that leverages smartphone inertial measurement unit (IMU) sensors and continual learning techniques to automatically detect changes in user motion states and update the gaze tracking model accordingly. Our system integrates a pre-trained visual gaze estimator and an IMU-based activity recognition model with a clustering-based hybrid decision-making mechanism that triggers recalibration when motion patterns deviate significantly from previously encountered states. To enable cumulative learning of new motion conditions while mitigating catastrophic forgetting, we employ replay-based continual learning, allowing the model to maintain performance across previously encountered motion conditions. We evaluate our system through extensive experiments on the publicly available RGBDGaze dataset and our own 10-hour multimodal MotionGaze dataset (481K+ images, 800K+ IMU readings), encompassing a wide range of postures under various motion conditions, including sitting, standing, lying, and walking. Results demonstrate that our method reduces gaze estimation error by 19.9% on RGBDGaze (from 1.73 cm to 1.41 cm) and by 31.7% on MotionGaze (from 2.81 cm to 1.92 cm) compared to traditional calibration approaches. Our framework provides a robust solution for maintaining gaze estimation accuracy in mobile scenarios.
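To make the trigger-and-rehearse loop described in the abstract concrete, here is a minimal Python sketch of the two mechanisms: a novelty test that measures the distance of incoming IMU features to centroids of previously seen motion clusters, and a replay buffer that mixes earlier calibration samples into each recalibration batch. Class and parameter names, the threshold, and the feature representation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from collections import deque


class MotionAwareCalibrator:
    """Illustrative sketch of a clustering-based recalibration trigger
    with replay-based rehearsal; not MAC-Gaze's actual code."""

    def __init__(self, distance_threshold=1.0, replay_capacity=10_000):
        self.centroids = []                    # one centroid per known motion state
        self.threshold = distance_threshold    # max distance to count as "known"
        self.replay = deque(maxlen=replay_capacity)  # (imu_features, gaze_cm) pairs

    def is_novel(self, imu_features):
        """True if the IMU feature vector is far from every known cluster."""
        if not self.centroids:
            return True
        dists = [np.linalg.norm(imu_features - c) for c in self.centroids]
        return min(dists) > self.threshold

    def recalibrate(self, imu_features, new_samples):
        """Register the new motion state and build a rehearsal batch mixing new
        calibration samples with replayed ones from earlier states; the actual
        fine-tuning of the gaze estimator on this batch is omitted here."""
        self.centroids.append(np.asarray(imu_features, dtype=float))
        batch = list(self.replay) + list(new_samples)
        self.replay.extend(new_samples)
        return batch


# Toy usage: one pooled IMU feature window and one labelled calibration sample.
cal = MotionAwareCalibrator(distance_threshold=0.8)
window = np.random.randn(16)            # e.g. pooled accelerometer/gyro statistics
if cal.is_novel(window):
    batch = cal.recalibrate(window, [(window, (1.2, 3.4))])  # gaze target in cm
```

The bounded deque caps memory while still rehearsing samples from every previously encountered motion condition, which is the standard replay-based remedy for catastrophic forgetting.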
Related papers
- BaroPoser: Real-time Human Motion Tracking from IMUs and Barometers in Everyday Devices [12.374794959250828]
We present BaroPoser, the first method that combines IMU and barometric data recorded by a smartphone and a smartwatch to estimate human pose and global translation in real time. By leveraging barometric readings, we estimate sensor height changes, which provide valuable cues both for improving the accuracy of human pose estimation and for predicting global translation on non-flat terrain (a generic pressure-to-height conversion in this spirit is sketched after this list).
arXiv Detail & Related papers (2025-08-05T10:46:59Z)
- Planar Velocity Estimation for Fast-Moving Mobile Robots Using Event-Based Optical Flow [1.4447019135112429]
We introduce an approach to velocity estimation that is decoupled from wheel-to-surface traction assumptions. The proposed method is evaluated through in-field experiments on a 1:10 scale autonomous racing platform.
arXiv Detail & Related papers (2025-05-16T11:00:33Z)
- Quantifying the Impact of Motion on 2D Gaze Estimation in Real-World Mobile Interactions [18.294511216241805]
This paper provides empirical evidence on how user mobility and behaviour affect mobile gaze tracking accuracy. Head distance, head pose, and device orientation are key factors affecting accuracy. Findings highlight the need for more robust, adaptive eye-tracking systems.
arXiv Detail & Related papers (2025-02-14T21:44:52Z)
- Event-Based Tracking Any Point with Motion-Augmented Temporal Consistency [58.719310295870024]
This paper presents an event-based framework for tracking any point. It tackles the challenges posed by spatial sparsity and motion sensitivity in events. It achieves 150% faster processing with competitive model parameters.
arXiv Detail & Related papers (2024-12-02T09:13:29Z)
- COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation [98.05046790227561]
COIN is a control-inpainting motion diffusion prior that enables fine-grained control to disentangle human and camera motions.
COIN outperforms state-of-the-art methods in both global human motion estimation and camera motion estimation.
arXiv Detail & Related papers (2024-08-29T10:36:29Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z) - Improving Unsupervised Video Object Segmentation with Motion-Appearance
Synergy [52.03068246508119]
We present IMAS, a method that segments the primary objects in videos without manual annotation in training or inference.
The name stands for Improved UVOS with Motion-Appearance Synergy. We demonstrate its effectiveness in tuning critical hyperparameters previously tuned with human annotation or hand-crafted, hyperparameter-specific metrics.
arXiv Detail & Related papers (2022-12-17T06:47:30Z) - Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular
Depth Estimation by Integrating IMU Motion Dynamics [74.1720528573331]
Unsupervised monocular depth and ego-motion estimation has drawn extensive research attention in recent years.
We propose DynaDepth, a novel scale-aware framework that integrates information from vision and IMU motion dynamics.
We validate the effectiveness of DynaDepth by conducting extensive experiments and simulations on the KITTI and Make3D datasets.
arXiv Detail & Related papers (2022-07-11T07:50:22Z)
- Transformer Inertial Poser: Attention-based Real-time Human Motion Reconstruction from Sparse IMUs [79.72586714047199]
We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
arXiv Detail & Related papers (2022-03-29T16:24:52Z)
- MotionHint: Self-Supervised Monocular Visual Odometry with Motion Constraints [70.76761166614511]
We present a novel self-supervised algorithm named MotionHint for monocular visual odometry (VO).
Our MotionHint algorithm can be easily applied to existing open-sourced state-of-the-art SSM-VO systems.
arXiv Detail & Related papers (2021-09-14T15:35:08Z)
- Event Based, Near Eye Gaze Tracking Beyond 10,000Hz [41.23347304960948]
We propose a hybrid frame-event-based near-eye gaze tracking system with update rates beyond 10,000 Hz.
Our system builds on emerging event cameras that simultaneously acquire regularly sampled frames and adaptively sampled events.
We hope to enable a new generation of ultra-low-latency gaze-contingent rendering and display techniques for virtual and augmented reality.
arXiv Detail & Related papers (2020-04-07T17:57:18Z)
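As a side note on the BaroPoser entry above: its barometric cue ultimately rests on converting pressure changes into height changes. Below is a generic illustration using the international barometric formula; this is a textbook conversion under standard-atmosphere assumptions, not the paper's code.

```python
SEA_LEVEL_HPA = 1013.25  # standard-atmosphere reference pressure


def height_change_m(p_hpa: float, p_ref_hpa: float) -> float:
    """Relative height change in metres between two barometric pressure
    readings, via the international barometric formula; positive means the
    sensor moved up relative to the reference reading."""
    def to_height(p: float) -> float:
        return 44330.0 * (1.0 - (p / SEA_LEVEL_HPA) ** (1.0 / 5.255))
    return to_height(p_hpa) - to_height(p_ref_hpa)


print(height_change_m(1012.25, 1013.25))  # ~8.3 m: roughly 8 m per hPa near sea level
```

In practice, raw smartphone barometer readings drift and are noisy, so such height deltas are typically filtered before being used as pose cues.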