Inference-Time Gaze Refinement for Micro-Expression Recognition: Enhancing Event-Based Eye Tracking with Motion-Aware Post-Processing
- URL: http://arxiv.org/abs/2506.12524v2
- Date: Sat, 21 Jun 2025 11:04:09 GMT
- Title: Inference-Time Gaze Refinement for Micro-Expression Recognition: Enhancing Event-Based Eye Tracking with Motion-Aware Post-Processing
- Authors: Nuwan Bandara, Thivya Kandappu, Archan Misra
- Abstract summary: Event-based eye tracking holds significant promise for fine-grained cognitive state inference. We introduce a model-agnostic, inference-time refinement framework to enhance the output of existing event-based gaze estimation models.
- Score: 2.5465367830324905
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Event-based eye tracking holds significant promise for fine-grained cognitive state inference, offering high temporal resolution and robustness to motion artifacts, critical features for decoding subtle mental states such as attention, confusion, or fatigue. In this work, we introduce a model-agnostic, inference-time refinement framework designed to enhance the output of existing event-based gaze estimation models without modifying their architecture or requiring retraining. Our method comprises two key post-processing modules: (i) Motion-Aware Median Filtering, which suppresses blink-induced spikes while preserving natural gaze dynamics, and (ii) Optical Flow-Based Local Refinement, which aligns gaze predictions with cumulative event motion to reduce spatial jitter and temporal discontinuities. To complement traditional spatial accuracy metrics, we propose a novel Jitter Metric that captures the temporal smoothness of predicted gaze trajectories based on velocity regularity and local signal complexity. Together, these contributions significantly improve the consistency of event-based gaze signals, making them better suited for downstream tasks such as micro-expression analysis and mind-state decoding. Our results demonstrate consistent improvements across multiple baseline models on controlled datasets, laying the groundwork for future integration with multimodal affect recognition systems in real-world environments.
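The abstract names two ideas that can be sketched concretely: a motion-aware median filter that suppresses blink-induced spikes while leaving smooth gaze motion untouched, and a jitter score based on velocity regularity. The sketch below is an illustrative reconstruction from the abstract alone, not the authors' implementation: the window size, spike threshold, and the reduction of the Jitter Metric to velocity-magnitude dispersion are all assumptions (the paper's metric also incorporates local signal complexity, which is omitted here).

```python
import numpy as np

def motion_aware_median_filter(gaze, window=5, spike_thresh=30.0):
    """Replace a gaze sample with its local median only when it deviates
    strongly from that median, so isolated blink spikes are suppressed
    while natural gaze dynamics pass through. Parameters are illustrative."""
    gaze = np.asarray(gaze, dtype=float)
    out = gaze.copy()
    half = window // 2
    for i in range(len(gaze)):
        lo, hi = max(0, i - half), min(len(gaze), i + half + 1)
        med = np.median(gaze[lo:hi], axis=0)  # per-coordinate local median
        if np.linalg.norm(gaze[i] - med) > spike_thresh:
            out[i] = med  # treat as a spike: snap to the local median
    return out

def jitter_metric(gaze):
    """Simplified smoothness score: standard deviation of frame-to-frame
    speed (lower = smoother). Captures only the velocity-regularity term
    of the paper's Jitter Metric."""
    v = np.diff(np.asarray(gaze, dtype=float), axis=0)
    return float(np.std(np.linalg.norm(v, axis=1)))
```

On a synthetic trajectory with a single 100-pixel blink spike, the filter removes the spike and the jitter score drops, while a spike-free linear sweep is left unchanged.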
Related papers
- Learning Unified System Representations for Microservice Tail Latency Prediction [8.532290784939967]
Microservice architectures have become the de facto standard for building scalable cloud-native applications. Traditional approaches often rely on per-request latency metrics, which are highly sensitive to transient noise. We propose USRFNet, a deep learning network that explicitly separates and models traffic-side and resource-side features.
arXiv Detail & Related papers (2025-08-03T07:46:23Z) - Fully Spiking Neural Networks for Unified Frame-Event Object Tracking [11.727693745877486]
A Spiking Frame-Event Tracking framework is proposed to fuse frame and event data. The RPM eliminates positional bias through randomized spatial reorganization and learnable type encoding. The STR strategy enforces temporal consistency among template features in latent space.
arXiv Detail & Related papers (2025-05-27T07:53:50Z) - Event Signal Filtering via Probability Flux Estimation [58.31652473933809]
Events offer a novel paradigm for capturing scene dynamics via asynchronous sensing, but their inherent randomness often leads to degraded signal quality. Event signal filtering is thus essential for enhancing fidelity by reducing this internal randomness and ensuring consistent outputs across diverse acquisition conditions. This paper introduces a generative, online filtering framework called Event Density Flow Filter (EDFilter). Experiments validate EDFilter's performance across tasks like event filtering, super-resolution, and direct event-based blob tracking.
arXiv Detail & Related papers (2025-04-10T07:03:08Z) - Adaptive State-Space Mamba for Real-Time Sensor Data Anomaly Detection [2.922256022514318]
We propose an Adaptive State-Space Mamba framework for real-time sensor data anomaly detection. Our approach is easily extensible to other time-series tasks that demand rapid and reliable detection capabilities.
arXiv Detail & Related papers (2025-03-26T21:37:48Z) - Deflickering Vision-Based Occupancy Networks through Lightweight Spatio-Temporal Correlation [15.726401007342087]
Vision-based occupancy networks (VONs) provide an end-to-end solution for reconstructing 3D environments in autonomous driving. Recent approaches have incorporated historical data to mitigate flickering, but they often incur high computational costs and may introduce noisy information that interferes with object detection. We propose OccLinker, a novel plugin framework designed to seamlessly integrate with existing VONs for boosting performance. Our method efficiently consolidates historical static and motion cues, learns sparse latent correlations with current features through a dual cross-attention mechanism, and produces correction occupancy components to refine the base network's predictions.
arXiv Detail & Related papers (2025-02-21T13:07:45Z) - Event-Based Tracking Any Point with Motion-Augmented Temporal Consistency [58.719310295870024]
This paper presents an event-based framework for tracking any point. It tackles the challenges posed by spatial sparsity and motion sensitivity in events. It achieves 150% faster processing with competitive model parameters.
arXiv Detail & Related papers (2024-12-02T09:13:29Z) - ESVO2: Direct Visual-Inertial Odometry with Stereo Event Cameras [33.81592783496106]
Event-based visual odometry aims at solving tracking and mapping subproblems, typically in parallel. We build an event-based stereo visual-inertial odometry system on top of a direct pipeline. The resulting system scales well with modern high-resolution event cameras.
arXiv Detail & Related papers (2024-10-12T05:35:27Z) - Kriformer: A Novel Spatiotemporal Kriging Approach Based on Graph Transformers [5.4381914710364665]
This study addresses the challenges posed by sparse sensor deployment and unreliable data by framing the problem as a spatiotemporal kriging challenge.
A graph transformer-based model, Kriformer, estimates data at locations without sensors by mining spatial and temporal correlations, even with limited resources.
arXiv Detail & Related papers (2024-09-23T11:01:18Z) - SFANet: Spatial-Frequency Attention Network for Weather Forecasting [54.470205739015434]
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management.
Traditional methods often struggle to capture the complex dynamics of meteorological systems.
We propose a novel framework designed to address these challenges and enhance the accuracy of weather prediction.
arXiv Detail & Related papers (2024-05-29T08:00:15Z) - Layout Sequence Prediction From Noisy Mobile Modality [53.49649231056857]
Trajectory prediction plays a vital role in understanding pedestrian movement for applications such as autonomous driving and robotics.
Current trajectory prediction models depend on long, complete, and accurately observed sequences from visual modalities.
We propose LTrajDiff, a novel approach that treats objects obstructed or out of sight as equally important as those with fully visible trajectories.
arXiv Detail & Related papers (2023-10-09T20:32:49Z) - Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction [60.60223171143206]
Trajectory prediction is a crucial undertaking in understanding entity movement or human behavior from observed sequences.
Current methods often assume that the observed sequences are complete while ignoring the potential for missing values.
This paper presents a unified framework, the Graph-based Conditional Variational Recurrent Neural Network (GC-VRNN), which can perform trajectory imputation and prediction simultaneously.
arXiv Detail & Related papers (2023-03-28T14:27:27Z) - ProgressiveMotionSeg: Mutually Reinforced Framework for Event-Based Motion Segmentation [101.19290845597918]
This paper presents a Motion Estimation (ME) module and an Event Denoising (ED) module jointly optimized in a mutually reinforced manner.
Taking temporal correlation as guidance, the ED module calculates the confidence that each event belongs to real activity events, and transmits it to the ME module to update the energy function of motion segmentation for noise suppression.
arXiv Detail & Related papers (2022-03-22T13:40:26Z) - Asynchronous Optimisation for Event-based Visual Odometry [53.59879499700895]
Event cameras open up new possibilities for robotic perception due to their low latency and high dynamic range.
We focus on event-based visual odometry (VO).
We propose an asynchronous structure-from-motion optimisation back-end.
arXiv Detail & Related papers (2022-03-02T11:28:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.