Rotation-Constrained Cross-View Feature Fusion for Multi-View Appearance-based Gaze Estimation
- URL: http://arxiv.org/abs/2305.12704v3
- Date: Wed, 15 Nov 2023 09:02:23 GMT
- Title: Rotation-Constrained Cross-View Feature Fusion for Multi-View Appearance-based Gaze Estimation
- Authors: Yoichiro Hisadome, Tianyi Wu, Jiawei Qin, Yusuke Sugano
- Abstract summary: This work proposes a generalizable multi-view gaze estimation task and a cross-view feature fusion method to address this issue.
In addition to paired images, our method takes the relative rotation matrix between two cameras as additional input.
The proposed network learns to extract rotatable feature representation by using relative rotation as a constraint.
- Score: 16.43119580796718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Appearance-based gaze estimation has been actively studied in recent years.
However, its generalization performance for unseen head poses is still a
significant limitation for existing methods. This work proposes a generalizable
multi-view gaze estimation task and a cross-view feature fusion method to
address this issue. In addition to paired images, our method takes the relative
rotation matrix between two cameras as additional input. The proposed network
learns to extract rotatable feature representation by using relative rotation
as a constraint and adaptively fuses the rotatable features via stacked fusion
modules. This simple yet efficient approach significantly improves
generalization performance under unseen head poses without significantly
increasing computational cost. The model can be trained with random
combinations of cameras without fixing the positioning and can generalize to
unseen camera pairs during inference. Through experiments using multiple
datasets, we demonstrate the advantage of the proposed method over baseline
methods, including state-of-the-art domain generalization approaches. The code
will be available at https://github.com/ut-vision/Rot-MVGaze.
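The core idea in the abstract, that features can be made "rotatable" so that the relative rotation matrix between two cameras maps one view's features into the other's frame, can be sketched as follows. This is a minimal toy illustration of the rotation constraint, not the authors' implementation; the feature shape (K vectors in 3D) and the function name are assumptions.

```python
import numpy as np

def rotate_features(feats, R):
    """Apply a 3x3 rotation R to a 'rotatable' feature map of shape (K, 3):
    each of the K feature vectors lives in 3D, so a camera rotation acts
    on it directly via v' = R v (batched as feats @ R.T)."""
    return feats @ R.T

# Toy check of the constraint: features expressed in camera 1's frame,
# mapped to camera 2 with the relative rotation R_12, should come back
# unchanged when mapped with the inverse rotation R_12.T.
rng = np.random.default_rng(0)
feats_view1 = rng.standard_normal((8, 3))        # features in camera-1 frame
R_12 = np.array([[0., -1., 0.],                  # relative rotation cam1 -> cam2
                 [1.,  0., 0.],                  # (90-degree turn about z)
                 [0.,  0., 1.]])
feats_view2 = rotate_features(feats_view1, R_12)   # same features in camera-2 frame
recovered = rotate_features(feats_view2, R_12.T)   # inverse rotation maps them back
assert np.allclose(recovered, feats_view1)
```

In the paper's setting this consistency would be enforced as a training constraint on learned features, with the aligned features then combined by the stacked fusion modules; the sketch above only shows the geometric relationship the constraint relies on.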
Related papers
- UFORecon: Generalizable Sparse-View Surface Reconstruction from Arbitrary and UnFavOrable Sets [20.767590006724117]
We introduce and validate a view-combination score to indicate the effectiveness of the input view combination.
To achieve this, we apply cross-view matching transformers to model interactions between source images and build correlation frustums.
Our proposed framework significantly outperforms previous methods in terms of view-combination generalizability.
arXiv Detail & Related papers (2024-03-08T06:27:13Z)
- PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment [21.98302129015761]
We propose to formulate the Structure from Motion (SfM) problem inside a probabilistic diffusion framework.
We show that our method PoseDiffusion significantly improves over the classic SfM pipelines.
It is observed that our method can generalize across datasets without further training.
arXiv Detail & Related papers (2023-06-27T17:59:07Z)
- Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present Adaptive Rotated Convolution (ARC) module to handle rotated object detection problem.
In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images.
The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
- RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild [73.1276968007689]
We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object.
We show that our approach outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories.
arXiv Detail & Related papers (2022-08-11T17:59:59Z)
- Unsupervised Image Fusion Method based on Feature Mutual Mapping [16.64607158983448]
We propose an unsupervised adaptive image fusion method to address the above issues.
We construct a global map to measure the connections of pixels between the input source images.
Our method achieves superior performance in both visual perception and objective evaluation.
arXiv Detail & Related papers (2022-01-25T07:50:14Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild [77.43884383743872]
We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
arXiv Detail & Related papers (2020-10-26T03:19:46Z)
- Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints [80.60538408386016]
Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry.
We propose an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection.
arXiv Detail & Related papers (2020-07-29T21:41:31Z)
- Object-Centric Multi-View Aggregation [86.94544275235454]
We present an approach for aggregating a sparse set of views of an object in order to compute a semi-implicit 3D representation in the form of a volumetric feature grid.
Key to our approach is an object-centric canonical 3D coordinate system into which views can be lifted, without explicit camera pose estimation.
We show that computing a symmetry-aware mapping from pixels to the canonical coordinate system allows us to better propagate information to unseen regions.
arXiv Detail & Related papers (2020-07-20T17:38:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.