Onfocus Detection: Identifying Individual-Camera Eye Contact from
Unconstrained Images
- URL: http://arxiv.org/abs/2103.15307v1
- Date: Mon, 29 Mar 2021 03:29:09 GMT
- Title: Onfocus Detection: Identifying Individual-Camera Eye Contact from
Unconstrained Images
- Authors: Dingwen Zhang, Bo Wang, Gerong Wang, Qiang Zhang, Jiajia Zhang,
Jungong Han, Zheng You
- Abstract summary: Onfocus detection aims at identifying whether the focus of the individual captured by a camera is on the camera or not.
We build a large-scale onfocus detection dataset, the OnFocus Detection In the Wild (OFDIW) dataset.
We propose a novel end-to-end deep model, the eye-context interaction inferring network (ECIIN), for onfocus detection.
- Score: 81.64699115587167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Onfocus detection aims at identifying whether the focus of the individual
captured by a camera is on the camera or not. According to behavioral research,
the focus of an individual during face-to-camera communication produces a
special type of eye contact, i.e., individual-camera eye contact, which is
a powerful signal in social communication and plays a crucial role in
recognizing irregular individual status (e.g., lying or suffering from mental
illness) and special purposes (e.g., seeking help or attracting fans). Thus,
developing effective onfocus detection algorithms is of significance for
assisting criminal investigation, disease discovery, and social behavior
analysis. However, a review of the literature shows that very few efforts
have been made toward developing onfocus detectors, due to the lack of
large-scale publicly available datasets as well as the challenging nature of this
task. To this end, this paper engages in onfocus detection research by
addressing both issues. First, we build a large-scale onfocus
detection dataset, named OnFocus Detection In the Wild (OFDIW). It
consists of 20,623 images captured under unconstrained conditions (hence "in
the wild") and contains individuals with diverse emotions, ages, and facial
characteristics, as well as rich interactions with surrounding objects and background
scenes. On top of that, we propose a novel end-to-end deep model, the
eye-context interaction inferring network (ECIIN), for onfocus detection, which
explores eye-context interaction via dynamic capsule routing. Finally,
comprehensive experiments are conducted on the proposed OFDIW dataset to
benchmark existing learning models and demonstrate the effectiveness of the
proposed ECIIN. The project (containing both the dataset and code) is at
https://github.com/wintercho/focus.
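The abstract says ECIIN explores eye-context interaction via dynamic capsule routing but gives no implementation details. As background, here is a minimal NumPy sketch of the generic routing-by-agreement procedure from the capsule-network literature (Sabour et al.), not the authors' actual ECIIN code; the function names, shapes, and iteration count are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squashing non-linearity: keeps the vector's direction while
    # mapping its norm into [0, 1), so the length can act as a probability.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Generic routing-by-agreement between two capsule layers (illustrative).

    u_hat: prediction ("vote") vectors of shape (num_in, num_out, dim_out),
           one vote per (input capsule, output capsule) pair.
    Returns output capsule vectors of shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, updated each iteration
    for _ in range(num_iters):
        # Coupling coefficients: softmax over output capsules for each input.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # Weighted sum of votes for each output capsule.
        s = np.einsum("ij,ijk->jk", c, u_hat)
        v = squash(s)
        # Increase logits where votes agree with the current output.
        b += np.einsum("ijk,jk->ij", u_hat, v)
    return v

# Example: 8 lower-level capsules voting for 4 higher-level 16-d capsules.
rng = np.random.default_rng(0)
votes = rng.normal(size=(8, 4, 16))
out = dynamic_routing(votes)
print(out.shape)  # (4, 16)
```

In an eye-context model along the lines the abstract describes, the input capsules would plausibly encode eye-region and scene-context features, with routing iteratively settling which context cues agree about the focus state; that mapping is an assumption here, not something the abstract specifies.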
Related papers
- A Review of Human-Object Interaction Detection [6.1941885271010175]
  Human-object interaction (HOI) detection plays a key role in high-level visual understanding.
  This paper systematically summarizes and discusses recent work in image-based HOI detection.
  arXiv Detail & Related papers (2024-08-20T08:32:39Z)
- Exploring Predicate Visual Context in Detecting Human-Object Interactions [44.937383506126274]
  We study how best to re-introduce image features via cross-attention.
  Our model with enhanced predicate visual context (PViC) outperforms state-of-the-art methods on the HICO-DET and V-COCO benchmarks.
  arXiv Detail & Related papers (2023-08-11T15:57:45Z)
- Self-supervised Interest Point Detection and Description for Fisheye and Perspective Images [7.451395029642832]
  Keypoint detection and matching is a fundamental task in many computer vision problems.
  In this work, we focus on the case when this is caused by the geometry of the cameras used for image acquisition.
  We build on a state-of-the-art approach and derive a self-supervised procedure that enables training an interest point detector and descriptor network.
  arXiv Detail & Related papers (2023-06-02T22:39:33Z)
- BI-AVAN: Brain-Inspired Adversarial Visual Attention Network [67.05560966998559]
  We propose a brain-inspired adversarial visual attention network (BI-AVAN) to characterize human visual attention directly from functional brain activity.
  Our model imitates the biased competition between attended and neglected objects to identify and locate, in an unsupervised manner, the visual objects in a movie frame that the human brain focuses on.
  arXiv Detail & Related papers (2022-10-27T22:20:36Z)
- Do Pedestrians Pay Attention? Eye Contact Detection in the Wild [75.54077277681353]
  In urban environments, humans rely on eye contact for fast and efficient communication with nearby people.
  In this paper, we focus on eye contact detection in the wild, i.e., real-world scenarios for autonomous vehicles with no control over the environment or the distance of pedestrians.
  We introduce a model that leverages semantic keypoints to detect eye contact and show that this high-level representation achieves state-of-the-art results on the publicly available JAAD dataset.
  To study domain adaptation, we create LOOK: a large-scale dataset for eye contact detection in the wild, which focuses on diverse and un
  arXiv Detail & Related papers (2021-12-08T10:21:28Z)
- One-Shot Object Affordance Detection in the Wild [76.46484684007706]
  Affordance detection refers to identifying the potential action possibilities of objects in an image.
  We devise a One-Shot Affordance Detection Network (OSAD-Net) that estimates the human action purpose and then transfers it to help detect the common affordance from all candidate images.
  With complex scenes and rich annotations, our PADv2 dataset can be used as a test bed to benchmark affordance detection methods.
  arXiv Detail & Related papers (2021-08-08T14:53:10Z)
- Defocus Blur Detection via Salient Region Detection Prior [11.5253648614748]
  Defocus blur detection aims to separate the out-of-focus and depth-of-field areas in photos.
  We propose a novel network for defocus blur detection.
  arXiv Detail & Related papers (2020-11-19T05:56:11Z)
- Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
  We propose a new framework called Ventral-Dorsal Networks (VDNets).
  Inspired by the structure of the human visual system, we integrate a "Ventral Network" and a "Dorsal Network".
  Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
  arXiv Detail & Related papers (2020-05-15T23:57:36Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
  We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
  Our network predicts interaction points, which directly localize and classify the interaction.
  Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
  arXiv Detail & Related papers (2020-03-31T08:42:06Z)