Spatiotemporal Pyramidal CNN with Depth-Wise Separable Convolution for
Eye Blinking Detection in the Wild
- URL: http://arxiv.org/abs/2306.11287v1
- Date: Tue, 20 Jun 2023 04:59:09 GMT
- Title: Spatiotemporal Pyramidal CNN with Depth-Wise Separable Convolution for
Eye Blinking Detection in the Wild
- Authors: Lan Anh Thi Nguy, Bach Nguyen Gia, Thanh Tu Thi Nguyen, Kamioka Eiji,
and Tan Xuan Phan
- Abstract summary: Eye blinking detection plays an essential role in deception detection, driving fatigue detection, etc.
Two problems are addressed: how the eye blinking detection model can learn efficiently from different resolutions of eye pictures in diverse conditions; and how to reduce the size of the detection model for faster inference time.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Eye blinking detection in the wild plays an essential role in deception
detection, driving fatigue detection, etc. Despite the fact that numerous
attempts have already been made, the majority of them have encountered
difficulties, such as the derived eye images having different resolutions as
the distance between the face and the camera changes; or the requirement of a
lightweight detection model to obtain a short inference time in order to
perform in real-time. In this research, two problems are addressed: how the eye
blinking detection model can learn efficiently from different resolutions of
eye pictures in diverse conditions; and how to reduce the size of the detection
model for faster inference time. We propose upsampling and downsampling
the input eye images to the same resolution as one potential solution for
the first problem, and then determine which interpolation method yields
the highest performance of the detection model. For the second
problem, although a recent spatiotemporal convolutional neural network used for
eye blinking detection has a strong capacity to extract both spatial and
temporal characteristics, it still has a large number of network
parameters, leading to high inference time. Therefore, using Depth-wise
Separable Convolution rather than conventional convolution layers inside each
branch is considered in this paper as a feasible solution.
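
To illustrate why the abstract expects Depth-wise Separable Convolution to shrink the model, here is a minimal sketch that counts the weights in a conventional convolution and in its depth-wise separable counterpart. The kernel size and channel counts are hypothetical, chosen only for illustration; the summary above does not state the paper's actual layer configuration, and the function names are not taken from the paper.

```python
from math import prod


def standard_conv_params(kernel, c_in, c_out, bias=False):
    """Weights in a conventional convolution: one full kernel per (input, output) channel pair."""
    return prod(kernel) * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(kernel, c_in, c_out, bias=False):
    """Depth-wise step (one kernel per input channel) followed by a 1x1 point-wise channel mix."""
    depthwise = prod(kernel) * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer (an assumption, not the paper's): 3x3x3 spatiotemporal kernel, 64 -> 128 channels.
kernel, c_in, c_out = (3, 3, 3), 64, 128
std = standard_conv_params(kernel, c_in, c_out)
dsc = depthwise_separable_params(kernel, c_in, c_out)
print(std, dsc, round(std / dsc, 1))  # 221184 9920 22.3
```

For this assumed layer the separable form needs roughly 22 times fewer weights. The saving grows with kernel volume and output channel count, which is consistent with the abstract's motivation of a smaller model and shorter inference time.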
Related papers
- Learning to Make Keypoints Sub-Pixel Accurate [80.55676599677824]
This work addresses the challenge of sub-pixel accuracy in detecting 2D local features.
We propose a novel network that enhances any detector with sub-pixel precision by learning an offset vector for detected features.
arXiv Detail & Related papers (2024-07-16T12:39:56Z)
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranks 1st.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
- Learning to search for and detect objects in foveal images using deep
learning [3.655021726150368]
This study employs a fixation prediction model that emulates human objective-guided attention of searching for a given class in an image.
The foveated pictures at each fixation point are then classified to determine whether the target is present or absent in the scene.
We present a novel dual task model capable of performing fixation prediction and detection simultaneously, allowing knowledge transfer between the two tasks.
arXiv Detail & Related papers (2023-04-12T09:50:25Z)
- Scene Change Detection Using Multiscale Cascade Residual Convolutional
Neural Networks [0.0]
Scene change detection is an image processing problem related to partitioning pixels of a digital image into foreground and background regions.
In this work, we propose a novel Multiscale Residual Processing Module, with a Convolutional Neural Network that integrates a Residual Processing Module.
Experiments conducted on two different datasets support the overall effectiveness of the proposed approach, achieving an average overall effectiveness of $\boldsymbol{0.9622}$ and $\boldsymbol{0.9664}$ over Change Detection 2014 and PetrobrasROUTES datasets respectively.
arXiv Detail & Related papers (2022-12-20T16:48:51Z)
- Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
- Dual UNet: A Novel Siamese Network for Change Detection with Cascade
Differential Fusion [4.651756476458979]
We propose a novel Siamese neural network for change detection task, namely Dual-UNet.
In contrast to previous methods that encode the bitemporal images individually, we design an encoder differential-attention module to focus on the spatial difference relationships of pixels.
Experiments demonstrate that the proposed approach consistently outperforms the most advanced methods on popular seasonal change detection datasets.
arXiv Detail & Related papers (2022-08-12T14:24:09Z)
- LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution
Homography Estimation [52.63874513999119]
Cross-resolution image alignment is a key problem in multiscale giga photography.
Existing deep homography methods neglect the explicit formulation of correspondences between them, which leads to degraded accuracy in cross-resolution challenges.
We propose a local transformer network embedded within a multiscale structure to explicitly learn correspondences between the multimodal inputs.
arXiv Detail & Related papers (2021-06-08T02:51:45Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
- Robust Data Hiding Using Inverse Gradient Attention [82.73143630466629]
In the data hiding task, each pixel of cover images should be treated differently since they have divergent tolerabilities.
We propose a novel deep data hiding scheme with Inverse Gradient Attention (IGA), combining the ideas of adversarial learning and attention mechanism.
Empirically, extensive experiments show that the proposed model outperforms the state-of-the-art methods on two prevalent datasets.
arXiv Detail & Related papers (2020-11-21T19:08:23Z)
- Multiscale Detection of Cancerous Tissue in High Resolution Slide Scans [0.0]
We present an algorithm for multi-scale tumor (chimeric cell) detection in high resolution slide scans.
Our approach modifies the effective receptive field at different layers in a CNN so that objects with a broad range of varying scales can be detected in a single forward pass.
arXiv Detail & Related papers (2020-10-01T18:56:46Z)
- Real Time Multi-Class Object Detection and Recognition Using Vision
Augmentation Algorithm [0.0]
We introduce a novel real time detection algorithm which employs upsampling and skip connection to extract multiscale features at different convolution levels in a learning task.
The model is shown to achieve higher detection precision and faster speed than the state-of-the-art models.
arXiv Detail & Related papers (2020-03-17T01:08:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.