Temporal Flow Mask Attention for Open-Set Long-Tailed Recognition of
Wild Animals in Camera-Trap Images
- URL: http://arxiv.org/abs/2208.14625v1
- Date: Wed, 31 Aug 2022 04:15:17 GMT
- Title: Temporal Flow Mask Attention for Open-Set Long-Tailed Recognition of
Wild Animals in Camera-Trap Images
- Authors: Jeongsoo Kim, Sangmin Woo, Byeongjun Park, Changick Kim
- Abstract summary: We propose the Temporal Flow Mask Attention Network to tackle the open-set long-tailed recognition problem.
We extract temporal features of sequential frames using the optical flow module and learn informative representation using attention residual blocks.
We show that applying the meta-embedding technique boosts the performance of the method in open-set long-tailed recognition.
- Score: 21.473296246163443
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Camera traps, unmanned observation devices, and deep learning-based image
recognition systems have greatly reduced human effort in collecting and
analyzing wildlife images. However, data collected via the above apparatus exhibits
1) long-tailed and 2) open-ended distribution problems. To tackle the open-set
long-tailed recognition problem, we propose the Temporal Flow Mask Attention
Network that comprises three key building blocks: 1) an optical flow module, 2)
an attention residual module, and 3) a meta-embedding classifier. We extract
temporal features of sequential frames using the optical flow module and learn
informative representation using attention residual blocks. Moreover, we show
that applying the meta-embedding technique boosts the performance of the method
in open-set long-tailed recognition. We apply this method to a Korean
Demilitarized Zone (DMZ) dataset. We conduct extensive experiments, and
quantitative and qualitative analyses to prove that our method effectively
tackles the open-set long-tailed recognition problem while being robust to
unknown classes.
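The abstract names an optical flow module feeding an attention residual module but gives no formulas. The following is a minimal sketch of how a motion-derived mask could gate a feature map in a residual fashion. It rests on several assumptions that are not from the source: a plain frame difference stands in for a real optical flow estimator, the gating form `features * (1 + mask)` is borrowed from generic residual attention designs, and the function names `motion_mask` and `masked_residual_attention` are hypothetical.

```python
import numpy as np


def motion_mask(prev_frame, next_frame, eps=1e-8):
    """Crude stand-in for an optical flow module (assumption, not the
    paper's estimator): per-pixel absolute frame difference scaled to [0, 1]."""
    diff = np.abs(next_frame.astype(np.float64) - prev_frame.astype(np.float64))
    return diff / (diff.max() + eps)


def masked_residual_attention(features, mask):
    """Residual attention gating (assumed form): features * (1 + mask).
    Static regions pass through unchanged; moving regions are amplified
    by up to a factor of two."""
    return features * (1.0 + mask)


# Hypothetical 4x4 grayscale frames: one bright pixel moves right by one.
prev_frame = np.zeros((4, 4))
next_frame = np.zeros((4, 4))
prev_frame[1, 1] = 1.0
next_frame[1, 2] = 1.0

mask = motion_mask(prev_frame, next_frame)
features = np.ones((4, 4))  # placeholder feature map, not a real CNN output
out = masked_residual_attention(features, mask)
```

The residual form keeps the original features wherever the mask is zero, so a noisy or empty motion mask cannot erase the static appearance cues that a classifier still needs.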
Related papers
- Improving Object Detection for Time-Lapse Imagery Using Temporal Features in Wildlife Monitoring [0.5580662655439501]
We show that the performance of an object detector in a single frame of a time-lapse sequence can be improved by including temporal features from the prior frames.
We propose a method that leverages temporal information by integrating two additional spatial feature channels which capture stationary and non-stationary elements of the scene.
arXiv Detail & Related papers (2024-12-20T20:37:09Z)
- Flow-Attention-based Spatio-Temporal Aggregation Network for 3D Mask Detection [12.160085404239446]
We propose a novel 3D mask detection framework called FASTEN.
We tailor the network to focus more on fine details in large movements, which can eliminate redundant spatio-temporal feature interference.
FASTEN requires only five input frames and outperforms eight competitors in both intra-dataset and cross-dataset evaluations.
arXiv Detail & Related papers (2023-10-25T11:54:21Z)
- Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks [64.67735676127208]
Text-to-image diffusion models have shown great potential for benefiting image recognition.
Although promising, there has been inadequate exploration dedicated to unsupervised learning on diffusion-generated images.
We introduce customized solutions by fully exploiting the aforementioned free attention masks.
arXiv Detail & Related papers (2023-08-13T10:07:46Z)
- Motion-inductive Self-supervised Object Discovery in Videos [99.35664705038728]
We propose a model for processing consecutive RGB frames, and infer the optical flow between any pair of frames using a layered representation.
We demonstrate superior performance over previous state-of-the-art methods on three public video segmentation datasets.
arXiv Detail & Related papers (2022-10-01T08:38:28Z)
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection [7.324459578044212]
Face presentation attack detection (PAD) is attracting a lot of attention and playing a key role in securing face recognition systems.
We propose a dual-stream convolutional neural network (CNN) framework to deal with unseen scenarios.
We validate the design of our proposed PAD solution in a step-wise ablation study.
arXiv Detail & Related papers (2021-09-16T13:06:43Z)
- Detection of Deepfake Videos Using Long Distance Attention [73.6659488380372]
Most existing detection methods treat the problem as a vanilla binary classification problem.
In this paper, the problem is treated as a special fine-grained classification problem since the differences between fake and real faces are very subtle.
A spatial-temporal model is proposed which has two components for capturing spatial and temporal forgery traces from a global perspective.
arXiv Detail & Related papers (2021-06-24T08:33:32Z)
- Self-Supervised Multi-Frame Monocular Scene Flow [61.588808225321735]
We introduce a multi-frame monocular scene flow network based on self-supervised learning.
We observe state-of-the-art accuracy among monocular scene flow methods based on self-supervised learning.
arXiv Detail & Related papers (2021-05-05T17:49:55Z)
- Automatic Detection and Recognition of Individuals in Patterned Species [4.163860911052052]
We develop a framework for automatic detection and recognition of individuals in different patterned species.
We use the recently proposed Faster-RCNN object detection framework to efficiently detect animals in images.
We evaluate our recognition system on zebra and jaguar images to show generalization to other patterned species.
arXiv Detail & Related papers (2020-05-06T15:29:21Z)