Improving Object Detection for Time-Lapse Imagery Using Temporal Features in Wildlife Monitoring
- URL: http://arxiv.org/abs/2412.16329v1
- Date: Fri, 20 Dec 2024 20:37:09 GMT
- Title: Improving Object Detection for Time-Lapse Imagery Using Temporal Features in Wildlife Monitoring
- Authors: Marcus Jenkins, Kirsty A. Franklin, Malcolm A. C. Nicoll, Nik C. Cole, Kevin Ruhomaun, Vikash Tatayah, Michal Mackiewicz
- Abstract summary: We show that the performance of an object detector in a single frame of a time-lapse sequence can be improved by including spatio-temporal features from the prior frames.
We propose a method that leverages temporal information by integrating two additional spatial feature channels which capture stationary and non-stationary elements of the scene.
- Score: 0.5580662655439501
- License:
- Abstract: Monitoring animal populations is crucial for assessing the health of ecosystems. Traditional methods, which require extensive fieldwork, are increasingly being supplemented by time-lapse camera-trap imagery combined with an automatic analysis of the image data. The latter usually involves some object detector aimed at detecting relevant targets (commonly animals) in each image, followed by some postprocessing to gather activity and population data. In this paper, we show that the performance of an object detector in a single frame of a time-lapse sequence can be improved by including spatio-temporal features from the prior frames. We propose a method that leverages temporal information by integrating two additional spatial feature channels which capture stationary and non-stationary elements of the scene and consequently improve scene understanding and reduce the number of stationary false positives. The proposed technique achieves a significant improvement of 24% in mean average precision (mAP@0.05:0.95) over the baseline (temporal feature-free, single frame) object detector on a large dataset of breeding tropical seabirds. We envisage our method will be widely applicable to other wildlife monitoring applications that use time-lapse imaging.
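The abstract does not spell out how the two extra channels are computed, so the following is only a rough illustration of the general idea, not the authors' implementation: a per-pixel temporal median over the prior frames stands in for the stationary channel, and the deviation of the current frame from that estimate for the non-stationary channel, with both concatenated onto the current RGB frame as a 5-channel detector input. The helper name `build_temporal_channels` and all shapes are assumptions made for illustration.

```python
import numpy as np

def build_temporal_channels(prior_frames, current_frame):
    """Build two extra feature channels from a time-lapse sequence.

    prior_frames: (T, H, W, 3) uint8 array of earlier frames
    current_frame: (H, W, 3) uint8 array of the frame to run detection on
    Returns an (H, W, 5) float32 array: current RGB + stationary + non-stationary.
    """
    prior_gray = prior_frames.astype(np.float32).mean(axis=-1)    # (T, H, W)
    current_gray = current_frame.astype(np.float32).mean(axis=-1)  # (H, W)

    # Stationary channel: per-pixel temporal median of the prior frames,
    # a simple estimate of the static background of the scene.
    stationary = np.median(prior_gray, axis=0)

    # Non-stationary channel: deviation of the current frame from that
    # background estimate, highlighting things that moved or appeared.
    non_stationary = np.abs(current_gray - stationary)

    # Normalise to [0, 1] and stack into a 5-channel input.
    rgb = current_frame.astype(np.float32) / 255.0
    stationary = stationary / 255.0
    non_stationary = non_stationary / 255.0
    return np.concatenate(
        [rgb, stationary[..., None], non_stationary[..., None]], axis=-1
    )

# Example: 10 prior frames and one current frame of a 640x640 scene.
prior = np.random.randint(0, 256, (10, 640, 640, 3), dtype=np.uint8)
current = np.random.randint(0, 256, (640, 640, 3), dtype=np.uint8)
x = build_temporal_channels(prior, current)
print(x.shape)  # (640, 640, 5)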
Related papers
- Enhancing Lidar-based Object Detection in Adverse Weather using Offset Sequences in Time [1.1725016312484975]
Lidar-based object detection is significantly affected by adverse weather conditions such as rain and fog.
Our research provides a comprehensive study of effective methods for mitigating the effects of adverse weather on the reliability of lidar-based object detection.
arXiv Detail & Related papers (2024-01-17T08:31:58Z)
- Removing Human Bottlenecks in Bird Classification Using Camera Trap Images and Deep Learning [0.14746127876003345]
Monitoring bird populations is essential for ecologists.
Technology such as camera traps, acoustic monitors and drones provide methods for non-invasive monitoring.
There are two main problems with using camera traps for monitoring, one being that cameras generate many images, making it difficult to process and analyse the data in a timely manner.
In this paper, we outline an approach for overcoming these issues by utilising deep learning for real-time classification of bird species.
arXiv Detail & Related papers (2023-05-03T13:04:39Z)
- TempNet: Temporal Attention Towards the Detection of Animal Behaviour in Videos [63.85815474157357]
We propose an efficient computer vision- and deep learning-based method for the detection of biological behaviours in videos.
TempNet uses an encoder bridge and residual blocks to maintain model performance with a two-stage, spatial-then-temporal encoder.
We demonstrate its application to the detection of sablefish (Anoplopoma fimbria) startle events.
arXiv Detail & Related papers (2022-11-17T23:55:12Z)
- Temporal Flow Mask Attention for Open-Set Long-Tailed Recognition of Wild Animals in Camera-Trap Images [21.473296246163443]
We propose the Temporal Flow Mask Attention Network to tackle the open-set long-tailed recognition problem.
We extract temporal features of sequential frames using the optical flow module (see the sketch after this entry) and learn informative representations using attention residual blocks.
We show that applying the meta-embedding technique boosts the performance of the method in open-set long-tailed recognition.
arXiv Detail & Related papers (2022-08-31T04:15:17Z)
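As a companion to the Temporal Flow Mask Attention entry above, the sketch below shows one common way an optical flow module can turn a pair of sequential camera-trap frames into a temporal feature: OpenCV's Farneback dense flow, reduced to a per-pixel motion magnitude. This is a generic illustration under that assumption, not the paper's implementation, and `flow_features` is a hypothetical helper.

```python
import cv2
import numpy as np

def flow_features(prev_frame, curr_frame):
    """Dense optical flow between two consecutive camera-trap frames.

    Returns the per-pixel flow magnitude, a simple temporal feature
    (or motion mask) that highlights moving animals.
    """
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)

    # Farneback dense flow: one 2-D displacement vector per pixel.
    # Positional args: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags.
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, curr_gray, None, 0.5, 3, 15, 3, 5, 1.2, 0
    )
    magnitude, _angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return magnitude  # (H, W) float32

prev = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
curr = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
motion = flow_features(prev, curr)
print(motion.shape)  # (480, 640)
```

The resulting magnitude map can be thresholded into a motion mask or stacked with the image as an extra input channel.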
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE).
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time (see the sketch after this entry).
arXiv Detail & Related papers (2022-07-17T01:35:29Z)
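A minimal sketch of the two SatMAE ideas mentioned above, under assumed shapes and with random stand-ins for learned parameters: a per-timestep temporal embedding added to every patch token, and a mask drawn independently at each timestep rather than shared across the sequence. This is illustrative only, not the SatMAE codebase.

```python
import numpy as np

rng = np.random.default_rng(0)

T, N, D = 3, 196, 768          # timesteps, patches per image, embedding dim
mask_ratio = 0.75

# Patch tokens for a sequence of T co-registered satellite images.
tokens = rng.normal(size=(T, N, D)).astype(np.float32)

# Temporal embedding: one (here random, normally learned) vector per
# timestep, added to every patch token from that timestep.
temporal_embed = rng.normal(size=(T, 1, D)).astype(np.float32)
tokens = tokens + temporal_embed

# Independent masking across time: a separate random subset of patches is
# kept at each timestep instead of masking the same positions everywhere.
keep = int(N * (1 - mask_ratio))
visible = []
for t in range(T):
    idx = rng.permutation(N)[:keep]
    visible.append(tokens[t, idx])
visible = np.stack(visible)    # (T, keep, D) tokens passed to the MAE encoder

print(visible.shape)  # (3, 49, 768)
```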
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Content-Based Detection of Temporal Metadata Manipulation [91.34308819261905]
We propose an end-to-end approach to verify whether the purported time of capture of an image is consistent with its content and geographic location.
The central idea is the use of supervised consistency verification, in which we predict the probability that the image content, capture time, and geographical location are consistent.
Our approach improves upon previous work on a large benchmark dataset, increasing the classification accuracy from 59.03% to 81.07%.
arXiv Detail & Related papers (2021-03-08T13:16:19Z)
- Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras produce brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data for tasks such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
- Automatic Detection and Recognition of Individuals in Patterned Species [4.163860911052052]
We develop a framework for automatic detection and recognition of individuals in different patterned species.
We use the recently proposed Faster-RCNN object detection framework to efficiently detect animals in images.
We evaluate our recognition system on zebra and jaguar images to show generalization to other patterned species.
arXiv Detail & Related papers (2020-05-06T15:29:21Z)
- Deep Reinforcement Learning for Active Human Pose Estimation [35.229529080763925]
We introduce Pose-DRL, a fully trainable deep reinforcement learning-based active pose estimation architecture.
We show that our model learns to select viewpoints that yield significantly more accurate pose estimates compared to strong multi-view baselines.
arXiv Detail & Related papers (2020-01-07T13:35:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.