Green Video Camouflaged Object Detection
- URL: http://arxiv.org/abs/2501.10914v1
- Date: Sun, 19 Jan 2025 01:42:00 GMT
- Title: Green Video Camouflaged Object Detection
- Authors: Xinyu Wang, Hong-Shuo Chen, Zhiruo Zhou, Suya You, Azad M. Madni, C.-C. Jay Kuo
- Abstract summary: We propose a green VCOD method named GreenVCOD to handle temporal information.
Built upon a green ICOD method, GreenVCOD uses long- and short-term temporal neighborhoods to capture joint spatial/temporal context information.
Experimental results show that GreenVCOD offers competitive performance compared to state-of-the-art VCOD benchmarks.
- Abstract: Camouflaged object detection (COD) aims to distinguish hidden objects embedded in an environment highly similar to the object. Conventional video-based COD (VCOD) methods explicitly extract motion cues or employ complex deep learning networks to handle the temporal information, which is limited by high complexity and unstable performance. In this work, we propose a green VCOD method named GreenVCOD. Built upon a green ICOD method, GreenVCOD uses long- and short-term temporal neighborhoods (TN) to capture joint spatial/temporal context information for decision refinement. Experimental results show that GreenVCOD offers competitive performance compared to state-of-the-art VCOD benchmarks.
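The abstract does not spell out how the temporal-neighborhood (TN) refinement is computed. As a rough intuition only, a minimal sketch of the idea, assuming per-frame soft predictions are simply averaged over short- and long-term windows (the function name, window sizes, and equal-weight blending are illustrative assumptions, not the paper's actual design):

```python
import numpy as np

def refine_with_temporal_neighborhood(frame_probs, short_k=2, long_k=8):
    """Refine per-frame camouflage probability maps by averaging over
    short- and long-term temporal neighborhoods around each frame.

    frame_probs: array of shape (T, H, W) with per-pixel soft predictions.
    """
    T = frame_probs.shape[0]
    refined = np.empty_like(frame_probs)
    for t in range(T):
        # Short-term neighborhood: immediately adjacent frames.
        s0, s1 = max(0, t - short_k), min(T, t + short_k + 1)
        short_ctx = frame_probs[s0:s1].mean(axis=0)
        # Long-term neighborhood: a wider window for stable context.
        l0, l1 = max(0, t - long_k), min(T, t + long_k + 1)
        long_ctx = frame_probs[l0:l1].mean(axis=0)
        # Blend the current frame's prediction with both contexts.
        refined[t] = (frame_probs[t] + short_ctx + long_ctx) / 3.0
    return refined
```

The appeal of such training-free aggregation is that it adds temporal context at negligible cost, in line with the "green" (low-complexity) goal.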
Related papers
- MSVCOD: A Large-Scale Multi-Scene Dataset for Video Camouflage Object Detection [23.59587900985667]
Video Camouflaged Object Detection (VCOD) is a challenging task which aims to identify objects that are seamlessly concealed within the background in videos.
We construct a new large-scale multi-domain VCOD dataset MSVCOD.
Our MSVCOD is the largest VCOD dataset to date, introducing multiple object categories including human, animal, medical, and vehicle objects for the first time.
Our framework achieves state-of-the-art results on the existing VCOD animal dataset and the proposed MSVCOD.
arXiv Detail & Related papers (2025-02-19T16:27:23Z) - Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation [49.113131249753714]
We propose an efficient algorithm, termed MTNet, which concurrently exploits motion and temporal cues.
MTNet is devised by effectively merging appearance and motion features during the feature extraction process within encoders.
We employ a cascade of decoders across all feature levels to optimally exploit the derived features.
arXiv Detail & Related papers (2025-01-14T03:15:46Z) - Detecting Every Object from Events [24.58024539462497]
We propose Detecting Every Object in Events (DEOE), an approach tailored for achieving high-speed, class-agnostic open-world object detection in event-based vision.
Our code is available at https://github.com/Hatins/DEOE.
arXiv Detail & Related papers (2024-04-08T08:20:53Z) - Explicit Motion Handling and Interactive Prompting for Video Camouflaged Object Detection [23.059829327898818]
Existing video camouflaged object detection approaches take noisy motion estimation as input or model motion implicitly.
We propose a novel Explicit Motion handling and Interactive Prompting framework for VCOD, dubbed EMIP, which handles motion cues explicitly.
EMIP is characterized by a two-stream architecture for simultaneously conducting camouflaged segmentation and optical flow estimation.
arXiv Detail & Related papers (2024-03-04T12:11:07Z) - Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain.
Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage.
Compared with the currently existing models, our proposed method achieves competitive performance in three popular benchmark datasets.
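The paper's frequency perception mechanism is learnable; as a simpler stand-in for the underlying intuition, a fixed split of an image into low- and high-frequency components via a radial mask in the 2D Fourier domain (the `cutoff` parameter and the hard radial mask are illustrative assumptions, not the proposed method):

```python
import numpy as np

def split_frequency_bands(image, cutoff=0.1):
    """Split a grayscale image into low- and high-frequency components
    using a radial mask in the shifted 2D Fourier spectrum.

    cutoff: normalized radius below which frequencies count as "low".
    """
    H, W = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.mgrid[:H, :W]
    cy, cx = H / 2.0, W / 2.0
    radius = np.sqrt(((yy - cy) / H) ** 2 + ((xx - cx) / W) ** 2)
    low_mask = radius <= cutoff
    # The two bands partition the spectrum, so they sum back to the image.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high
```

The low band carries coarse layout (useful for coarse localization), while the high band carries edges and texture (useful for detail-preserving refinement), which mirrors the two-stage coarse-then-fine design described above.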
arXiv Detail & Related papers (2023-08-17T11:30:46Z) - Learning Video Salient Object Detection Progressively from Unlabeled Videos [8.224670666756193]
We propose a novel VSOD method via a progressive framework that locates and segments salient objects in sequence without utilizing any video annotation.
Specifically, an algorithm for generating spatiotemporal location labels, which consists of generating high-saliency location labels and tracking salient objects in adjacent frames, is proposed.
Although our method does not require labeled video at all, the experimental results on five public benchmarks of DAVIS, FBMS, ViSal, VOS, and DAVSOD demonstrate that our proposed method is competitive with fully supervised methods and outperforms the state-of-the-art weakly and unsupervised methods.
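The label-generation step above (threshold high-saliency regions, then track them across adjacent frames) can be sketched crudely as follows; the thresholds and the IoU-based consistency check are illustrative assumptions, not the paper's actual algorithm:

```python
import numpy as np

def track_salient_labels(saliency_maps, high_thresh=0.8, iou_keep=0.3):
    """Generate pseudo location labels from per-frame saliency maps:
    threshold each frame, then keep a frame's label only if it overlaps
    the previous frame's label enough (a crude stand-in for tracking
    salient objects across adjacent frames)."""
    labels = []
    prev = None
    for sal in saliency_maps:
        mask = sal >= high_thresh
        if prev is not None:
            inter = np.logical_and(mask, prev).sum()
            union = np.logical_or(mask, prev).sum()
            iou = inter / union if union else 0.0
            if iou < iou_keep:
                # Temporally inconsistent detection: reuse the tracked label.
                mask = prev.copy()
        labels.append(mask)
        prev = mask
    return labels
```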
arXiv Detail & Related papers (2022-04-05T06:12:45Z) - Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z) - Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z) - Confidence-guided Adaptive Gate and Dual Differential Enhancement for Video Salient Object Detection [47.68968739917077]
Video salient object detection (VSOD) aims to locate and segment the most attractive object by exploiting both spatial cues and temporal cues hidden in video sequences.
We propose a new framework to adaptively capture available information from spatial and temporal cues, which contains Confidence-guided Adaptive Gate (CAG) modules and Dual Differential Enhancement (DDE) modules.
arXiv Detail & Related papers (2021-05-14T08:49:37Z) - DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.