Video Salient Object Detection via Adaptive Local-Global Refinement
- URL: http://arxiv.org/abs/2104.14360v1
- Date: Thu, 29 Apr 2021 14:14:11 GMT
- Title: Video Salient Object Detection via Adaptive Local-Global Refinement
- Authors: Yi Tang and Yuanman Li and Guoliang Xing
- Abstract summary: Video salient object detection (VSOD) is an important task in many vision applications.
We propose an adaptive local-global refinement framework for VSOD.
We show that our weighting methodology can further exploit the feature correlations, thus driving the network to learn more discriminative feature representations.
- Score: 7.723369608197167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video salient object detection (VSOD) is an important task in many vision
applications. Reliable VSOD requires simultaneously exploiting the information
from both the spatial domain and the temporal domain. Most of the existing
algorithms merely utilize simple fusion strategies, such as addition and
concatenation, to merge the information from different domains. Despite their
simplicity, such fusion strategies may introduce feature redundancy, and also
fail to fully exploit the relationship between multi-level features extracted
from both spatial and temporal domains. In this paper, we present an adaptive
local-global refinement framework for VSOD. Unlike previous approaches,
we propose a local refinement architecture and a global one to refine the
simply fused features at different scopes, which can fully explore the local
dependence and the global dependence of multi-level features. In addition, to
emphasize effective information and suppress useless information, an adaptive
weighting mechanism is designed based on a graph convolutional neural network
(GCN). We show that our weighting methodology can further exploit the feature
correlations, thus driving the network to learn more discriminative feature
representations. Extensive experimental results on public video datasets
demonstrate the superiority of our method over the existing ones.
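
To make the ideas in the abstract concrete, here is a minimal sketch of the two simple fusion strategies it mentions (addition and concatenation), followed by a graph-based adaptive weighting step. This is not the authors' implementation. Everything here is an assumption made for illustration: each multi-level feature is reduced to a short vector, the graph is built from clamped cosine similarity, the adjacency is row-normalized rather than symmetrically normalized, and the function names and the projection vector `proj` are hypothetical.

```python
import math


def fuse_add(spatial, temporal):
    """Simple fusion by element-wise addition (vectors must share a length)."""
    return [s + t for s, t in zip(spatial, temporal)]


def fuse_concat(spatial, temporal):
    """Simple fusion by concatenation along the feature axis."""
    return list(spatial) + list(temporal)


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def normalized_adjacency(nodes):
    """Row-normalized affinity matrix with self-loops.

    Assumption: edges are non-negative cosine similarities between
    feature vectors; the paper may build its graph differently.
    """
    n = len(nodes)
    adj = [[max(cosine(nodes[i], nodes[j]), 0.0) for j in range(n)]
           for i in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop keeps every row sum >= 1
    return [[a / sum(row) for a in row] for row in adj]


def graph_weights(nodes, proj):
    """One graph-convolution step plus a sigmoid gate: w = sigmoid((A X) p)."""
    adj = normalized_adjacency(nodes)
    dim = len(nodes[0])
    weights = []
    for row in adj:
        # Aggregate neighbor features according to the adjacency row.
        agg = [sum(row[j] * nodes[j][k] for j in range(len(nodes)))
               for k in range(dim)]
        score = sum(a * p for a, p in zip(agg, proj))
        weights.append(1.0 / (1.0 + math.exp(-score)))
    return weights


def reweight(nodes, weights):
    """Scale each feature vector by its adaptive weight."""
    return [[w * x for x in node] for node, w in zip(nodes, weights)]


# Toy data: two feature levels, each with a spatial and a temporal vector.
spatial = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0]]
temporal = [[0.0, 1.0, 1.0], [1.0, 0.5, 0.5]]

fused = [fuse_add(s, t) for s, t in zip(spatial, temporal)]
proj = [0.1, -0.2, 0.3]  # hypothetical stand-in for a learned projection
weights = graph_weights(fused, proj)
refined = reweight(fused, weights)
```

In this sketch the weights lie strictly between 0 and 1, so a feature level that correlates poorly with the others can be attenuated instead of being passed through unchanged, which is the role the abstract assigns to adaptive weighting.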
Related papers
- Generalizable Deepfake Detection via Effective Local-Global Feature Extraction [5.221473306027505]
GANs and diffusion models have led to the generation of increasingly realistic fake images.
Deepfake detection has become a pressing issue in today's world.
We propose a novel method that effectively combines local and global features.
arXiv Detail & Related papers (2025-01-25T15:53:57Z)
- Object Style Diffusion for Generalized Object Detection in Urban Scene [69.04189353993907]
We introduce a novel single-domain object detection generalization method, named GoDiff.
By integrating pseudo-target domain data with source domain data, we diversify the training dataset.
Experimental results demonstrate that our method not only enhances the generalization ability of existing detectors but also functions as a plug-and-play enhancement for other single-domain generalization methods.
arXiv Detail & Related papers (2024-12-18T13:03:00Z)
- GLCONet: Learning Multi-source Perception Representation for Camouflaged Object Detection [23.872633359324098]
We propose a novel Global-Local Collaborative Optimization Network, called GLCONet.
In this paper, we first design a collaborative optimization strategy to simultaneously model the local details and global long-range relationships.
Experiments demonstrate that the proposed GLCONet method with different backbones can effectively activate potentially significant pixels in an image.
arXiv Detail & Related papers (2024-09-15T02:26:17Z)
- Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework.
Through inter-agent communication, smileGeo integrates the inherent knowledge of these agents with additional retrieved information.
Results show that our approach significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2024-08-21T03:31:30Z)
- CLIP the Gap: A Single Domain Generalization Approach for Object Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain.
We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts.
We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- Relation Matters: Foreground-aware Graph-based Relational Reasoning for Domain Adaptive Object Detection [81.07378219410182]
We propose a new and general framework for domain adaptive object detection, named Foreground-aware Graph-based Relational Reasoning (FGRR).
FGRR incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations.
Empirical results demonstrate that the proposed FGRR exceeds the state-of-the-art on four domain adaptive object detection benchmarks.
arXiv Detail & Related papers (2022-06-06T05:12:48Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
- Hybrid Multiple Attention Network for Semantic Segmentation in Aerial Images [24.35779077001839]
We propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet) to adaptively capture global correlations.
We introduce a simple yet effective region shuffle attention (RSA) module to reduce feature redundancy and improve the efficiency of the self-attention mechanism.
arXiv Detail & Related papers (2020-01-09T07:47:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.