Self-supervised co-salient object detection via feature correspondence at multiple scales
- URL: http://arxiv.org/abs/2403.11107v3
- Date: Wed, 3 Jul 2024 06:10:52 GMT
- Title: Self-supervised co-salient object detection via feature correspondence at multiple scales
- Authors: Souradeep Chakraborty, Dimitris Samaras,
- Abstract summary: This paper introduces a novel two-stage self-supervised approach for detecting co-occurring salient objects (CoSOD) in image groups without requiring segmentation annotations.
We train a self-supervised network that detects co-salient regions by computing local patch-level feature correspondences across images.
In experiments on three CoSOD benchmark datasets, our model outperforms the corresponding state-of-the-art models by a huge margin.
- Score: 27.664016341526988
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Our paper introduces a novel two-stage self-supervised approach for detecting co-occurring salient objects (CoSOD) in image groups without requiring segmentation annotations. Unlike existing unsupervised methods that rely solely on patch-level information (e.g. clustering patch descriptors) or on computation heavy off-the-shelf components for CoSOD, our lightweight model leverages feature correspondences at both patch and region levels, significantly improving prediction performance. In the first stage, we train a self-supervised network that detects co-salient regions by computing local patch-level feature correspondences across images. We obtain the segmentation predictions using confidence-based adaptive thresholding. In the next stage, we refine these intermediate segmentations by eliminating the detected regions (within each image) whose averaged feature representations are dissimilar to the foreground feature representation averaged across all the cross-attention maps (from the previous stage). Extensive experiments on three CoSOD benchmark datasets show that our self-supervised model outperforms the corresponding state-of-the-art models by a huge margin (e.g. on the CoCA dataset, our model has a 13.7% F-measure gain over the SOTA unsupervised CoSOD model). Notably, our self-supervised model also outperforms several recent fully supervised CoSOD models on the three test datasets (e.g., on the CoCA dataset, our model has a 4.6% F-measure gain over a recent supervised CoSOD model).
Related papers
- Improving EO Foundation Models with Confidence Assessment for enhanced Semantic segmentation [0.0]
We develop a Confidence Assessment for enhanced Semantic segmentation (CAS) model.
It evaluates confidence at both the segment and pixel levels, providing both labels and confidence scores as output.
This work has significant applications, particularly in evaluating EO Foundation Models on semantic segmentation downstream tasks.
arXiv Detail & Related papers (2024-06-26T12:05:49Z) - Unsupervised and semi-supervised co-salient object detection via
segmentation frequency statistics [24.397029434125873]
We develop an unsupervised method to detect co-occurring salient objects in an image group.
For the first time, we show that a large unlabeled dataset e.g. ImageNet-1k can be effectively leveraged to improve unsupervised CoSOD performance.
To avoid propagating erroneous signals from predictions on unlabeled data, we propose a confidence estimation module.
arXiv Detail & Related papers (2023-11-11T19:47:16Z) - GEO-Bench: Toward Foundation Models for Earth Monitoring [139.77907168809085]
We propose a benchmark comprised of six classification and six segmentation tasks.
This benchmark will be a driver of progress across a variety of Earth monitoring tasks.
arXiv Detail & Related papers (2023-06-06T16:16:05Z) - K-means Clustering Based Feature Consistency Alignment for Label-free
Model Evaluation [12.295565506212844]
This paper presents our solutions for the 1st DataCV Challenge of the Visual Understanding dataset workshop at CVPR 2023.
Firstly, we propose a novel method called K-means Clustering Based Feature Consistency Alignment (KCFCA), which is tailored to handle the distribution shifts of various datasets.
Secondly, we develop a dynamic regression model to capture the relationship between the shifts in distribution and model accuracy.
Thirdly, we design an algorithm to discover the outlier model factors, eliminate the outlier models, and combine the strengths of multiple autoeval models.
arXiv Detail & Related papers (2023-04-17T06:33:30Z) - Cross-Modal Fine-Tuning: Align then Refine [83.37294254884446]
ORCA is a cross-modal fine-tuning framework that extends the applicability of a single large-scale pretrained model to diverse modalities.
We show that ORCA obtains state-of-the-art results on 3 benchmarks containing over 60 datasets from 12 modalities.
arXiv Detail & Related papers (2023-02-11T16:32:28Z) - Generalised Co-Salient Object Detection [50.876864826216924]
We propose a new setting that relaxes an assumption in the conventional Co-Salient Object Detection (CoSOD) setting.
We call this new setting Generalised Co-Salient Object Detection (GCoSOD)
We propose a novel random sampling based Generalised CoSOD Training (GCT) strategy to distill the awareness of inter-image absence of co-salient objects into CoSOD models.
arXiv Detail & Related papers (2022-08-20T12:23:32Z) - Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D
Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z) - Guided Point Contrastive Learning for Semi-supervised Point Cloud
Semantic Segmentation [90.2445084743881]
We present a method for semi-supervised point cloud semantic segmentation to adopt unlabeled point clouds in training to boost the model performance.
Inspired by the recent contrastive loss in self-supervised tasks, we propose the guided point contrastive loss to enhance the feature representation and model generalization ability.
arXiv Detail & Related papers (2021-10-15T16:38:54Z) - Parameter Decoupling Strategy for Semi-supervised 3D Left Atrium
Segmentation [0.0]
We present a novel semi-supervised segmentation model based on parameter decoupling strategy to encourage consistent predictions from diverse views.
Our method has achieved a competitive result over the state-of-the-art semisupervised methods on the Atrial Challenge dataset.
arXiv Detail & Related papers (2021-09-20T14:51:42Z) - Estimating the Robustness of Classification Models by the Structure of
the Learned Feature-Space [10.418647759223964]
We argue that fixed testsets are only able to capture a small portion of possible data variations and are thus limited and prone to generate new overfitted solutions.
To overcome these drawbacks, we suggest to estimate the robustness of a model directly from the structure of its learned feature-space.
arXiv Detail & Related papers (2021-06-23T10:52:29Z) - Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
Cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.