A Likelihood Ratio-Based Approach to Segmenting Unknown Objects
- URL: http://arxiv.org/abs/2409.06424v1
- Date: Tue, 10 Sep 2024 11:10:32 GMT
- Title: A Likelihood Ratio-Based Approach to Segmenting Unknown Objects
- Authors: Nazir Nayal, Youssef Shoeb, Fatma Güney,
- Abstract summary: Outlier supervision is a widely used strategy for improving OoD detection of the existing segmentation networks.
We propose an adaptive, lightweight unknown estimation module (UEM) for outlier supervision.
Our approach achieves a new state-of-the-art across multiple datasets, outperforming the previous best method by 5.74% average precision points.
- Score: 4.000869978312742
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Addressing the Out-of-Distribution (OoD) segmentation task is a prerequisite for perception systems operating in an open-world environment. Large foundational models are frequently used in downstream tasks, however, their potential for OoD remains mostly unexplored. We seek to leverage a large foundational model to achieve robust representation. Outlier supervision is a widely used strategy for improving OoD detection of the existing segmentation networks. However, current approaches for outlier supervision involve retraining parts of the original network, which is typically disruptive to the model's learned feature representation. Furthermore, retraining becomes infeasible in the case of large foundational models. Our goal is to retrain for outlier segmentation without compromising the strong representation space of the foundational model. To this end, we propose an adaptive, lightweight unknown estimation module (UEM) for outlier supervision that significantly enhances the OoD segmentation performance without affecting the learned feature representation of the original network. UEM learns a distribution for outliers and a generic distribution for known classes. Using the learned distributions, we propose a likelihood-ratio-based outlier scoring function that fuses the confidence of UEM with that of the pixel-wise segmentation inlier network to detect unknown objects. We also propose an objective to optimize this score directly. Our approach achieves a new state-of-the-art across multiple datasets, outperforming the previous best method by 5.74% average precision points while having a lower false-positive rate. Importantly, strong inlier performance remains unaffected.
Related papers
- VL4AD: Vision-Language Models Improve Pixel-wise Anomaly Detection [5.66050466694651]
We propose Vision-Language (VL) encoders into existing anomaly detectors to leverage the semantically broad VL pre-training for improved outlier awareness.
We also propose a new scoring function that enables data- and training-free outlier supervision via textual prompts.
The resulting VL4AD model achieves competitive performance on widely used benchmark datasets.
arXiv Detail & Related papers (2024-09-25T20:12:10Z) - ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z) - Learning to Generate Training Datasets for Robust Semantic Segmentation [37.9308918593436]
We propose a novel approach to improve the robustness of semantic segmentation techniques.
We design Robusta, a novel conditional generative adversarial network to generate realistic and plausible perturbed images.
Our results suggest that this approach could be valuable in safety-critical applications.
arXiv Detail & Related papers (2023-08-01T10:02:26Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z) - Divide and Contrast: Source-free Domain Adaptation via Adaptive
Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
arXiv Detail & Related papers (2022-11-12T09:21:49Z) - Constraining Pseudo-label in Self-training Unsupervised Domain
Adaptation with Energy-based Model [26.074500538428364]
unsupervised domain adaptation (UDA) is developed to introduce the knowledge in the labeled source domain to the unlabeled target domain.
Recently, deep self-training presents a powerful means for UDA, involving an iterative process of predicting the target domain.
We resort to the energy-based model and constrain the training of the unlabeled target sample with an energy function minimization objective.
arXiv Detail & Related papers (2022-08-26T22:50:23Z) - Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z) - TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain
Gait Recognition [77.77786072373942]
This paper proposes a Transferable Neighborhood Discovery (TraND) framework to bridge the domain gap for unsupervised cross-domain gait recognition.
We design an end-to-end trainable approach to automatically discover the confident neighborhoods of unlabeled samples in the latent space.
Our method achieves state-of-the-art results on two public datasets, i.e., CASIA-B and OU-LP.
arXiv Detail & Related papers (2021-02-09T03:07:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.