Counterfactual Generative Zero-Shot Semantic Segmentation
- URL: http://arxiv.org/abs/2106.06360v1
- Date: Fri, 11 Jun 2021 13:01:03 GMT
- Title: Counterfactual Generative Zero-Shot Semantic Segmentation
- Authors: Feihong Shen and Jun Liu and Ping Hu
- Abstract summary: One of the popular zero-shot semantic segmentation methods is based on the generative model.
In this work, we consider counterfactual methods to avoid the confounder in the original model.
Our model is compared with baseline models on two real-world datasets.
- Score: 16.684570608930983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot learning is an essential part of computer vision. As a classical
downstream task, zero-shot semantic segmentation has been studied because of
its application value. One of the popular zero-shot semantic segmentation methods
is based on the generative model. Most newly proposed works add structures on
the same architecture to enhance this model. However, we found that, from the
view of causal inference, the result of the original model is influenced
by spurious statistical relationships, so the prediction
shows severe bias. In this work, we consider counterfactual methods to avoid
the confounder in the original model. Based on this method, we propose a new
framework for zero-shot semantic segmentation. Our model is compared with
baseline models on two real-world datasets, Pascal-VOC and Pascal-Context. The
experimental results show that the proposed model can surpass previous confounded models
and can still make use of additional structures to improve performance. We
also design a simple structure based on Graph Convolutional Networks (GCN) in
this work.
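The GCN structure the abstract mentions is not detailed in this summary; as an illustrative sketch only, a single graph-convolution layer with symmetric normalization (the standard formulation) propagating hypothetical class word embeddings over a class-affinity graph might look like this. All array shapes and the random inputs below are stand-ins, not the paper's actual configuration:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Hypothetical example: 4 classes, 8-dim word embeddings, 5-dim output
rng = np.random.default_rng(0)
A = (rng.random((4, 4)) > 0.5).astype(float)
A = np.maximum(A, A.T)                      # symmetric class-affinity graph
H = rng.standard_normal((4, 8))             # per-class input features
W = rng.standard_normal((8, 5))             # learnable weight matrix
out = gcn_layer(A, H, W)
print(out.shape)  # (4, 5)
```

Stacking such layers lets information flow between semantically related classes, which is the usual motivation for a GCN over class embeddings in zero-shot settings.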
Related papers
- Embedding-based statistical inference on generative models [10.948308354932639]
We extend results related to embedding-based representations of generative models to classical statistical inference settings.
We demonstrate that using the perspective space as the basis of a notion of "similar" is effective for multiple model-level inference tasks.
arXiv Detail & Related papers (2024-10-01T22:28:39Z)
- Enabling Small Models for Zero-Shot Classification through Model Label Learning [50.68074833512999]
We introduce a novel paradigm, Model Label Learning (MLL), which bridges the gap between models and their functionalities.
Experiments on seven real-world datasets validate the effectiveness and efficiency of MLL.
arXiv Detail & Related papers (2024-08-21T09:08:26Z)
- Foundation Model-oriented Robustness: Robust Image Model Evaluation with Pretrained Models [38.16654407693728]
We introduce a new robustness measurement that directly measures the image classification model's performance compared with a surrogate oracle.
Our new method will offer us a new way to evaluate the models' robustness performance, free of limitations of fixed benchmarks or constrained perturbations.
arXiv Detail & Related papers (2023-08-21T11:07:27Z)
- Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We make empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
- Comparing Foundation Models using Data Kernels [13.099029073152257]
We present a methodology for directly comparing the embedding space geometry of foundation models.
Our methodology is grounded in random graph theory and enables valid hypothesis testing of embedding similarity.
We show how our framework can induce a manifold of models equipped with a distance function that correlates strongly with several downstream metrics.
arXiv Detail & Related papers (2023-05-09T02:01:07Z)
- Enhancing Continual Relation Extraction via Classifier Decomposition [30.88081408988638]
Continual relation extraction models aim at handling emerging new relations while avoiding forgetting old ones in the streaming data.
Most models only adopt a vanilla strategy when models first learn representations of new relations.
We propose a simple yet effective classifier decomposition framework that splits the last FFN layer into separated previous and current classifiers.
arXiv Detail & Related papers (2023-05-08T11:29:33Z)
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- Unsupervised Deep Learning Meets Chan-Vese Model [77.24463525356566]
We propose an unsupervised image segmentation approach that integrates the Chan-Vese (CV) model with deep neural networks.
Our basic idea is to apply a deep neural network that maps the image into a latent space to alleviate the violation of the piecewise constant assumption in image space.
arXiv Detail & Related papers (2022-04-14T13:23:57Z)
- A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model [61.58071099082296]
It is unclear how to make zero-shot recognition working well on broader vision problems, such as object detection and semantic segmentation.
In this paper, we target for zero-shot semantic segmentation, by building it on an off-the-shelf pre-trained vision-language model, i.e., CLIP.
Our experimental results show that this simple framework surpasses previous state-of-the-arts by a large margin.
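The CLIP-based baseline above rests on a simple idea: classify each pixel by comparing its visual embedding against the text embedding of every class name. A minimal sketch of that per-pixel matching step, using random stand-in embeddings rather than actual CLIP features:

```python
import numpy as np

def segment_by_similarity(pixel_emb, text_emb):
    """Assign each pixel the class whose text embedding is most similar.
    pixel_emb: (H, W, D) per-pixel visual features
    text_emb:  (C, D) one embedding per class name."""
    p = pixel_emb / np.linalg.norm(pixel_emb, axis=-1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)
    logits = p @ t.T                  # (H, W, C) cosine similarities
    return logits.argmax(axis=-1)     # (H, W) predicted class map

# Hypothetical toy inputs: a 4x4 feature map, 3 classes, 16-dim embeddings
rng = np.random.default_rng(1)
pixels = rng.standard_normal((4, 4, 16))
classes = rng.standard_normal((3, 16))
mask = segment_by_similarity(pixels, classes)
print(mask.shape)  # (4, 4)
```

Because the class set enters only through the text embeddings, unseen classes can be added at inference time by embedding their names, which is what makes this framework zero-shot.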
arXiv Detail & Related papers (2021-12-29T18:56:18Z)
- Distributional Depth-Based Estimation of Object Articulation Models [21.046351215949525]
We propose a method that efficiently learns distributions over articulation model parameters directly from depth images.
Our core contributions include a novel representation for distributions over rigid body transformations.
We introduce a novel deep learning based approach, DUST-net, that performs category-independent articulation model estimation.
arXiv Detail & Related papers (2021-08-12T17:44:51Z)
- Improving Label Quality by Jointly Modeling Items and Annotators [68.8204255655161]
We propose a fully Bayesian framework for learning ground truth labels from noisy annotators.
Our framework ensures scalability by factoring a generative, Bayesian soft clustering model over label distributions into the classic Dawid and Skene joint annotator-data model.
arXiv Detail & Related papers (2021-06-20T02:15:20Z)
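The Dawid-Skene model referenced in the last entry estimates a per-annotator confusion matrix and a posterior over each item's true label via EM. As an illustrative sketch only (the toy votes and initialization below are stand-ins, not the paper's setup), one EM iteration can be written as:

```python
import numpy as np

def dawid_skene_step(votes, T):
    """One EM iteration of the Dawid-Skene annotator model.
    votes: (n_items, n_annotators) integer labels in [0, K)
    T:     (n_items, K) current soft estimates of the true labels."""
    n, m = votes.shape
    K = T.shape[1]
    prior = T.mean(axis=0)                        # estimated class prevalence
    # M-step: per-annotator confusion matrices P(observed | true)
    conf = np.zeros((m, K, K))
    for a in range(m):
        for k in range(K):
            conf[a, :, k] = T[votes[:, a] == k].sum(axis=0)
    conf += 1e-6                                  # smoothing avoids log(0)
    conf /= conf.sum(axis=2, keepdims=True)
    # E-step: posterior over true labels given all annotators' votes
    log_post = np.tile(np.log(prior), (n, 1))
    for a in range(m):
        log_post += np.log(conf[a, :, votes[:, a]])   # (n, K) per-annotator term
    post = np.exp(log_post - log_post.max(axis=1, keepdims=True))
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical toy data: 3 items, 3 annotators, 2 classes
votes = np.array([[0, 0, 1], [1, 1, 1], [0, 1, 0]])
T0 = np.stack([(votes == k).mean(axis=1) for k in range(2)], axis=1)  # vote-share init
T1 = dawid_skene_step(votes, T0)
print(T1.shape)  # (3, 2)
```

Iterating this step to convergence recovers both the annotator reliabilities and the soft ground-truth labels; the cited paper builds its Bayesian clustering model on top of this factorization.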
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.