Making Heads or Tails: Towards Semantically Consistent Visual
Counterfactuals
- URL: http://arxiv.org/abs/2203.12892v1
- Date: Thu, 24 Mar 2022 07:26:11 GMT
- Title: Making Heads or Tails: Towards Semantically Consistent Visual
Counterfactuals
- Authors: Simon Vandenhende, Dhruv Mahajan, Filip Radenovic and Deepti
Ghadiyaram
- Abstract summary: A visual counterfactual explanation replaces image regions in a query image with regions from a distractor image such that the system's decision on the transformed image changes to the distractor class.
We present a novel framework for computing visual counterfactual explanations based on two key ideas.
- Score: 31.375504774744268
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A visual counterfactual explanation replaces image regions in a query image
with regions from a distractor image such that the system's decision on the
transformed image changes to the distractor class. In this work, we present a
novel framework for computing visual counterfactual explanations based on two
key ideas. First, we enforce that the \textit{replaced} and \textit{replacer}
regions contain the same semantic part, resulting in more semantically
consistent explanations. Second, we use multiple distractor images in a
computationally efficient way and obtain more discriminative explanations with
fewer region replacements. Our approach is $\mathbf{27\%}$ more semantically
consistent and an order of magnitude faster than a competing method on three
fine-grained image recognition datasets. We highlight the utility of our
counterfactuals over existing works through machine teaching experiments where
we teach humans to classify different bird species. We also complement our
explanations with the vocabulary of parts and attributes that contributed the
most to the system's decision. In this task as well, we obtain state-of-the-art
results when using our counterfactual explanations relative to existing works,
reinforcing the importance of semantically consistent explanations.
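The core procedure described in the abstract — greedily replacing query-image regions with same-part regions from a distractor until the classifier's decision flips — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the grid size, the linear cell classifier, and the `greedy_counterfactual` helper are all hypothetical stand-ins for the paper's feature extractor and model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: an image is a grid of D-dim feature cells; a linear
# classifier scores classes from the average cell feature.
N_CELLS, D, N_CLASSES = 16, 8, 2
W = rng.normal(size=(N_CLASSES, D))  # hypothetical classifier weights


def class_scores(cells):
    """Score each class from the mean of the cell features."""
    return W @ cells.mean(axis=0)


def greedy_counterfactual(query, distractor, q_parts, d_parts,
                          target, max_edits=5):
    """Greedily replace query cells with same-part distractor cells
    until the classifier prefers the target (distractor) class."""
    cells = query.copy()
    edits = []
    for _ in range(max_edits):
        if class_scores(cells).argmax() == target:
            break  # decision has flipped to the distractor class
        best_gain, best = 0.0, None
        for i in range(len(cells)):
            for j in range(len(distractor)):
                # Semantic-consistency constraint: only swap cells
                # that depict the same part (e.g. head with head).
                if q_parts[i] != d_parts[j]:
                    continue
                trial = cells.copy()
                trial[i] = distractor[j]
                gain = (class_scores(trial)[target]
                        - class_scores(cells)[target])
                if gain > best_gain:
                    best_gain, best = gain, (i, j)
        if best is None:
            break  # no same-part swap improves the target score
        i, j = best
        cells[i] = distractor[j]
        edits.append((i, j))
    return cells, edits
```

Using several distractor images would simply enlarge the candidate pool of replacer cells, which is the paper's second idea: more candidates per part tend to yield a discriminative flip with fewer replacements.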
Related papers
- Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning [71.14084801851381]
Change captioning aims to succinctly describe the semantic change between a pair of similar images.
Most existing methods directly capture the difference between them, which risk obtaining error-prone difference features.
We propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations.
arXiv Detail & Related papers (2024-07-16T13:00:33Z)
- Composed Image Retrieval for Remote Sensing [24.107610091033997]
This work introduces composed image retrieval to remote sensing.
It allows querying a large image archive using image examples combined with a textual description.
A novel method fusing image-to-image and text-to-image similarity is introduced.
arXiv Detail & Related papers (2024-05-24T14:18:31Z)
- Towards Image Semantics and Syntax Sequence Learning [8.033697392628424]
We introduce the concept of "image grammar", consisting of "image semantics" and "image syntax".
We propose a weakly supervised two-stage approach to learn the image grammar relative to a class of visual objects/scenes.
Our framework is trained to reason over patch semantics and detect faulty syntax.
arXiv Detail & Related papers (2024-01-31T00:16:02Z)
- Superpixel Semantics Representation and Pre-training for Vision-Language Task [11.029236633301222]
Coarse-grained semantic interactions in image space should not be ignored.
This paper proposes superpixels as comprehensive and robust visual primitives.
It allows parsing the entire image as a fine-to-coarse visual hierarchy.
arXiv Detail & Related papers (2023-10-20T12:26:04Z)
- Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis [139.2216271759332]
We propose a novel ECGAN for the challenging semantic image synthesis task.
The semantic labels do not provide detailed structural information, making it challenging to synthesize local details and structures.
The widely adopted CNN operations such as convolution, down-sampling, and normalization usually cause spatial resolution loss.
We propose a novel contrastive learning method, which aims to enforce pixel embeddings belonging to the same semantic class to generate more similar image content.
arXiv Detail & Related papers (2023-07-22T14:17:19Z)
- Shrinking the Semantic Gap: Spatial Pooling of Local Moment Invariants for Copy-Move Forgery Detection [7.460203098159187]
Copy-move forgery is a manipulation of copying and pasting specific patches from and to an image, with potentially illegal or unethical uses.
Recent advances in the forensic methods for copy-move forgery have shown increasing success in detection accuracy and robustness.
For images with high self-similarity or strong signal corruption, the existing algorithms often exhibit inefficient processes and unreliable results.
arXiv Detail & Related papers (2022-07-19T09:11:43Z)
- Knowledge Mining with Scene Text for Fine-Grained Recognition [53.74297368412834]
We propose an end-to-end trainable network that mines implicit contextual knowledge behind scene text image.
We employ KnowBert to retrieve relevant knowledge for semantic representation and combine it with image features for fine-grained classification.
Our method outperforms the state of the art by 3.72% mAP and 5.39% mAP on the two evaluated datasets, respectively.
arXiv Detail & Related papers (2022-03-27T05:54:00Z)
- Distributed Attention for Grounded Image Captioning [55.752968732796354]
We study the problem of weakly supervised grounded image captioning.
The goal is to automatically generate a sentence describing the context of the image with each noun word grounded to the corresponding region in the image.
arXiv Detail & Related papers (2021-08-02T17:28:33Z)
- This is not the Texture you are looking for! Introducing Novel Counterfactual Explanations for Non-Experts using Generative Adversarial Learning [59.17685450892182]
Counterfactual explanation systems aim to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z)
- Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis [194.1452124186117]
We propose a novel ECGAN for the challenging semantic image synthesis task.
Our ECGAN achieves significantly better results than state-of-the-art methods.
arXiv Detail & Related papers (2020-03-31T01:23:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.