Interpretable Visual Reasoning via Induced Symbolic Space
- URL: http://arxiv.org/abs/2011.11603v2
- Date: Tue, 24 Aug 2021 13:55:14 GMT
- Title: Interpretable Visual Reasoning via Induced Symbolic Space
- Authors: Zhonghao Wang, Kai Wang, Mo Yu, Jinjun Xiong, Wen-mei Hwu, Mark
Hasegawa-Johnson, Humphrey Shi
- Abstract summary: We study the problem of concept induction in visual reasoning, i.e., identifying concepts and their hierarchical relationships from question-answer pairs associated with images.
We first design a new framework named object-centric compositional attention model (OCCAM) to perform the visual reasoning task with object-level visual features.
We then come up with a method to induce concepts of objects and relations using clues from the attention patterns between objects' visual features and question words.
- Score: 75.95241948390472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of concept induction in visual reasoning, i.e.,
identifying concepts and their hierarchical relationships from question-answer
pairs associated with images; and achieve an interpretable model via working on
the induced symbolic concept space. To this end, we first design a new
framework named object-centric compositional attention model (OCCAM) to perform
the visual reasoning task with object-level visual features. Then, we come up
with a method to induce concepts of objects and relations using clues from the
attention patterns between objects' visual features and question words.
Finally, we achieve a higher level of interpretability by imposing OCCAM on the
objects represented in the induced symbolic concept space. Our model design
makes this an easy adaptation: we first predict the concepts of objects and
relations and then project the predicted concepts back into the visual feature
space so the compositional reasoning module can operate without modification. Experiments
on the CLEVR and GQA datasets demonstrate: 1) our OCCAM achieves a new state of
the art without human-annotated functional programs; 2) our induced concepts
are both accurate and sufficient, as OCCAM achieves on-par performance on
objects represented either in visual features or in the induced symbolic
concept space.
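
The following is a minimal sketch, in plain NumPy, of the two ideas the abstract describes: attention between object-level visual features and question words provides the clues for assigning each object a symbolic concept, and the predicted concepts are projected back into the visual feature space so the compositional reasoning module can run without modification. The function names, dimensions, and the simple argmax induction rule are illustrative assumptions, not the exact OCCAM procedure; in particular, the paper also recovers hierarchical relationships among concepts, which this sketch omits.

```python
import numpy as np

def attention(obj_feats, word_embs):
    """Soft attention between object-level visual features and question words.

    obj_feats: (num_objects, d) object-level visual features
    word_embs: (num_words, d)   question word embeddings
    Returns a (num_words, num_objects) matrix, normalized over objects.
    """
    scores = word_embs @ obj_feats.T / np.sqrt(obj_feats.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def induce_concepts(attn, question_words):
    """Illustrative induction rule: label each object with the question word
    whose attention on it is strongest (a stand-in for the paper's use of
    attention patterns as clues for concept induction)."""
    return [question_words[int(np.argmax(attn[:, obj]))]
            for obj in range(attn.shape[1])]

def project_concepts(concepts, concept_embeddings):
    """Project predicted symbolic concepts back into the visual feature space
    so a downstream compositional reasoning module can consume them as usual."""
    return np.stack([concept_embeddings[c] for c in concepts])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    question = ["what", "color", "is", "the", "cube"]
    word_embs = rng.normal(size=(len(question), d))
    obj_feats = rng.normal(size=(3, d))              # 3 detected objects

    attn = attention(obj_feats, word_embs)
    concepts = induce_concepts(attn, question)
    concept_embeddings = {w: rng.normal(size=d) for w in question}
    symbolic_feats = project_concepts(concepts, concept_embeddings)
    print(concepts)               # one induced concept label per object
    print(symbolic_feats.shape)   # (3, 8): features for the reasoning module
```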
Related papers
- Discovering Conceptual Knowledge with Analytic Ontology Templates for Articulated Objects [42.9186628100765]
We aim to endow machine intelligence with an analogous capability by operating at the conceptual level.
The AOT-driven approach yields benefits in three key respects.
arXiv Detail & Related papers (2024-09-18T04:53:38Z)
- Advancing Ante-Hoc Explainable Models through Generative Adversarial Networks [24.45212348373868]
This paper presents a novel concept learning framework for enhancing model interpretability and performance in visual classification tasks.
Our approach appends an unsupervised explanation generator to the primary classifier network and makes use of adversarial training.
This work presents a significant step towards building inherently interpretable deep vision models with task-aligned concept representations.
arXiv Detail & Related papers (2024-01-09T16:16:16Z)
- Text-to-Image Generation for Abstract Concepts [76.32278151607763]
We propose a framework for Text-to-Image generation for Abstract Concepts (TIAC).
The abstract concept is clarified into a clear intent with a detailed definition to avoid ambiguity.
The concept-dependent form is retrieved from an LLM-extracted form pattern set.
arXiv Detail & Related papers (2023-09-26T02:22:39Z)
- Concept Bottleneck with Visual Concept Filtering for Explainable Medical Image Classification [16.849592713393896]
Concept Bottleneck Models (CBMs) enable interpretable image classification by utilizing human-understandable concepts as intermediate targets.
We propose a visual activation score that measures whether a concept contains visual cues.
Computed visual activation scores are then used to filter out the less visible concepts, resulting in a final concept set of visually meaningful concepts (a rough sketch of this filtering step follows this entry).
arXiv Detail & Related papers (2023-08-23T05:04:01Z)
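
As a rough illustration of the filtering idea summarized in the entry above, the sketch below scores each candidate concept by how much more strongly it responds to real image features than to near-contentless features, then keeps the top-scoring concepts. The scoring rule, the blank-input baseline, and the keep ratio are assumptions chosen for illustration, not the paper's exact definition of the visual activation score.

```python
import numpy as np

def visual_activation_score(concept_vec, image_feats, blank_feats):
    """Assumed scoring rule: average response on real images minus the
    average response on near-contentless inputs; a large gap suggests the
    concept carries visual cues."""
    return (image_feats @ concept_vec).mean() - (blank_feats @ concept_vec).mean()

def filter_concepts(concept_vecs, image_feats, blank_feats, keep_ratio=0.5):
    """Keep the concepts with the highest visual activation scores."""
    scores = np.array([visual_activation_score(c, image_feats, blank_feats)
                       for c in concept_vecs])
    k = max(1, int(len(concept_vecs) * keep_ratio))
    kept = np.argsort(scores)[::-1][:k]
    return kept, scores

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    d, n_concepts, n_images = 16, 10, 32
    concept_vecs = rng.normal(size=(n_concepts, d))           # candidate concept directions
    image_feats = rng.normal(size=(n_images, d))              # features of real images
    blank_feats = rng.normal(scale=0.1, size=(n_images, d))   # low-signal baseline
    kept, scores = filter_concepts(concept_vecs, image_feats, blank_feats)
    print("kept concept indices:", kept)
```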
- ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models [79.10890337599166]
We introduce ConceptBed, a large-scale dataset that consists of 284 unique visual concepts and 33K composite text prompts.
We evaluate visual concepts that are either objects, attributes, or styles, and also evaluate four dimensions of compositionality: counting, attributes, relations, and actions.
Our results point to a trade-off between learning the concepts and preserving the compositionality which existing approaches struggle to overcome.
arXiv Detail & Related papers (2023-06-07T18:00:38Z)
- Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models [86.25460882547581]
We introduce the PHYsical Concepts Inference NEtwork (PHYCINE), a system that infers physical concepts in different abstract levels without supervision.
We show that object representations containing the discovered physical concepts variables could help achieve better performance in causal reasoning tasks.
arXiv Detail & Related papers (2023-03-03T11:52:21Z)
- FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations [99.54048050189971]
We present a framework for learning new visual concepts quickly, guided by multiple naturally occurring data streams.
The learned concepts support downstream applications, such as answering questions by reasoning about unseen images.
We demonstrate the effectiveness of our model on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-03-30T19:45:00Z)
- Translational Concept Embedding for Generalized Compositional Zero-shot Learning [73.60639796305415]
Generalized compositional zero-shot learning means to learn composed concepts of attribute-object pairs in a zero-shot fashion.
This paper introduces a new approach, termed translational concept embedding, to solve these two difficulties in a unified framework.
arXiv Detail & Related papers (2021-12-20T21:27:51Z)
- Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations [15.284688801788912]
We show the advantages of prototype representations for understanding and revising the latent space of neural concept learners.
For this purpose, we introduce interactive Concept Swapping Networks (iCSNs).
iCSNs learn to bind conceptual information to specific prototype slots by swapping the latent representations of paired images (a sketch of the swap follows this entry).
arXiv Detail & Related papers (2021-12-04T09:25:40Z)
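
The sketch below illustrates the swap at the core of the entry above: latent codes are structured into prototype slots, and the slot for a concept the paired images are known to share is exchanged between their codes. The slot layout and dimensions are assumptions for illustration; how the swapped codes are used during training (e.g. for reconstruction) is only hinted at in a comment and omitted here.

```python
import numpy as np

def swap_slot(latent_a, latent_b, slot):
    """Exchange one prototype slot between the slot-structured latent codes
    of a paired image (num_slots, slot_dim each). Training on the swapped
    codes is what encourages that slot to carry exactly the shared concept."""
    swapped_a, swapped_b = latent_a.copy(), latent_b.copy()
    swapped_a[slot], swapped_b[slot] = latent_b[slot].copy(), latent_a[slot].copy()
    return swapped_a, swapped_b

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    num_slots, slot_dim = 4, 6
    z_a = rng.normal(size=(num_slots, slot_dim))
    z_b = rng.normal(size=(num_slots, slot_dim))
    z_a_swapped, z_b_swapped = swap_slot(z_a, z_b, slot=2)
    # Only slot 2 changed; the other slots keep their original content.
    print(np.allclose(z_a_swapped[2], z_b[2]), np.allclose(z_a_swapped[0], z_a[0]))
```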
- Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations [24.327862278556445]
We propose a Neuro-Symbolic scene representation, which allows one to revise the model on the semantic level.
The results of our experiments on CLEVR-Hans demonstrate that our semantic explanations can identify confounders.
arXiv Detail & Related papers (2020-11-25T16:23:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.