Systematic Visual Reasoning through Object-Centric Relational
Abstraction
- URL: http://arxiv.org/abs/2306.02500v2
- Date: Fri, 10 Nov 2023 22:22:44 GMT
- Title: Systematic Visual Reasoning through Object-Centric Relational
Abstraction
- Authors: Taylor W. Webb, Shanka Subhra Mondal, Jonathan D. Cohen
- Abstract summary: We introduce OCRA, a model that extracts explicit representations of both objects and abstract relations.
It achieves strong systematic generalization in tasks involving complex visual displays.
- Score: 5.914610036560008
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human visual reasoning is characterized by an ability to identify abstract
patterns from only a small number of examples, and to systematically generalize
those patterns to novel inputs. This capacity depends in large part on our
ability to represent complex visual inputs in terms of both objects and
relations. Recent work in computer vision has introduced models with the
capacity to extract object-centric representations, leading to the ability to
process multi-object visual inputs, but falling short of the systematic
generalization displayed by human reasoning. Other recent models have employed
inductive biases for relational abstraction to achieve systematic
generalization of learned abstract rules, but have generally assumed the
presence of object-focused inputs. Here, we combine these two approaches,
introducing Object-Centric Relational Abstraction (OCRA), a model that extracts
explicit representations of both objects and abstract relations, and achieves
strong systematic generalization in tasks (including a novel dataset,
CLEVR-ART, with greater visual complexity) involving complex visual displays.
Related papers
- VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning [86.59849798539312]
We present Neuro-Symbolic Predicates, a first-order abstraction language that combines the strengths of symbolic and neural knowledge representations.
We show that our approach offers better sample complexity, stronger out-of-distribution generalization, and improved interpretability.
arXiv Detail & Related papers (2024-10-30T16:11:05Z)
- Abstraction Alignment: Comparing Model and Human Conceptual Relationships [26.503178592074757]
We introduce abstraction alignment, a methodology to measure the agreement between a model's learned abstraction and the expected human abstraction.
In evaluation tasks, abstraction alignment provides a deeper understanding of model behavior and dataset content.
arXiv Detail & Related papers (2024-07-17T13:27:26Z)
- Automatic Discovery of Visual Circuits [66.99553804855931]
We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept.
We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks.
arXiv Detail & Related papers (2024-04-22T17:00:57Z)
- Slot Abstractors: Toward Scalable Abstract Visual Reasoning [5.262577780347204]
We propose Slot Abstractors, an approach to abstract visual reasoning that can be scaled to problems involving a large number of objects and multiple relations among them.
The approach displays state-of-the-art performance across four abstract visual reasoning tasks, as well as an abstract reasoning task involving real-world images.
arXiv Detail & Related papers (2024-03-06T04:49:02Z)
- Emergence and Function of Abstract Representations in Self-Supervised Transformers [0.0]
We study the inner workings of small-scale transformers trained to reconstruct partially masked visual scenes.
We show that the network develops intermediate abstract representations, or abstractions, that encode all semantic features of the dataset.
Using precise manipulation experiments, we demonstrate that abstractions are central to the network's decision-making process.
arXiv Detail & Related papers (2023-12-08T20:47:15Z)
- FACT: Learning Governing Abstractions Behind Integer Sequences [7.895232155155041]
We introduce a novel view on the learning of concepts admitting complete finitary descriptions.
We lay down a set of benchmarking tasks aimed at conceptual understanding by machine learning models.
To further aid research in knowledge representation and reasoning, we present FACT, the Finitary Abstraction Toolkit.
arXiv Detail & Related papers (2022-09-20T08:20:03Z)
- Abstract Interpretation for Generalized Heuristic Search in Model-Based Planning [50.96320003643406]
Domain-general model-based planners often derive their generality by constructing search heuristics through the relaxation of symbolic world models.
We illustrate how abstract interpretation can serve as a unifying framework for these abstractions, extending the reach of search to richer world models.
These heuristics can also be integrated with learning, allowing agents to jumpstart planning in novel world models via abstraction-derived information.
arXiv Detail & Related papers (2022-08-05T00:22:11Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
Poor interpretability, robustness, and out-of-distribution generalization are becoming challenges for existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
- Constellation: Learning relational abstractions over objects for compositional imagination [64.99658940906917]
We introduce Constellation, a network that learns relational abstractions of static visual scenes.
This work is a first step toward explicitly representing visual relationships and using them in complex cognitive procedures.
arXiv Detail & Related papers (2021-07-23T11:59:40Z)
- Hierarchical Relational Inference [80.00374471991246]
We propose a novel approach to physical reasoning that models objects as hierarchies of parts that may locally behave separately, but also act more globally as a single whole.
Unlike prior approaches, our method learns in an unsupervised fashion directly from raw visual images.
It explicitly distinguishes multiple levels of abstraction and improves over a strong baseline at modeling synthetic and real-world videos.
arXiv Detail & Related papers (2020-10-07T20:19:10Z)