Emergence and Function of Abstract Representations in Self-Supervised
Transformers
- URL: http://arxiv.org/abs/2312.05361v1
- Date: Fri, 8 Dec 2023 20:47:15 GMT
- Title: Emergence and Function of Abstract Representations in Self-Supervised
Transformers
- Authors: Quentin RV. Ferry, Joshua Ching, Takashi Kawai
- Abstract summary: We study the inner workings of small-scale transformers trained to reconstruct partially masked visual scenes.
We show that the network develops intermediate abstract representations, or abstractions, that encode all semantic features of the dataset.
Using precise manipulation experiments, we demonstrate that abstractions are central to the network's decision-making process.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human intelligence relies in part on our brains' ability to create abstract
mental models that succinctly capture the hidden blueprint of our reality. Such
abstract world models notably allow us to rapidly navigate novel situations by
generalizing prior knowledge, a trait deep learning systems have historically
struggled to replicate. However, the recent shift from supervised to
self-supervised objectives, combined with expressive transformer-based
architectures, has yielded powerful foundation models that appear to learn
versatile representations that can support a wide range of downstream tasks.
This promising development raises the intriguing possibility of such models
developing in silico abstract world models. We test this hypothesis by studying
the inner workings of small-scale transformers trained to reconstruct partially
masked visual scenes generated from a simple blueprint. We show that the
network develops intermediate abstract representations, or abstractions, that
encode all semantic features of the dataset. These abstractions manifest as
low-dimensional manifolds where the embeddings of semantically related tokens
transiently converge, thus allowing for the generalization of downstream
computations. Using precise manipulation experiments, we demonstrate that
abstractions are central to the network's decision-making process. Our research
also suggests that these abstractions are compositionally structured,
exhibiting features like contextual independence and part-whole relationships
that mirror the compositional nature of the dataset. Finally, we introduce a
Language-Enhanced Architecture (LEA) designed to encourage the network to
articulate its computations. We find that LEA develops an abstraction-centric
language that can be easily interpreted, allowing us to more readily access and
steer the network's decision-making process.
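The abstract describes abstractions as regions where the embeddings of semantically related tokens converge. One rough way to picture that claim is a within-group versus between-group distance ratio. The sketch below is an illustration only, not the authors' analysis: the function name `convergence_ratio`, the toy 2-D embeddings, and the labels are all hypothetical, and Euclidean distance is an assumed choice.

```python
import math
from itertools import combinations


def convergence_ratio(embeddings, labels):
    """Mean within-label distance divided by mean between-label distance.

    Hypothetical metric for illustration: values well below 1 suggest that
    embeddings sharing a label sit close together relative to the rest.
    """
    within, between = [], []
    for (ea, la), (eb, lb) in combinations(zip(embeddings, labels), 2):
        distance = math.dist(ea, eb)  # Euclidean distance (assumed choice)
        (within if la == lb else between).append(distance)
    return (sum(within) / len(within)) / (sum(between) / len(between))


# Toy 2-D embeddings (made up): two tight pairs placed far apart.
toy_embeddings = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

# Labels that match the spatial grouping give a small ratio.
clustered = convergence_ratio(toy_embeddings, ["a", "a", "b", "b"])

# Labels that cut across the grouping give a ratio above 1.
scattered = convergence_ratio(toy_embeddings, ["a", "b", "a", "b"])

print(f"clustered={clustered:.3f} scattered={scattered:.3f}")
```

In a real analysis the inputs would be per-layer token embeddings from the trained network, and the ratio would be tracked across layers to see where related tokens transiently converge.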
Related papers
- VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning [86.59849798539312]
We present Neuro-Symbolic Predicates, a first-order abstraction language that combines the strengths of symbolic and neural knowledge representations.
We show that our approach offers better sample complexity, stronger out-of-distribution generalization, and improved interpretability.
arXiv Detail & Related papers (2024-10-30T16:11:05Z)
- ARPA: A Novel Hybrid Model for Advancing Visual Word Disambiguation Using Large Language Models and Transformers [1.6541870997607049]
We present ARPA, an architecture that fuses the unparalleled contextual understanding of large language models with the advanced feature extraction capabilities of transformers.
ARPA's introduction marks a significant milestone in visual word disambiguation, offering a compelling solution.
We invite researchers and practitioners to explore the capabilities of our model, envisioning a future where such hybrid models drive unprecedented advancements in artificial intelligence.
arXiv Detail & Related papers (2024-08-12T10:15:13Z)
- AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph [62.685920585838616]
Abstraction ability is essential to human intelligence, yet it remains under-explored in language models.
We present AbsPyramid, a unified entailment graph of 221K textual descriptions of abstraction knowledge.
arXiv Detail & Related papers (2023-11-15T18:11:23Z)
- Does Deep Learning Learn to Abstract? A Systematic Probing Framework [69.2366890742283]
Abstraction is a desirable capability for deep learning models, which means to induce abstract concepts from concrete instances and flexibly apply them beyond the learning context.
We introduce a systematic probing framework to explore the abstraction capability of deep learning models from a transferability perspective.
arXiv Detail & Related papers (2023-02-23T12:50:02Z)
- Robust and Controllable Object-Centric Learning through Energy-based Models [95.68748828339059]
Ours is a conceptually simple and general approach to learning object-centric representations through an energy-based model.
We show that it can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
arXiv Detail & Related papers (2022-10-11T15:11:15Z)
- Abstract Interpretation for Generalized Heuristic Search in Model-Based Planning [50.96320003643406]
Domain-general model-based planners often derive their generality by constructing search heuristics through the relaxation of symbolic world models.
We illustrate how abstract interpretation can serve as a unifying framework for these abstractions, extending the reach of search to richer world models.
These heuristics can also be integrated with learning, allowing agents to jumpstart planning in novel world models via abstraction-derived information.
arXiv Detail & Related papers (2022-08-05T00:22:11Z)
- Constellation: Learning relational abstractions over objects for compositional imagination [64.99658940906917]
We introduce Constellation, a network that learns relational abstractions of static visual scenes.
This work is a first step in the explicit representation of visual relationships and using them for complex cognitive procedures.
arXiv Detail & Related papers (2021-07-23T11:59:40Z)
- Learning Neural-Symbolic Descriptive Planning Models via Cube-Space Priors: The Voyage Home (to STRIPS) [13.141761152863868]
We show that our neuro-symbolic architecture is trained end-to-end to produce a succinct and effective discrete state transition model from images alone.
Our target representation is already in a form that off-the-shelf solvers can consume, and opens the door to the rich array of modern search capabilities.
arXiv Detail & Related papers (2020-04-27T15:01:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.