Related papers: Emergence and Function of Abstract Representations in Self-Supervised Transformers

URL: http://arxiv.org/abs/2312.05361v1
Date: Fri, 8 Dec 2023 20:47:15 GMT
Title: Emergence and Function of Abstract Representations in Self-Supervised Transformers
Authors: Quentin RV. Ferry, Joshua Ching, Takashi Kawai
Abstract summary: We study the inner workings of small-scale transformers trained to reconstruct partially masked visual scenes. We show that the network develops intermediate abstract representations, or abstractions, that encode all semantic features of the dataset. Using precise manipulation experiments, we demonstrate that abstractions are central to the network's decision-making process.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human intelligence relies in part on our brains' ability to create abstract mental models that succinctly capture the hidden blueprint of our reality. Such abstract world models notably allow us to rapidly navigate novel situations by generalizing prior knowledge, a trait deep learning systems have historically struggled to replicate. However, the recent shift from supervised to self-supervised objectives, combined with expressive transformer-based architectures, has yielded powerful foundation models that appear to learn versatile representations that can support a wide range of downstream tasks. This promising development raises the intriguing possibility of such models developing in silico abstract world models. We test this hypothesis by studying the inner workings of small-scale transformers trained to reconstruct partially masked visual scenes generated from a simple blueprint. We show that the network develops intermediate abstract representations, or abstractions, that encode all semantic features of the dataset. These abstractions manifest as low-dimensional manifolds where the embeddings of semantically related tokens transiently converge, thus allowing for the generalization of downstream computations. Using precise manipulation experiments, we demonstrate that abstractions are central to the network's decision-making process. Our research also suggests that these abstractions are compositionally structured, exhibiting features like contextual independence and part-whole relationships that mirror the compositional nature of the dataset. Finally, we introduce a Language-Enhanced Architecture (LEA) designed to encourage the network to articulate its computations. We find that LEA develops an abstraction-centric language that can be easily interpreted, allowing us to more readily access and steer the network's decision-making process.

Related papers

Seeing the Abstract: Translating the Abstract Language for Vision Language Models [13.065703240655973]
We focus our investigation on the fashion domain, a highly-representative field with abstract expressions. By analyzing recent large-scale multimodal fashion datasets, we find that abstract terms have a dominant presence. We propose a training-free and model-agnostic method, Abstract-to-Concrete Translator (ACT), to shift abstract representations towards well-represented concrete ones.
arXiv Detail & Related papers (2025-05-06T07:14:10Z)
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning [86.59849798539312]
We present Neuro-Symbolic Predicates, a first-order abstraction language that combines the strengths of symbolic and neural knowledge representations. We show that our approach offers better sample complexity, stronger out-of-distribution generalization, and improved interpretability.
arXiv Detail & Related papers (2024-10-30T16:11:05Z)
ARPA: A Novel Hybrid Model for Advancing Visual Word Disambiguation Using Large Language Models and Transformers [1.6541870997607049]
We present ARPA, an architecture that fuses the unparalleled contextual understanding of large language models with the advanced feature extraction capabilities of transformers. ARPA's introduction marks a significant milestone in visual word disambiguation, offering a compelling solution. We invite researchers and practitioners to explore the capabilities of our model, envisioning a future where such hybrid models drive unprecedented advancements in artificial intelligence.
arXiv Detail & Related papers (2024-08-12T10:15:13Z)
AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph [62.685920585838616]
Abstraction ability is essential in human intelligence, which remains under-explored in language models. We present AbsPyramid, a unified entailment graph of 221K textual descriptions of abstraction knowledge.
arXiv Detail & Related papers (2023-11-15T18:11:23Z)
Does Deep Learning Learn to Abstract? A Systematic Probing Framework [69.2366890742283]
Abstraction is a desirable capability for deep learning models, which means to induce abstract concepts from concrete instances and flexibly apply them beyond the learning context. We introduce a systematic probing framework to explore the abstraction capability of deep learning models from a transferability perspective.
arXiv Detail & Related papers (2023-02-23T12:50:02Z)
Robust and Controllable Object-Centric Learning through Energy-based Models [95.68748828339059]
Ours is a conceptually simple and general approach to learning object-centric representations through an energy-based model. We show that ours can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
arXiv Detail & Related papers (2022-10-11T15:11:15Z)
Abstract Interpretation for Generalized Heuristic Search in Model-Based Planning [50.96320003643406]
Domain-general model-based planners often derive their generality by constructing search heuristics through the relaxation of symbolic world models. We illustrate how abstract interpretation can serve as a unifying framework for these abstractions, extending the reach of search to richer world models. These heuristics can also be integrated with learning, allowing agents to jumpstart planning in novel world models via abstraction-derived information.
arXiv Detail & Related papers (2022-08-05T00:22:11Z)
Constellation: Learning relational abstractions over objects for compositional imagination [64.99658940906917]
We introduce Constellation, a network that learns relational abstractions of static visual scenes. This work is a first step in the explicit representation of visual relationships and using them for complex cognitive procedures.
arXiv Detail & Related papers (2021-07-23T11:59:40Z)
Learning Neural-Symbolic Descriptive Planning Models via Cube-Space Priors: The Voyage Home (to STRIPS) [13.141761152863868]
We show that our neuro-symbolic architecture is trained end-to-end to produce a succinct and effective discrete state transition model from images alone. Our target representation is already in a form that off-the-shelf solvers can consume, and opens the door to the rich array of modern search capabilities.
arXiv Detail & Related papers (2020-04-27T15:01:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.