Symbolic Disentangled Representations for Images
- URL: http://arxiv.org/abs/2412.19847v1
- Date: Wed, 25 Dec 2024 09:20:13 GMT
- Title: Symbolic Disentangled Representations for Images
- Authors: Alexandr Korchemnyi, Alexey K. Kovalev, Aleksandr I. Panov,
- Abstract summary: We propose ArSyD (Architecture for Symbolic Disentanglement), which represents each generative factor as a vector of the same dimension as the resulting representation.
We study ArSyD on the dSprites and CLEVR datasets and provide a comprehensive analysis of the learned symbolic disentangled representations.
- Score: 83.88591755871734
- Abstract: The idea of disentangled representations is to reduce the data to a set of generative factors that produce it. Typically, such representations are vectors in latent space, where each coordinate corresponds to one of the generative factors. The object can then be modified by changing the value of a particular coordinate, but it is necessary to determine which coordinate corresponds to the desired generative factor -- a difficult task if the vector representation has a high dimension. In this article, we propose ArSyD (Architecture for Symbolic Disentanglement), which represents each generative factor as a vector of the same dimension as the resulting representation. In ArSyD, the object representation is obtained as a superposition of the generative factor vector representations. We call such a representation a *symbolic disentangled representation*. We use the principles of Hyperdimensional Computing (also known as Vector Symbolic Architectures), where symbols are represented as hypervectors, allowing vector operations on them. Disentanglement is achieved by construction: no additional assumptions about the underlying distributions are made during training, and the model is trained only to reconstruct images in a weakly supervised manner. We study ArSyD on the dSprites and CLEVR datasets and provide a comprehensive analysis of the learned symbolic disentangled representations. We also propose new disentanglement metrics that allow comparison of methods using latent representations of different dimensions. ArSyD allows object properties to be edited in a controlled and interpretable way, and the dimensionality of the object property representation coincides with the dimensionality of the object representation itself.
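The HDC/VSA principles the abstract invokes can be illustrated with a small sketch. This is not ArSyD's trained encoder; the factor names, the bipolar hypervectors, and the MAP-style binding scheme (elementwise multiplication for role-filler binding, addition for superposition) are all assumptions of a generic illustration. It shows how an object representation arises as a superposition of factor bindings, how a property is read back by unbinding, and how a controlled edit swaps one filler while keeping the representation's dimensionality fixed:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 1024  # hypervector dimension (illustrative)

# Random bipolar hypervectors for roles (generative factors) and fillers
# (factor values). In ArSyD these would be learned; here they are random.
roles = {name: rng.choice([-1, 1], D) for name in ["shape", "color", "size"]}
fillers = {v: rng.choice([-1, 1], D)
           for v in ["square", "circle", "red", "blue", "small"]}

def bind(a, b):
    # Elementwise product: binds a role to a filler (self-inverse for +/-1).
    return a * b

def encode(props):
    # Object = superposition of bound role-filler pairs; sign() keeps it bipolar.
    return np.sign(sum(bind(roles[r], fillers[v]) for r, v in props.items()))

def query(obj, role, candidates):
    # Unbind the role, then match the noisy result against candidate fillers.
    probe = bind(obj, roles[role])
    sims = {v: probe @ fillers[v] / D for v in candidates}
    return max(sims, key=sims.get)

obj = encode({"shape": "square", "color": "red", "size": "small"})
print(query(obj, "color", ["red", "blue"]))    # → red

# Controlled, interpretable edit: substitute one factor value, re-superpose.
edited = encode({"shape": "square", "color": "blue", "size": "small"})
print(query(edited, "color", ["red", "blue"]))  # → blue
```

Because binding and superposition preserve dimensionality, the edited object, each factor, and the original object all live in the same D-dimensional space, which is the property the abstract emphasizes.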
Related papers
- An Intrinsic Vector Heat Network [64.55434397799728]
This paper introduces a novel neural network architecture for learning tangent vector fields embedded in 3D.
We introduce a trainable vector heat diffusion module to spatially propagate vector-valued feature data across the surface.
We also demonstrate the effectiveness of our method on the useful industrial application of quadrilateral mesh generation.
arXiv Detail & Related papers (2024-06-14T00:40:31Z) - AdaContour: Adaptive Contour Descriptor with Hierarchical Representation [52.381359663689004]
Existing angle-based contour descriptors suffer from lossy representation for non-star shapes.
AdaCon is able to represent shapes more accurately and robustly than other descriptors.
arXiv Detail & Related papers (2024-04-12T07:30:24Z) - Object-centric architectures enable efficient causal representation learning [51.6196391784561]
We show that when the observations are of multiple objects, the generative function is no longer injective and disentanglement fails in practice.
We develop an object-centric architecture that leverages weak supervision from sparse perturbations to disentangle each object's properties.
This approach is more data-efficient in the sense that it requires significantly fewer perturbations than a comparable approach that encodes to a Euclidean space.
arXiv Detail & Related papers (2023-10-29T16:01:03Z) - Neural Constraint Satisfaction: Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement [75.9289887536165]
We present a hierarchical abstraction approach to uncover underlying entities.
We show how to learn a correspondence between intervening on states of entities in the agent's model and acting on objects in the environment.
We use this correspondence to develop a method for control that generalizes to different numbers and configurations of objects.
arXiv Detail & Related papers (2023-03-20T18:19:36Z) - Recursive Binding for Similarity-Preserving Hypervector Representations of Sequences [4.65149292714414]
A critical step for designing the HDC/VSA solutions is to obtain such representations from the input data.
Here, we propose their transformation to distributed representations that both preserve the similarity of identical sequence elements at nearby positions and are equivariant to the sequence shift.
The proposed transformation was experimentally investigated with symbolic strings used for modeling human perception of word similarity.
arXiv Detail & Related papers (2022-01-27T17:41:28Z) - Shift-Equivariant Similarity-Preserving Hypervector Representations of Sequences [0.8223798883838329]
We propose an approach for the formation of hypervectors of sequences.
Our methods represent the sequence elements by compositional hypervectors.
We experimentally explored the proposed representations using a diverse set of tasks with data in the form of symbolic strings.
arXiv Detail & Related papers (2021-12-31T14:29:12Z) - Computing on Functions Using Randomized Vector Representations [4.066849397181077]
We call this new function encoding and computing framework Vector Function Architecture (VFA).
Our analyses and results suggest that VFAs constitute a powerful new framework for representing and manipulating functions in distributed neural systems.
arXiv Detail & Related papers (2021-09-08T04:39:48Z) - Elliptical Ordinal Embedding [0.0]
Ordinal embedding aims at finding a low-dimensional representation of objects from a set of constraints of the form "item $j$ is closer to item $i$ than item $k$".
arXiv Detail & Related papers (2021-05-21T16:54:53Z) - Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization [80.55951673479237]
We propose a novel invariant deep compressible covariance pooling (IDCCP) method to address nuisance variations in aerial scene categorization.
We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:13:07Z) - Analyzing the Capacity of Distributed Vector Representations to Encode Spatial Information [0.0]
We focus on simple superposition and more complex, structured representations involving convolutive powers to encode spatial information.
In two experiments, we find upper bounds for the number of concepts that can effectively be stored in a single vector.
arXiv Detail & Related papers (2020-09-30T18:49:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.