Learning Visual-Semantic Subspace Representations for Propositional Reasoning
- URL: http://arxiv.org/abs/2405.16213v1
- Date: Sat, 25 May 2024 12:51:38 GMT
- Title: Learning Visual-Semantic Subspace Representations for Propositional Reasoning
- Authors: Gabriel Moreira, Alexander Hauptmann, Manuel Marques, João Paulo Costeira
- Abstract summary: We propose a novel approach for learning visual representations that conform to a specified semantic structure.
Our approach is based on a new nuclear norm-based loss.
We show that its minimum encodes the spectral geometry of the semantics in a subspace lattice.
- Score: 49.17165360280794
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning representations that capture rich semantic relationships and accommodate propositional calculus poses a significant challenge. Existing approaches are either contrastive, lacking theoretical guarantees, or fall short in effectively representing the partial orders inherent to rich visual-semantic hierarchies. In this paper, we propose a novel approach for learning visual representations that not only conform to a specified semantic structure but also facilitate probabilistic propositional reasoning. Our approach is based on a new nuclear norm-based loss. We show that its minimum encodes the spectral geometry of the semantics in a subspace lattice, where logical propositions can be represented by projection operators.
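The two ingredients the abstract names, a nuclear norm and projection operators onto subspaces, can be illustrated with a minimal NumPy sketch. This is not the paper's loss or training procedure, which the abstract does not spell out. The helper names (`nuclear_norm`, `projector`, `score`), the toy subspaces in R^3, the reading of conjunction as subspace intersection, and the scoring rule are all assumptions made for illustration.

```python
import numpy as np

def nuclear_norm(X):
    """Nuclear norm of X: the sum of its singular values."""
    return float(np.linalg.svd(X, compute_uv=False).sum())

def projector(A, tol=1e-10):
    """Orthogonal projector onto the column space of A."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    U = U[:, s > tol]          # keep directions with nonzero singular value
    return U @ U.T

def score(x, P):
    """Fraction of the squared norm of x lying in the subspace of P.
    Hypothetical scoring rule, used here only as an illustration."""
    return float(x @ P @ x) / float(x @ x)

# Hypothetical propositions as subspaces of R^3.
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # span{e1, e2}
B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # span{e2, e3}
P_A, P_B = projector(A), projector(B)

# These two projectors commute, so their product is the projector
# onto the intersection span{e2} (assumed here to model "A and B").
P_and = P_A @ P_B

x = np.array([1.0, 1.0, 0.0])  # hypothetical embedding
print(nuclear_norm(np.diag([3.0, 4.0])))
print(score(x, P_A), score(x, P_and))
```

The product trick only yields the intersection projector when the two projectors commute; for general subspaces the intersection has to be computed another way.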
Related papers
- Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts [68.48103545146127]
This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces.
We directly leverage natural language prompts and image captions to map latent directions.
Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models.
arXiv Detail & Related papers (2024-10-25T21:44:51Z)
- Optimal synthesis embeddings [1.565361244756411]
We introduce a word embedding composition method based on the intuitive idea that a fair embedding representation for a given set of words should satisfy.
We show that our approach excels in solving probing tasks designed to capture simple linguistic features of sentences.
arXiv Detail & Related papers (2024-06-10T18:06:33Z)
- Neural Semantic Parsing with Extremely Rich Symbolic Meaning Representations [7.774674200374255]
We introduce a novel compositional symbolic representation for concepts based on their position in the taxonomical hierarchy.
This representation provides richer semantic information and enhances interpretability.
Our experimental findings demonstrate that the taxonomical model, trained on much richer and complex meaning representations, is slightly subordinate in performance to the traditional model using the standard metrics for evaluation, but outperforms it when dealing with out-of-vocabulary concepts.
arXiv Detail & Related papers (2024-04-19T08:06:01Z)
- Grounded learning for compositional vector semantics [1.4344589271451351]
This work proposes a way for compositional distributional semantics to be implemented within a spiking neural network architecture.
We also describe a means of training word representations using labelled images.
arXiv Detail & Related papers (2024-01-10T22:12:34Z)
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- PROTOtypical Logic Tensor Networks (PROTO-LTN) for Zero Shot Learning [2.236663830879273]
Logic Tensor Networks (LTNs) are neuro-symbolic systems based on a differentiable, first-order logic grounded into a deep neural network.
We focus here on the subsumption or isOfClass predicate, which is fundamental to encode most semantic image interpretation tasks.
We propose a common isOfClass predicate, whose level of truth is a function of the distance between an object embedding and the corresponding class prototype.
arXiv Detail & Related papers (2022-06-26T18:34:07Z)
- Visual Superordinate Abstraction for Robust Concept Learning [80.15940996821541]
Concept learning constructs visual representations that are connected to linguistic semantics.
We ascribe the bottleneck to a failure of exploring the intrinsic semantic hierarchy of visual concepts.
We propose a visual superordinate abstraction framework for explicitly modeling semantic-aware visual subspaces.
arXiv Detail & Related papers (2022-05-28T14:27:38Z)
- Thirty years of Epistemic Specifications [8.339560855135575]
We extend disjunctive logic programs under the stable model semantics with modal constructs called subjective literals.
Using subjective literals, it is possible to check whether a regular literal is true in every or some stable models of the program.
Several attempts for capturing the intuitions underlying the language by means of a formal semantics were given.
arXiv Detail & Related papers (2021-08-17T15:03:10Z)
- Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.