Learning Visual-Semantic Subspace Representations for Propositional Reasoning
- URL: http://arxiv.org/abs/2405.16213v1
- Date: Sat, 25 May 2024 12:51:38 GMT
- Title: Learning Visual-Semantic Subspace Representations for Propositional Reasoning
- Authors: Gabriel Moreira, Alexander Hauptmann, Manuel Marques, João Paulo Costeira
- Abstract summary: We propose a novel approach for learning visual representations that conform to a specified semantic structure.
Our approach is based on a new nuclear norm-based loss.
We show that its minimum encodes the spectral geometry of the semantics in a subspace lattice.
- Score: 49.17165360280794
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning representations that capture rich semantic relationships and accommodate propositional calculus poses a significant challenge. Existing approaches are either contrastive, lacking theoretical guarantees, or fall short in effectively representing the partial orders inherent to rich visual-semantic hierarchies. In this paper, we propose a novel approach for learning visual representations that not only conform to a specified semantic structure but also facilitate probabilistic propositional reasoning. Our approach is based on a new nuclear norm-based loss. We show that its minimum encodes the spectral geometry of the semantics in a subspace lattice, where logical propositions can be represented by projection operators.
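To make the subspace picture concrete, below is a minimal NumPy sketch, using made-up stand-in features rather than the authors' learned embeddings, of the two ingredients the abstract names: the nuclear norm of a batch of representations, and a proposition ("x is an animal") realized as a projection operator onto a concept subspace.

```python
import numpy as np

rng = np.random.default_rng(0)

def orthonormal_basis(X):
    """Orthonormal basis for the column space of X (via thin SVD)."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, s > 1e-8]

# Stand-in features (in the paper these would be learned embeddings):
# the "dog" span is constructed to lie inside the "animal" span.
dog = rng.normal(size=(16, 3))                        # 3 "dog" samples in R^16
animal = np.hstack([dog, rng.normal(size=(16, 4))])   # contains the dog span

# Nuclear norm (sum of singular values), the quantity the loss is built on.
print("nuclear norm of dog batch:", np.linalg.norm(dog, "nuc"))

# The proposition "x is an animal" as a projection onto the animal subspace.
U = orthonormal_basis(animal)
P_animal = U @ U.T

x = dog[:, 0]
score = np.linalg.norm(P_animal @ x) ** 2 / np.linalg.norm(x) ** 2
print("degree to which 'x is an animal' holds:", score)  # ~1.0 here
```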
Related papers
- Exploring a Principled Framework for Deep Subspace Clustering [9.347670574036563]
We present a Principled fRamewOrk for Deep Subspace Clustering (PRO-DSC).
PRO-DSC is designed to learn structured representations and self-expressive coefficients in a unified manner.
We prove that, under certain conditions, the learned optimal representations lie on a union of subspaces.
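For context on the self-expressive ingredient mentioned above, here is a generic sketch of the classical self-expressiveness objective used throughout deep subspace clustering, not PRO-DSC's exact formulation: each representation is reconstructed as a sparse combination of the others, and the learned coefficients are later used to cluster points into subspaces.

```python
import torch

torch.manual_seed(0)
Z = torch.randn(32, 16)                       # stand-in learned representations
C = torch.nn.Parameter(torch.zeros(32, 32))   # self-expressive coefficients
opt = torch.optim.Adam([C], lr=0.1)

for _ in range(200):
    C_off = C - torch.diag(torch.diag(C))     # forbid the trivial C = I solution
    loss = ((Z - C_off @ Z) ** 2).sum() + 0.1 * C_off.abs().sum()  # + sparsity
    opt.zero_grad(); loss.backward(); opt.step()
# Spectral clustering on |C| + |C|^T would then recover the subspace clusters.
```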
arXiv Detail & Related papers (2025-03-21T16:38:37Z)
- Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts [68.48103545146127]
This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces.
We directly leverage natural language prompts and image captions to map latent directions.
Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models.
arXiv Detail & Related papers (2024-10-25T21:44:51Z)
- Optimal synthesis embeddings [1.565361244756411]
We introduce a word embedding composition method based on intuitive ideas about the properties that a fair embedding representation for a given set of words should satisfy.
We show that our approach excels in solving probing tasks designed to capture simple linguistic features of sentences.
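As a worked instance of this idea (an illustrative baseline, not necessarily the paper's estimator): if the property a composed embedding should satisfy is minimizing the total squared distance to the word vectors it summarizes, the unique optimum is their mean.

```python
import numpy as np

words = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy word vectors
composed = words.mean(axis=0)                            # least-squares optimum

def total_sq_dist(z, W):
    return float(((W - z) ** 2).sum())

print(total_sq_dist(composed, words))        # minimal value
print(total_sq_dist(composed + 0.1, words))  # any perturbation is strictly worse
```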
arXiv Detail & Related papers (2024-06-10T18:06:33Z)
- Learning Discrete Concepts in Latent Hierarchical Models [73.01229236386148]
Learning concepts from natural high-dimensional data holds promise for building human-aligned and interpretable machine learning models.
We formalize concepts as discrete latent causal variables that are related via a hierarchical causal model.
We substantiate our theoretical claims with synthetic data experiments.
arXiv Detail & Related papers (2024-06-01T18:01:03Z)
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects symbolically defined knowledge about the output structure into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
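A standard concrete instance of the semantic loss, for the constraint "exactly one label is true" (a textbook case; the paper treats general symbolic structure):

```python
import torch

def exactly_one_semantic_loss(logits):
    """Negative log-probability, under independent Bernoulli outputs,
    that exactly one label is true."""
    p = torch.sigmoid(logits)                    # per-label probabilities
    log_p = torch.log(p + 1e-12)
    log_not_p = torch.log(1 - p + 1e-12)
    total_log_not = log_not_p.sum(dim=-1, keepdim=True)
    # log[ p_i * prod_{j != i} (1 - p_j) ] for each i
    per_i = log_p + (total_log_not - log_not_p)
    return -torch.logsumexp(per_i, dim=-1)       # -log P(constraint holds)

logits = torch.tensor([[3.0, -2.0, -1.5]])
print(exactly_one_semantic_loss(logits))  # low loss: constraint nearly satisfied
```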
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
- Neural Semantic Parsing with Extremely Rich Symbolic Meaning Representations [7.774674200374255]
We introduce a novel compositional symbolic representation for concepts based on their position in the taxonomical hierarchy.
This representation provides richer semantic information and enhances interpretability.
Our experimental findings demonstrate that the taxonomical model, trained on much richer and more complex meaning representations, performs slightly worse than the traditional model under the standard evaluation metrics, but outperforms it when dealing with out-of-vocabulary concepts.
arXiv Detail & Related papers (2024-04-19T08:06:01Z)
- Discovering Abstract Symbolic Relations by Learning Unitary Group Representations [7.303827428956944]
We investigate a principled approach for symbolic operation completion (SOC).
SOC poses a unique challenge in modeling abstract relationships between discrete symbols.
We demonstrate that SOC can be efficiently solved by a minimal model - a bilinear map - with a novel factorized architecture.
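A minimal sketch of the bilinear-map recipe, on Z_5 addition as a stand-in task and with a plain, unfactorized bilinear tensor for brevity; the hyperparameters and architecture here are illustrative, not the paper's.

```python
import torch

n, d = 5, 8
E = torch.nn.Parameter(torch.randn(n, d) * 0.1)      # symbol embeddings
W = torch.nn.Parameter(torch.randn(d, d, d) * 0.1)   # bilinear tensor
opt = torch.optim.Adam([E, W], lr=0.05)

# All cells of the operation table except one held-out entry.
cells = [(a, b, (a + b) % n) for a in range(n) for b in range(n) if (a, b) != (2, 3)]
a, b, c = (torch.tensor(t) for t in zip(*cells))

for _ in range(500):
    pred = torch.einsum("bi,ijk,bj->bk", E[a], W, E[b])  # map (a, b) to embedding
    loss = torch.nn.functional.cross_entropy(pred @ E.T, c)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():  # complete the held-out cell; 2 + 3 = 0 (mod 5)
    pred = torch.einsum("i,ijk,j->k", E[2], W, E[3])
    print("predicted completion:", int((pred @ E.T).argmax()))  # may recover 0
```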
arXiv Detail & Related papers (2024-02-26T20:18:43Z)
- Grounded learning for compositional vector semantics [1.4344589271451351]
This work proposes a way for compositional distributional semantics to be implemented within a spiking neural network architecture.
We also describe a means of training word representations using labelled images.
arXiv Detail & Related papers (2024-01-10T22:12:34Z)
- Provable Compositional Generalization for Object-Centric Learning [55.658215686626484]
Learning representations that generalize to novel compositions of known concepts is crucial for bridging the gap between human and machine perception.
We show that autoencoders that satisfy structural assumptions on the decoder and enforce encoder-decoder consistency will learn object-centric representations that provably generalize compositionally.
arXiv Detail & Related papers (2023-10-09T01:18:07Z)
- LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
Via a fuzzy-logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, enabling logic-induced network training.
These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
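As an illustration of the fuzzy grounding step, here is a small sketch using the Reichenbach relaxation of implication, one common choice and not necessarily LOGICSEG's, to turn a hierarchy rule such as dog(x) implies animal(x) into a differentiable penalty on per-pixel class probabilities.

```python
import torch

def implication_loss(p_child, p_parent):
    """Reichenbach relaxation: truth(A -> B) = 1 - P(A) * (1 - P(B))."""
    truth = 1.0 - p_child * (1.0 - p_parent)
    return (1.0 - truth).mean()      # penalize pixels where the rule is violated

p_dog = torch.tensor([0.9, 0.1, 0.8])      # per-pixel probabilities (toy)
p_animal = torch.tensor([0.95, 0.9, 0.2])  # third pixel violates dog -> animal
print(implication_loss(p_dog, p_animal))   # gradients flow to both predictions
```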
arXiv Detail & Related papers (2023-09-24T05:43:19Z)
- Towards Understanding the Mechanism of Contrastive Learning via Similarity Structure: A Theoretical Analysis [10.29814984060018]
We consider a kernel-based contrastive learning framework termed Kernel Contrastive Learning (KCL).
We introduce a formulation of the similarity structure of learned representations by utilizing a statistical dependency viewpoint.
We show a new upper bound on the classification error of a downstream task, demonstrating that our theory is consistent with the empirical success of contrastive learning.
arXiv Detail & Related papers (2023-04-01T21:53:29Z)
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- Equivariant Representation Learning via Class-Pose Decomposition [17.032782230538388]
We introduce a general method for learning representations that are equivariant to symmetries of data.
The components of the representation semantically correspond to intrinsic data classes and poses, respectively.
Results show that our representations capture the geometry of data and outperform other equivariant representation learning frameworks.
arXiv Detail & Related papers (2022-07-07T06:55:52Z)
- PROTOtypical Logic Tensor Networks (PROTO-LTN) for Zero Shot Learning [2.236663830879273]
Logic Tensor Networks (LTNs) are neuro-symbolic systems based on a differentiable first-order logic grounded in a deep neural network.
We focus here on the subsumption, or isOfClass, predicate, which is fundamental to encoding most semantic image interpretation tasks.
We propose a common isOfClass predicate, whose level of truth is a function of the distance between an object embedding and the corresponding class prototype.
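A minimal sketch of such a distance-based predicate, under the assumption (our reading of the abstract, with a hypothetical Gaussian decay and sharpness parameter alpha) that the truth value falls off with the squared distance to the class prototype:

```python
import torch

def is_of_class(embedding, prototype, alpha=1.0):
    """Fuzzy truth in [0, 1]; equals 1 when the embedding sits on the prototype."""
    return torch.exp(-alpha * (embedding - prototype).pow(2).sum())

obj = torch.tensor([0.9, 0.1])
proto_cat = torch.tensor([1.0, 0.0])
print(is_of_class(obj, proto_cat))  # near 1: "obj isOfClass cat" largely holds
```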
arXiv Detail & Related papers (2022-06-26T18:34:07Z)
- Visual Superordinate Abstraction for Robust Concept Learning [80.15940996821541]
Concept learning constructs visual representations that are connected to linguistic semantics.
We ascribe the bottleneck to a failure to explore the intrinsic semantic hierarchy of visual concepts.
We propose a visual superordinate abstraction framework for explicitly modeling semantic-aware visual subspaces.
arXiv Detail & Related papers (2022-05-28T14:27:38Z)
- A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning [55.048010996144036]
We show that, under a certain noise assumption, the linear spectral features of the corresponding Markov transition operator can be obtained in closed form for free.
We propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and completes optimistic exploration for representation learning by exploiting the structure of the noise.
arXiv Detail & Related papers (2021-11-22T19:24:57Z)
- Thirty years of Epistemic Specifications [8.339560855135575]
We extend disjunctive logic programs under the stable model semantics with modal constructs called subjective literals.
Using subjective literals, it is possible to check whether a regular literal is true in every or some stable models of the program.
Several attempts have been made to capture the intuitions underlying the language by means of a formal semantics.
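The semantics of the K ("known") and M ("may be") subjective literals can be illustrated with a toy checker over a given set of stable models (a plain Python illustration, not an ASP solver):

```python
# Hypothetical stable models of some program.
stable_models = [{"a", "b"}, {"a", "c"}]

def K(literal, models):
    """K l: the literal is true in every stable model."""
    return all(literal in m for m in models)

def M(literal, models):
    """M l: the literal is true in at least one stable model."""
    return any(literal in m for m in models)

print(K("a", stable_models))  # True: a holds in every stable model
print(M("b", stable_models))  # True: b holds in some stable model
print(K("b", stable_models))  # False
```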
arXiv Detail & Related papers (2021-08-17T15:03:10Z)
- The Low-Dimensional Linear Geometry of Contextualized Word Representations [27.50785941238007]
We study the linear geometry of contextualized word representations in ELMo and BERT.
We show that a variety of linguistic features are encoded in low-dimensional subspaces.
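A sketch of the probing recipe this suggests, on synthetic stand-in vectors rather than ELMo/BERT activations: project representations onto their top-k principal directions and check how quickly a linear probe for a feature saturates with k.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 64))
X[:, 0] *= 6.0                      # feature-bearing directions carry
X[:, 1] *= 3.0                      # most of the variance (toy construction)
y = X[:, 0] + X[:, 1] > 0           # the "linguistic feature"

Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
for k in (1, 2, 16):
    Z = Xc @ Vt[:k].T               # keep only the top-k principal directions
    w, *_ = np.linalg.lstsq(Z, np.where(y, 1.0, -1.0), rcond=None)
    acc = ((Z @ w > 0) == y).mean()
    print(f"rank {k}: probe accuracy {acc:.2f}")  # saturates by k = 2
```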
arXiv Detail & Related papers (2021-05-15T00:58:08Z)
- Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
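The closed-form idea, as described, reduces to an eigendecomposition of the weights that map the latent code into the generator; a minimal sketch with stand-in first-layer weights (the full method and its handling of multiple layers are omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(1024, 512))            # stand-in first-layer GAN weights

eigvals, eigvecs = np.linalg.eigh(A.T @ A)  # symmetric, so eigh applies
directions = eigvecs[:, ::-1][:, :5]        # top-5 candidate semantic directions

z = rng.normal(size=512)
z_edited = z + 3.0 * directions[:, 0]       # move along the strongest direction
```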
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.