Structured (De)composable Representations Trained with Neural Networks
- URL: http://arxiv.org/abs/2007.03325v1
- Date: Tue, 7 Jul 2020 10:20:31 GMT
- Title: Structured (De)composable Representations Trained with Neural Networks
- Authors: Graham Spinks, Marie-Francine Moens
- Abstract summary: A template representation refers to the generic representation that captures the characteristics of an entire class.
The proposed technique uses end-to-end deep learning to learn structured and composable representations from input images and discrete labels.
We prove that the representations have a clear structure that allows them to be decomposed into factors that represent classes and environments.
- Score: 21.198279941828112
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The paper proposes a novel technique for representing templates and instances
of concept classes. A template representation refers to the generic
representation that captures the characteristics of an entire class. The
proposed technique uses end-to-end deep learning to learn structured and
composable representations from input images and discrete labels. The obtained
representations are based on distance estimates between the distributions given
by the class label and those given by contextual information, which are modeled
as environments. We prove that the representations have a clear structure
that allows them to be decomposed into factors that represent classes
and environments. We evaluate our novel technique on classification and
retrieval tasks involving different modalities (visual and language data).
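As a rough illustration of the idea that a class representation can be built from distance estimates to a set of environments, here is a minimal sketch. It is not the authors' method: the abstract does not name the distance, and the paper learns its estimates with neural networks, whereas this sketch swaps in the Euclidean distance between sample means. The function names, the toy Gaussian data, and the choice of three environments are all assumptions made for the example.

```python
import math
import random


def mean_vector(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def mean_distance(xs, ys):
    """Distance between sample means: a crude stand-in for a learned estimate."""
    return math.dist(mean_vector(xs), mean_vector(ys))


def template_representation(class_samples, environments):
    """One distance estimate per environment, collected into a vector."""
    return [mean_distance(class_samples, env) for env in environments]


def gaussian_samples(rng, center, n=200, dim=4):
    """Hypothetical toy data: n points around `center` in every dimension."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
# Three assumed environments and two assumed classes in a 4-d feature space.
environments = [gaussian_samples(rng, c) for c in (-2.0, 0.0, 2.0)]
class_a = gaussian_samples(rng, -2.0)
class_b = gaussian_samples(rng, 2.0)

rep_a = template_representation(class_a, environments)
rep_b = template_representation(class_b, environments)
print([round(d, 2) for d in rep_a])
print([round(d, 2) for d in rep_b])
```

Each entry of the resulting vector is tied to one environment, so the vector has a fixed, interpretable layout: a class that resembles the first environment gets its smallest entry in the first position.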
Related papers
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
- Leveraging Open-Vocabulary Diffusion to Camouflaged Instance Segmentation [59.78520153338878]
Text-to-image diffusion techniques have shown exceptional capability of producing high-quality images from text descriptions.
We propose a method built upon a state-of-the-art diffusion model, empowered by open-vocabulary to learn multi-scale textual-visual features for camouflaged object representations.
arXiv Detail & Related papers (2023-12-29T07:59:07Z)
- Neural Representations Reveal Distinct Modes of Class Fitting in Residual Convolutional Networks [5.1271832547387115]
We leverage probabilistic models of neural representations to investigate how residual networks fit classes.
We find that classes in the investigated models are not fitted in a uniform way.
We show that the uncovered structure in neural representations correlates with robustness of training examples and adversarial memorization.
arXiv Detail & Related papers (2022-12-01T18:55:58Z)
- Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition [94.04041301504567]
Few-shot visual recognition refers to recognizing novel visual concepts from a few labeled instances.
We propose a novel metric-based meta-learning framework termed instance-adaptive class representation learning network (ICRL-Net) for few-shot visual recognition.
arXiv Detail & Related papers (2022-09-07T10:00:18Z)
- Cross-Modal Discrete Representation Learning [73.68393416984618]
We present a self-supervised learning framework that learns a representation that captures finer levels of granularity across different modalities.
Our framework relies on a discretized embedding space created via vector quantization that is shared across different modalities.
arXiv Detail & Related papers (2021-06-10T00:23:33Z)
- DirectProbe: Studying Representations without Classifiers [21.23284793831221]
DirectProbe studies the geometry of a representation by building upon the notion of a version space for a task.
Experiments with several linguistic tasks and contextualized embeddings show that, even without training classifiers, DirectProbe can shed light on how an embedding space represents labels.
arXiv Detail & Related papers (2021-04-13T02:40:26Z)
- High-dimensional distributed semantic spaces for utterances [0.2907403645801429]
This paper describes a model for high-dimensional representation for utterance and text level data.
It is based on a mathematically principled and behaviourally plausible approach to representing linguistic information.
The paper shows how the implemented model is able to represent a broad range of linguistic features in a common integral framework of fixed dimensionality.
arXiv Detail & Related papers (2021-04-01T12:09:47Z)
- A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
arXiv Detail & Related papers (2020-09-25T12:01:53Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- The Immersion of Directed Multi-graphs in Embedding Fields. Generalisations [0.0]
This paper outlines a generalised model for representing hybrid-categorical, symbolic, perceptual-sensory and perceptual-latent data.
This variety of representation is currently used by various machine-learning models in computer vision and NLP/NLU.
It is achieved by endowing a directed relational-Typed Multi-Graph with at least some edge attributes which represent the embeddings from various latent spaces.
arXiv Detail & Related papers (2020-04-28T09:28:08Z)