The pursuit of beauty: Converting image labels to meaningful vectors
- URL: http://arxiv.org/abs/2008.00665v1
- Date: Mon, 3 Aug 2020 06:33:11 GMT
- Title: The pursuit of beauty: Converting image labels to meaningful vectors
- Authors: Savvas Karatsiolis and Andreas Kamilaris
- Abstract summary: This paper introduces a method, called Occlusion-based Latent Representations (OLR), for converting image labels to meaningful representations that capture a significant amount of data semantics.
Besides being informationally rich, these representations compose a disentangled low-dimensional latent space where each image label is encoded into a separate vector.
We evaluate the quality of these representations in a series of experiments whose results suggest that the proposed model can capture data concepts and discover data interrelations.
- Score: 2.741266294612776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A challenge for the computer vision community is to understand the semantics
of an image, in order to allow image reconstruction based on existing
high-level features or to better analyze (semi-)labelled datasets. Towards
addressing this challenge, this paper introduces a method, called
Occlusion-based Latent Representations (OLR), for converting image labels to
meaningful representations that capture a significant amount of data semantics.
Besides being informationally rich, these representations compose a disentangled
low-dimensional latent space where each image label is encoded into a separate
vector. We evaluate the quality of these representations in a series of
experiments whose results suggest that the proposed model can capture data
concepts and discover data interrelations.
Related papers
- Wavelet-based Unsupervised Label-to-Image Translation [9.339522647331334]
We propose a new unsupervised paradigm for semantic image synthesis (SIS), called USIS, that makes use of a self-supervised segmentation loss and whole-image wavelet-based discrimination.
We test our methodology on 3 challenging datasets and demonstrate its ability to bridge the performance gap between paired and unpaired models.
arXiv Detail & Related papers (2023-05-16T17:48:44Z)
- Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels [70.36722026729859]
We propose a dual-perspective semantic-aware representation blending (DSRB) that blends multi-granularity category-specific semantic representations across different images.
The proposed DSRB consistently outperforms current state-of-the-art algorithms on all known-label proportion settings.
arXiv Detail & Related papers (2022-05-26T00:33:44Z)
- Graph Attention Transformer Network for Multi-Label Image Classification [50.0297353509294]
We propose a general framework for multi-label image classification that can effectively mine complex inter-label relationships.
Our proposed methods can achieve state-of-the-art performance on three datasets.
arXiv Detail & Related papers (2022-03-08T12:39:05Z)
- Region-level Active Learning for Cluttered Scenes [60.93811392293329]
We introduce a new strategy that subsumes previous Image-level and Object-level approaches into a generalized, Region-level approach.
We show that this approach significantly decreases labeling effort and improves rare object search on realistic data with inherent class-imbalance and cluttered scenes.
arXiv Detail & Related papers (2021-08-20T14:02:38Z)
- Multi-layered Semantic Representation Network for Multi-label Image Classification [8.17894017454724]
Multi-label image classification (MLIC) is a fundamental and practical task, which aims to assign multiple possible labels to an image.
In recent years, many deep convolutional neural network (CNN) based approaches have been proposed which model label correlations.
This paper advances this research direction by improving the modeling of label correlations and the learning of semantic representations.
arXiv Detail & Related papers (2021-06-22T08:04:22Z)
- Semantic Diversity Learning for Zero-Shot Multi-label Classification [14.480713752871523]
This study introduces end-to-end model training for multi-label zero-shot learning.
We propose to use an embedding matrix having principal embedding vectors trained using a tailored loss function.
In addition, during training, we suggest up-weighting, in the loss function, image samples that present higher semantic diversity, to encourage diversity in the embedding matrix.
arXiv Detail & Related papers (2021-05-12T19:39:07Z)
- General Multi-label Image Classification with Transformers [30.58248625606648]
We propose the Classification Transformer (C-Tran) to exploit the complex dependencies among visual features and labels.
A key ingredient of our method is a label mask training objective that uses a ternary encoding scheme to represent the state of the labels.
Our model shows state-of-the-art performance on challenging datasets such as COCO and Visual Genome.
arXiv Detail & Related papers (2020-11-27T23:20:35Z)
- Using Text to Teach Image Retrieval [47.72498265721957]
We build on the concept of image manifold to represent the feature space of images, learned via neural networks, as a graph.
We augment the manifold samples with geometrically aligned text, thereby using a plethora of sentences to teach us about images.
The experimental results show that the joint embedding manifold is a robust representation, making it a better basis for image retrieval.
arXiv Detail & Related papers (2020-11-19T16:09:14Z)
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that the availability of such external semantic information, in conjunction with the visual semantics from images, boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
- Learning Representations by Predicting Bags of Visual Words [55.332200948110895]
Self-supervised representation learning aims to learn convnet-based image representations from unlabeled data.
Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions.
arXiv Detail & Related papers (2020-02-27T16:45:25Z)