Adversarial Learning for Personalized Tag Recommendation
- URL: http://arxiv.org/abs/2004.00698v1
- Date: Wed, 1 Apr 2020 20:41:41 GMT
- Title: Adversarial Learning for Personalized Tag Recommendation
- Authors: Erik Quintanilla, Yogesh Rawat, Andrey Sakryukin, Mubarak Shah, Mohan Kankanhalli
- Abstract summary: We propose an end-to-end deep network which can be trained on large-scale datasets.
Joint training of user-preference and visual encoding allows the network to efficiently integrate the visual preference with tagging behavior.
We demonstrate the effectiveness of the proposed model on two different large-scale and publicly available datasets.
- Score: 61.76193196463919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We have recently seen great progress in image classification due to the
success of deep convolutional neural networks and the availability of
large-scale datasets. Most of the existing work focuses on single-label image
classification. However, there are usually multiple tags associated with an
image. Existing work on multi-label classification is mainly based on
lab-curated labels. Humans assign tags to their images differently, mainly
based on their interests and personal tagging behavior. In this paper, we
address the problem of personalized tag recommendation and propose an
end-to-end deep network which can be trained on large-scale datasets. The
user-preference is learned within the network in an unsupervised way where the
network performs joint optimization for user-preference and visual encoding. A
joint training of user-preference and visual encoding allows the network to
efficiently integrate the visual preference with tagging behavior for a better
user recommendation. In addition, we propose the use of adversarial learning,
which pushes the network to predict tags resembling user-generated tags. We
demonstrate the effectiveness of the proposed model on two different
large-scale and publicly available datasets, YFCC100M and NUS-WIDE. The
proposed method achieves significantly better performance on both datasets
when compared to the baselines and other state-of-the-art methods. The code is
publicly available at https://github.com/vyzuer/ALTReco.
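The two ideas in the abstract, joint use of visual and user-preference features and an adversarial term that rewards user-like tag predictions, can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). Everything here is an assumption made for illustration: the linear tag predictor, the linear discriminator, the fixed user vector (the paper learns user preference inside the network), the `adv_weight` value, and all function names.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bce(p, target, eps=1e-7):
    """Binary cross-entropy for one probability, clamped for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def predict_tags(visual, user, w_visual, w_user):
    """Per-tag probability from a visual vector plus a user-preference vector.

    Hypothetical stand-in for the joint encoder: each tag has one weight row
    per input, and the two projections are summed before the sigmoid.
    """
    return [sigmoid(dot(wv, visual) + dot(wu, user))
            for wv, wu in zip(w_visual, w_user)]


def discriminator(tag_vector, w_disc, b_disc):
    """Probability that a tag vector is user-generated rather than predicted."""
    return sigmoid(dot(w_disc, tag_vector) + b_disc)


def tagging_loss(pred, truth):
    """Mean multi-label cross-entropy between predicted and true tags."""
    return sum(bce(p, t) for p, t in zip(pred, truth)) / len(truth)


def generator_loss(pred, truth, w_disc, b_disc, adv_weight=0.1):
    """Tagging loss plus an adversarial term that rewards fooling the discriminator."""
    adv = bce(discriminator(pred, w_disc, b_disc), 1.0)
    return tagging_loss(pred, truth) + adv_weight * adv


def discriminator_loss(pred, truth, w_disc, b_disc):
    """Discriminator should label user tags as 1 and predicted tags as 0."""
    return (bce(discriminator(truth, w_disc, b_disc), 1.0)
            + bce(discriminator(pred, w_disc, b_disc), 0.0))


random.seed(0)
NUM_TAGS, VIS_DIM, USER_DIM = 4, 3, 2
w_visual = [[random.uniform(-1, 1) for _ in range(VIS_DIM)] for _ in range(NUM_TAGS)]
w_user = [[random.uniform(-1, 1) for _ in range(USER_DIM)] for _ in range(NUM_TAGS)]
w_disc = [random.uniform(-1, 1) for _ in range(NUM_TAGS)]
b_disc = 0.0

visual = [0.5, -0.2, 0.9]      # toy image feature
user = [1.0, 0.0]              # toy user-preference vector (assumed given)
truth = [1.0, 0.0, 0.0, 1.0]   # tags this user actually assigned

pred = predict_tags(visual, user, w_visual, w_user)
g_loss = generator_loss(pred, truth, w_disc, b_disc)
d_loss = discriminator_loss(pred, truth, w_disc, b_disc)
print("predicted tag probabilities:", [round(p, 3) for p in pred])
print("generator loss:", round(g_loss, 4), "discriminator loss:", round(d_loss, 4))
```

In a real training loop the two losses would be minimized in alternation, with gradients flowing into the predictor from both the tagging term and the adversarial term; the sketch only evaluates the losses once to show how they are composed.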
Related papers
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Improving Model Training via Self-learned Label Representations [5.969349640156469]
We show that more sophisticated label representations are better for classification than the usual one-hot encoding.
We propose Learning with Adaptive Labels (LwAL) algorithm, which simultaneously learns the label representation while training for the classification task.
Our algorithm introduces negligible additional parameters and has a minimal computational overhead.
arXiv Detail & Related papers (2022-09-09T21:10:43Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
To better model the relationship among images and classes from different datasets, we extend the pixel-level embeddings via cross-dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Learning to Focus: Cascaded Feature Matching Network for Few-shot Image Recognition [38.49419948988415]
Deep networks can learn to accurately recognize objects of a category by training on a large number of images.
A meta-learning challenge known as the low-shot image recognition task arises when only a few annotated images are available for learning a recognition model for one category.
Our method, called Cascaded Feature Matching Network (CFMN), is proposed to solve this problem.
Experiments for few-shot learning on two standard datasets, miniImageNet and Omniglot, have confirmed the effectiveness of our method.
arXiv Detail & Related papers (2021-01-13T11:37:28Z)
- Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition [75.44233392355711]
The KGGR framework exploits prior knowledge of statistical label correlations with deep neural networks.
It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence.
Then, it introduces the label semantics to guide learning semantic-specific features.
It exploits a graph propagation network to explore graph node interactions.
arXiv Detail & Related papers (2020-09-20T15:05:29Z)
- Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge [36.04579549557464]
Multi-label zero-shot classification aims to predict multiple unseen class labels for an input image.
This paper introduces a novel multi-label zero-shot classification framework by learning to transfer from external knowledge.
arXiv Detail & Related papers (2020-07-30T17:26:46Z)
- RGB-based Semantic Segmentation Using Self-Supervised Depth Pre-Training [77.62171090230986]
We propose an easily scalable and self-supervised technique that can be used to pre-train any semantic RGB segmentation method.
In particular, our pre-training approach makes use of automatically generated labels that can be obtained using depth sensors.
We show how our proposed self-supervised pre-training with HN-labels can be used to replace ImageNet pre-training.
arXiv Detail & Related papers (2020-02-06T11:16:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.