Related papers: Label2Label: A Language Modeling Framework for Multi-Attribute Learning

URL: http://arxiv.org/abs/2207.08677v1
Date: Mon, 18 Jul 2022 15:12:33 GMT
Title: Label2Label: A Language Modeling Framework for Multi-Attribute Learning
Authors: Wanhua Li, Zhexuan Cao, Jianjiang Feng, Jie Zhou, Jiwen Lu
Abstract summary: Label2Label is the first attempt for multi-attribute prediction from the perspective of language modeling. Inspired by the success of pre-training language models in NLP, Label2Label introduces an image-conditioned masked language model. Our intuition is that the instance-wise attribute relations are well grasped if the neural net can infer the missing attributes based on the context and the remaining attribute hints.
Score: 93.68058298766739
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Objects are usually associated with multiple attributes, and these attributes often exhibit high correlations. Modeling complex relationships between attributes poses a great challenge for multi-attribute learning. This paper proposes a simple yet generic framework named Label2Label to exploit the complex attribute correlations. Label2Label is the first attempt for multi-attribute prediction from the perspective of language modeling. Specifically, it treats each attribute label as a "word" describing the sample. As each sample is annotated with multiple attribute labels, these "words" will naturally form an unordered but meaningful "sentence", which depicts the semantic information of the corresponding sample. Inspired by the remarkable success of pre-training language models in NLP, Label2Label introduces an image-conditioned masked language model, which randomly masks some of the "word" tokens from the label "sentence" and aims to recover them based on the masked "sentence" and the context conveyed by image features. Our intuition is that the instance-wise attribute relations are well grasped if the neural net can infer the missing attributes based on the context and the remaining attribute hints. Label2Label is conceptually simple and empirically powerful. Without incorporating task-specific prior knowledge and highly specialized network designs, our approach achieves state-of-the-art results on three different multi-attribute learning tasks, compared to highly customized domain-specific methods. Code is available at https://github.com/Li-Wanhua/Label2Label.
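The masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above). The `[MASK]` placeholder, the function name `mask_labels`, the masking probability, and the example attribute names are all assumptions made for illustration. The sketch only covers the random masking of label "words"; the image-conditioned model that recovers them is not shown.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def mask_labels(labels, mask_prob=0.15, rng=None):
    """Randomly hide attribute labels from a label "sentence".

    Returns the masked sentence and a dict mapping each masked position
    to the original label the model would be asked to recover.
    mask_prob is an assumed value; the abstract does not state one.
    """
    if rng is None:
        rng = random.Random()
    masked, targets = [], {}
    for position, label in enumerate(labels):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[position] = label
        else:
            masked.append(label)
    return masked, targets


# Hypothetical attribute labels for one face image.
sentence = ["smiling", "wearing_hat", "young", "eyeglasses"]
masked_sentence, recovery_targets = mask_labels(sentence, mask_prob=0.5)
print(masked_sentence, recovery_targets)
```

In a full training loop, the masked sentence and the image features would be fed to the network, and the loss would be computed on the positions listed in `recovery_targets`.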

Related papers

LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification [63.07563443280147]
We propose a novel framework named LATex for AG-ReID. It adopts prompt-tuning strategies to leverage attribute-based text knowledge and thereby improve AG-ReID.
arXiv Detail & Related papers (2025-03-31T04:47:05Z)
Adaptive Prototype Model for Attribute-based Multi-label Few-shot Action Recognition [11.316708754749103]
In real-world action recognition systems, incorporating more attributes helps achieve a more comprehensive understanding of human behavior. We propose a novel method, the Adaptive Attribute Prototype Model (AAPM), for human action recognition, which captures rich action-relevant attribute information. Our AAPM achieves state-of-the-art performance in both attribute-based multi-label few-shot action recognition and single-label few-shot action recognition.
arXiv Detail & Related papers (2025-02-18T06:39:28Z)
AE-smnsMLC: Multi-Label Classification with Semantic Matching and Negative Label Sampling for Product Attribute Value Extraction [42.79022954630978]
Product attribute value extraction plays an important role in many real-world e-commerce applications such as product search and recommendation. Previous methods treat it as a sequence labeling task, which requires additional annotation of the positions of values in the product text. We propose a classification model with semantic matching and negative label sampling for attribute value extraction.
arXiv Detail & Related papers (2023-10-11T02:22:28Z)
Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
We employ Self-Supervised Learning (SSL) in the model learning process and design a novel self-supervised Relation of Relation (R2) classification task. We propose a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as optimization targets. We exploit external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
POAR: Towards Open Vocabulary Pedestrian Attribute Recognition [39.399286703315745]
Pedestrian attribute recognition (PAR) aims to predict the attributes of a target pedestrian in a surveillance system. It is impossible to exhaust all pedestrian attributes in the real world. We develop a novel pedestrian open-attribute recognition framework.
arXiv Detail & Related papers (2023-03-26T06:59:23Z)
Label Semantics for Few Shot Named Entity Recognition [68.01364012546402]
We study the problem of few shot learning for named entity recognition. We leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors. Our model learns to match the representations of named entities computed by the first encoder with label representations computed by the second encoder.
arXiv Detail & Related papers (2022-03-16T23:21:05Z)
Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels [86.17081952197788]
We propose to blend category-specific representations across different images to transfer information from known labels to complement unknown labels. Experiments on the MS-COCO, Visual Genome, and Pascal VOC 2007 datasets show that the proposed SARB framework obtains superior performance over current leading competitors.
arXiv Detail & Related papers (2022-03-04T07:56:16Z)
Label Mask for Multi-Label Text Classification [6.742627397194543]
We propose a Label Mask multi-label text classification model (LM-MTC), which is inspired by the cloze questions used in language models. On this basis, we assign a different token to each potential label and randomly mask the token with a certain probability to build a label-based Masked Language Model (MLM).
arXiv Detail & Related papers (2021-06-18T11:54:33Z)
Low-Resource Task-Oriented Semantic Parsing via Intrinsic Modeling [65.51280121472146]
We exploit what we intrinsically know about ontology labels to build efficient semantic parsing models. Our model is highly efficient on a low-resource benchmark derived from TOPv2.
arXiv Detail & Related papers (2021-04-15T04:01:02Z)
Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge. It can learn transferable knowledge from a subset of categories with limited labeled data. It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)
Multi-Label Text Classification using Attention-based Graph Neural Network [0.0]
A graph attention network-based model is proposed to capture the attentive dependency structure among the labels. The proposed model achieves similar or better performance compared to the previous state-of-the-art models.
arXiv Detail & Related papers (2020-03-22T17:12:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.