Category Query Learning for Human-Object Interaction Classification
- URL: http://arxiv.org/abs/2303.14005v1
- Date: Fri, 24 Mar 2023 13:59:58 GMT
- Title: Category Query Learning for Human-Object Interaction Classification
- Authors: Chi Xie, Fangao Zeng, Yue Hu, Shuang Liang and Yichen Wei
- Abstract summary: Unlike most previous HOI methods, we propose a novel and complementary approach called category query learning.
This idea is motivated by an earlier multi-label image classification method, but is applied for the first time to the challenging human-object interaction classification task.
Our method is simple, general and effective. It is validated on three representative HOI baselines and achieves new state-of-the-art results on two benchmarks.
- Score: 25.979131884959923
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Unlike most previous HOI methods that focus on learning better human-object
features, we propose a novel and complementary approach called category query
learning. Such queries are explicitly associated with interaction categories,
converted to image specific category representation via a transformer decoder,
and learnt via an auxiliary image-level classification task. This idea is
motivated by an earlier multi-label image classification method, but is applied
for the first time to the challenging human-object interaction classification
task. Our method is simple, general and effective. It is validated on three
representative HOI baselines and achieves new state-of-the-art results on two
benchmarks.
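The pipeline described in the abstract (category queries, a decoder that makes them image specific, and an auxiliary image-level classification loss) can be illustrated with a rough sketch. This is not the authors' implementation: a single cross-attention step without learned projections stands in for the transformer decoder, and every name here (`decode_category_queries`, `image_level_logits`, `multilabel_bce`, the sizes, and the random inputs) is a hypothetical choice made only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def decode_category_queries(queries, image_tokens):
    """Turn each category query into an image-specific representation.

    Assumption: one single-head cross-attention step with a residual
    connection stands in for the transformer decoder.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    decoded = []
    for query in queries:
        # Attention weights of this category query over all image tokens.
        weights = softmax([dot(query, tok) * scale for tok in image_tokens])
        attended = [
            sum(w * tok[d] for w, tok in zip(weights, image_tokens))
            for d in range(dim)
        ]
        decoded.append([q + a for q, a in zip(query, attended)])
    return decoded


def image_level_logits(decoded, class_weights, class_biases):
    """One logit per interaction category, read from its own representation."""
    return [dot(rep, w) + b for rep, w, b in zip(decoded, class_weights, class_biases)]


def multilabel_bce(logits, labels):
    """Mean binary cross-entropy with logits (stable formulation)."""
    total = 0.0
    for z, y in zip(logits, labels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# Hypothetical toy setup: 4 interaction categories, 6 image tokens, width 8.
rng = random.Random(0)
num_categories, num_tokens, dim = 4, 6, 8
queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_categories)]
image_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_tokens)]
class_weights = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(num_categories)]
class_biases = [0.0] * num_categories
labels = [1, 0, 0, 1]  # image-level labels: which interactions appear in the image

decoded = decode_category_queries(queries, image_tokens)
logits = image_level_logits(decoded, class_weights, class_biases)
aux_loss = multilabel_bce(logits, labels)
print(f"auxiliary image-level loss: {aux_loss:.4f}")
```

In a trainable version the queries and classifier weights would be learned parameters, and the auxiliary loss would be added to the loss of the underlying HOI baseline. How the decoded category representations are combined with human-object features is not stated in the abstract, so the sketch leaves that step out.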
Related papers
- Preview-based Category Contrastive Learning for Knowledge Distillation [53.551002781828146]
We propose a novel preview-based category contrastive learning method for knowledge distillation (PCKD).
It first distills the structural knowledge of both instance-level feature correspondence and the relation between instance features and category centers.
It can explicitly optimize the category representation and explore the distinct correlation between representations of instances and categories.
arXiv Detail & Related papers (2024-10-18T03:31:00Z)
- A Study on Representation Transfer for Few-Shot Learning [5.717951523323085]
Few-shot classification aims to learn to classify new object categories well using only a few labeled examples.
In this work we perform a systematic study of various feature representations for few-shot classification.
We find that learning from more complex tasks tends to give better representations for few-shot classification.
arXiv Detail & Related papers (2022-09-05T17:56:02Z)
- Automatically Discovering Novel Visual Categories with Self-supervised Prototype Learning [68.63910949916209]
This paper tackles the problem of novel category discovery (NCD), which aims to discriminate unknown categories in large-scale image collections.
We propose a novel adaptive prototype learning method consisting of two main stages: prototypical representation learning and prototypical self-training.
We conduct extensive experiments on four benchmark datasets and demonstrate the effectiveness and robustness of the proposed method with state-of-the-art performance.
arXiv Detail & Related papers (2022-08-01T16:34:33Z)
- Comparison Knowledge Translation for Generalizable Image Classification [31.530232003512957]
We build a generalizable framework that emulates the human recognition mechanism in the image classification task.
We put forward a Comparison Classification Translation Network (CCT-Net), which comprises a comparison classifier and a matching discriminator.
CCT-Net achieves surprising generalization ability on unseen categories and SOTA performance on target categories.
arXiv Detail & Related papers (2022-05-07T11:05:18Z)
- Semantic Representation and Dependency Learning for Multi-Label Image Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category.
Specifically, we design a category-specific attentional regions (CAR) module to generate channel/spatial-wise attention matrices to guide the model.
We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
arXiv Detail & Related papers (2022-04-08T00:55:15Z)
- The Overlooked Classifier in Human-Object Interaction Recognition [82.20671129356037]
We encode the semantic correlation among classes into the classification head by initializing the weights with language embeddings of HOIs.
We propose a new loss named LSE-Sign to enhance multi-label learning on a long-tailed dataset.
Our simple yet effective method enables detection-free HOI classification, outperforming state-of-the-art methods that require object detection and human pose by a clear margin.
arXiv Detail & Related papers (2022-03-10T23:35:00Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Deep Metric Learning for Few-Shot Image Classification: A Selective Review [38.71276383292809]
Few-shot image classification is a challenging problem which aims to achieve the human level of recognition based only on a small number of images.
Deep learning algorithms such as meta-learning, transfer learning, and metric learning have been employed recently and have achieved state-of-the-art performance.
arXiv Detail & Related papers (2021-05-17T20:27:59Z)
- Progressive Co-Attention Network for Fine-grained Visual Classification [20.838908090777885]
Fine-grained visual classification aims to recognize images belonging to multiple sub-categories within the same category.
Most existing methods take only an individual image as input.
We propose an effective method called progressive co-attention network (PCA-Net) to tackle this problem.
arXiv Detail & Related papers (2021-01-21T10:19:02Z)
- Learning and Evaluating Representations for Deep One-class Classification [59.095144932794646]
We present a two-stage framework for deep one-class classification.
We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.
In experiments, we demonstrate state-of-the-art performance on visual domain one-class classification benchmarks.
arXiv Detail & Related papers (2020-11-04T23:33:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.