CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
- URL: http://arxiv.org/abs/2302.02551v3
- Date: Wed, 31 May 2023 07:44:28 GMT
- Title: CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
- Authors: Zachary Novack, Julian McAuley, Zachary C. Lipton, Saurabh Garg
- Abstract summary: Open vocabulary models (e.g. CLIP) have shown strong performance on zero-shot classification.
We propose Classification with Hierarchical Label Sets (or CHiLS) for datasets with implicit semantic hierarchies.
CHiLS is simple to implement within existing zero-shot pipelines and requires no additional training cost.
- Score: 24.868024094095983
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open vocabulary models (e.g. CLIP) have shown strong performance on zero-shot
classification through their ability to generate embeddings for each class based
on their (natural language) names. Prior work has focused on improving the
accuracy of these models through prompt engineering or by incorporating a small
amount of labeled downstream data (via finetuning). However, there has been
little focus on improving the richness of the class names themselves, which can
pose issues when class labels are coarsely-defined and are uninformative. We
propose Classification with Hierarchical Label Sets (or CHiLS), an alternative
strategy for zero-shot classification specifically designed for datasets with
implicit semantic hierarchies. CHiLS proceeds in three steps: (i) for each
class, produce a set of subclasses, using either existing label hierarchies or
by querying GPT-3; (ii) perform the standard zero-shot CLIP procedure as though
these subclasses were the labels of interest; (iii) map the predicted subclass
back to its parent to produce the final prediction. Across numerous datasets
with underlying hierarchical structure, CHiLS leads to improved accuracy in
situations both with and without ground-truth hierarchical information. CHiLS
is simple to implement within existing zero-shot pipelines and requires no
additional training cost. Code is available at:
https://github.com/acmi-lab/CHILS.
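The three-step procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `image_score` callable is a hypothetical stand-in for CLIP's image-text similarity, and the toy hierarchy and scores are invented for demonstration.

```python
def chils_predict(image_score, hierarchy):
    """Sketch of CHiLS inference.

    hierarchy: dict mapping each parent class to a list of subclasses,
        obtained from an existing label hierarchy or by querying GPT-3.
    image_score: callable scoring a candidate label name against the
        image (a stand-in for CLIP image-text similarity).
    """
    # (i) for each class, gather its subclass set, remembering parents
    subclass_to_parent = {
        sub: parent
        for parent, subs in hierarchy.items()
        for sub in subs
    }
    # (ii) run the standard zero-shot procedure as though the
    #      subclasses were the labels of interest
    best_subclass = max(subclass_to_parent, key=image_score)
    # (iii) map the predicted subclass back to its parent
    return subclass_to_parent[best_subclass]


# Toy usage with made-up similarity scores in place of CLIP outputs.
hierarchy = {"dog": ["beagle", "poodle"], "cat": ["siamese", "tabby"]}
scores = {"beagle": 0.2, "poodle": 0.7, "siamese": 0.4, "tabby": 0.1}
print(chils_predict(scores.get, hierarchy))  # → dog
```

Note that because prediction happens over subclasses and is only then mapped upward, no retraining is needed: the method reuses the existing zero-shot pipeline with a richer label set.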
Related papers
- Lidar Panoptic Segmentation in an Open World [50.094491113541046]
Lidar Panoptics (LPS) is crucial for safe deployment of autonomous vehicles.
LPS aims to recognize and segment lidar points w.r.t. a pre-defined vocabulary of semantic classes.
We propose a class-agnostic point clustering and over-segment the input cloud in a hierarchical fashion, followed by binary point segment classification.
arXiv Detail & Related papers (2024-09-22T00:10:20Z) - Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge.
We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks.
Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals.
arXiv Detail & Related papers (2024-03-24T12:41:58Z) - TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision [41.05874642535256]
Hierarchical text classification aims to categorize each document into a set of classes in a label taxonomy.
Most earlier works focus on fully or semi-supervised methods that require a large amount of human annotated data.
We work on hierarchical text classification with the minimal amount of supervision: using the sole class name of each node as the only supervision.
arXiv Detail & Related papers (2024-02-29T22:26:07Z) - TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary
Multi-Label Classification of CLIP Without Training [29.431698321195814]
Contrastive Language-Image Pre-training (CLIP) has demonstrated impressive capabilities in open-vocabulary classification.
CLIP shows poor performance on multi-label datasets because the global feature tends to be dominated by the most prominent class.
We propose a local-to-global framework to obtain image tags.
arXiv Detail & Related papers (2023-12-20T08:15:40Z) - Towards Realistic Zero-Shot Classification via Self Structural Semantic
Alignment [53.2701026843921]
Large-scale pre-trained Vision Language Models (VLMs) have proven effective for zero-shot classification.
In this paper, we aim at a more challenging setting, Realistic Zero-Shot Classification, which assumes no annotation but instead a broad vocabulary.
We propose the Self Structural Semantic Alignment (S3A) framework, which extracts structural semantic information from unlabeled data while simultaneously self-learning.
arXiv Detail & Related papers (2023-08-24T17:56:46Z) - ProTeCt: Prompt Tuning for Taxonomic Open Set Classification [59.59442518849203]
Few-shot adaptation methods do not fare well in the taxonomic open set (TOS) setting.
We propose Prompt Tuning for Hierarchical Consistency (ProTeCt), a prompt tuning technique that calibrates the hierarchical consistency of model predictions across label set granularities.
arXiv Detail & Related papers (2023-06-04T02:55:25Z) - Instance-level Few-shot Learning with Class Hierarchy Mining [26.273796311012042]
We exploit hierarchical information to leverage discriminative and relevant features of base classes to effectively classify novel objects.
These features are extracted from abundant data of base classes, which could be utilized to reasonably describe classes with scarce data.
In order to effectively train the hierarchy-based-detector in FSIS, we apply the label refinement to further describe the associations between fine-grained classes.
arXiv Detail & Related papers (2023-04-15T02:55:08Z) - Inducing a hierarchy for multi-class classification problems [11.58041597483471]
In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not.
In this paper, we investigate a class of methods that induce a hierarchy that can similarly improve classification performance over flat classifiers.
We demonstrate the effectiveness of the class of methods both for discovering a latent hierarchy and for improving accuracy in principled simulation settings and three real data applications.
arXiv Detail & Related papers (2021-02-20T05:40:42Z) - An Empirical Study on Large-Scale Multi-Label Text Classification
Including Few and Zero-Shot Labels [49.036212158261215]
Large-scale Multi-label Text Classification (LMTC) has a wide range of Natural Language Processing (NLP) applications.
Current state-of-the-art LMTC models employ Label-Wise Attention Networks (LWANs)
We show that hierarchical methods based on Probabilistic Label Trees (PLTs) outperform LWANs.
We propose a new state-of-the-art method which combines BERT with LWANs.
arXiv Detail & Related papers (2020-10-04T18:55:47Z) - Attribute Propagation Network for Graph Zero-shot Learning [57.68486382473194]
We introduce the attribute propagation network (APNet), which is composed of 1) a graph propagation model generating an attribute vector for each class and 2) a parameterized nearest neighbor (NN) classifier.
APNet achieves either compelling performance or new state-of-the-art results in experiments with two zero-shot learning settings and five benchmark datasets.
arXiv Detail & Related papers (2020-09-24T16:53:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.