In-Context Learning for Extreme Multi-Label Classification
- URL: http://arxiv.org/abs/2401.12178v1
- Date: Mon, 22 Jan 2024 18:09:52 GMT
- Title: In-Context Learning for Extreme Multi-Label Classification
- Authors: Karel D'Oosterlinck, Omar Khattab, François Remy, Thomas Demeester, Chris Develder, Christopher Potts
- Abstract summary: Multi-label classification problems with thousands of classes are hard to solve with in-context learning alone.
We propose a general program that defines multi-step interactions between LMs and retrievers to efficiently tackle such problems.
Our solution requires no finetuning, is easily applicable to new tasks, alleviates prompt engineering, and requires only tens of labeled examples.
- Score: 29.627891261947536
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Multi-label classification problems with thousands of classes are hard to
solve with in-context learning alone, as language models (LMs) might lack prior
knowledge about the precise classes or how to assign them, and it is generally
infeasible to demonstrate every class in a prompt. We propose a general
program, $\texttt{Infer--Retrieve--Rank}$, that defines multi-step interactions
between LMs and retrievers to efficiently tackle such problems. We implement
this program using the $\texttt{DSPy}$ programming model, which specifies
in-context systems in a declarative manner, and use $\texttt{DSPy}$ optimizers
to tune it towards specific datasets by bootstrapping only tens of few-shot
examples. Our primary extreme classification program, optimized separately for
each task, attains state-of-the-art results across three benchmarks (HOUSE,
TECH, TECHWOLF). We apply the same program to a benchmark with vastly different
characteristics and attain competitive performance as well (BioDEX). Unlike
prior work, our proposed solution requires no finetuning, is easily applicable
to new tasks, alleviates prompt engineering, and requires only tens of labeled
examples. Our code is public at https://github.com/KarelDO/xmc.dspy.
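
To make the control flow concrete, here is a minimal Python sketch of a three-step program in the spirit of Infer--Retrieve--Rank, reading the name literally: an LM infers free-form label guesses, a retriever maps those guesses onto the real label space, and an LM reranks the retrieved candidates. It is not the authors' implementation and does not use DSPy. The label list, the bag-of-words retriever, and the `stub_infer` and `stub_rank` functions (stand-ins for LM calls) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from typing import Callable, List

# Hypothetical label space; real extreme-classification tasks have thousands of labels.
LABELS = [
    "python programming",
    "data analysis",
    "project management",
    "customer service",
    "machine learning",
]


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(queries: List[str], labels: List[str], k: int) -> List[str]:
    """Score each label by its best similarity to any query; keep the top k."""
    scored = []
    for label in labels:
        label_vec = Counter(tokenize(label))
        best = max(
            (cosine(Counter(tokenize(q)), label_vec) for q in queries),
            default=0.0,
        )
        scored.append((best, label))
    # Sort by descending score, breaking ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [label for _, label in scored[:k]]


def infer_retrieve_rank(
    text: str,
    infer: Callable[[str], List[str]],
    rank: Callable[[str, List[str]], List[str]],
    labels: List[str],
    k_retrieve: int = 3,
    k_final: int = 2,
) -> List[str]:
    queries = infer(text)                                # step 1: LM guesses labels
    candidates = retrieve(queries, labels, k_retrieve)   # step 2: map guesses to real labels
    return rank(text, candidates)[:k_final]              # step 3: LM reranks candidates


def stub_infer(text: str) -> List[str]:
    """Stand-in for an LM call; returns fixed free-form guesses."""
    return ["machine learning models", "python"]


def stub_rank(text: str, candidates: List[str]) -> List[str]:
    """Stand-in for an LM reranker; orders candidates by word overlap with the text."""
    words = set(tokenize(text))
    return sorted(candidates, key=lambda c: (-len(words & set(tokenize(c))), c))


doc = "We are hiring an engineer to build machine learning pipelines in Python."
result = infer_retrieve_rank(doc, stub_infer, stub_rank, LABELS)
print(result)
```

In a real system the two stubs would be replaced by prompted LM calls and the bag-of-words scorer by a stronger retriever. The point of the sketch is only that the LM never has to see the full label space in its prompt, since the retriever narrows it to a short candidate list first.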
Related papers
- Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning [50.26965628047682]
Adapting pre-trained models to open classes is a challenging problem in machine learning.
In this paper, we consider combining the advantages of both and come up with a test-time prompt tuning approach.
Our proposed method outperforms all comparison methods on average considering both base and new classes.
arXiv Detail & Related papers (2024-08-29T12:34:01Z)
- Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts [10.262029691744921]
We present Label Anything, an innovative neural network architecture designed for few-shot semantic segmentation (FSS).
Label Anything demonstrates remarkable generalizability across multiple classes with minimal examples required per class.
Our comprehensive experimental validation, particularly achieving state-of-the-art results on the COCO-$20^i$ benchmark, underscores Label Anything's robust generalization and flexibility.
arXiv Detail & Related papers (2024-07-02T09:08:06Z)
- FastGAS: Fast Graph-based Annotation Selection for In-Context Learning [53.17606395275021]
In-context learning (ICL) empowers large language models (LLMs) to tackle new tasks by using a series of training instances as prompts.
Existing methods have proposed to select a subset of unlabeled examples for annotation.
We propose a graph-based selection method, FastGAS, designed to efficiently identify high-quality instances.
arXiv Detail & Related papers (2024-06-06T04:05:54Z)
- Learning to Reason via Program Generation, Emulation, and Search [33.11955431589091]
Program synthesis with language models (LMs) has unlocked a large set of reasoning abilities.
Not all reasoning tasks are easily expressible as code, e.g. tasks involving commonsense reasoning, moral decision-making, and sarcasm understanding.
We propose Code Generation and Emulated EXecution (CoGEX) to extend an LM's program synthesis skills to such tasks.
arXiv Detail & Related papers (2024-05-25T19:40:50Z)
- Less is more: Summarizing Patch Tokens for efficient Multi-Label Class-Incremental Learning [38.36863497458095]
We propose a new class-incremental learning method, Multi-Label class incremental learning via summarising pAtch tokeN Embeddings (MULTI-LANE).
MULTI-LANE enables learning disentangled task-specific representations in multi-label class-incremental learning (MLCIL) while ensuring fast inference.
arXiv Detail & Related papers (2024-05-24T15:18:27Z)
- Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation [45.90925587972781]
Large Language Models (LLMs) have the ability to solve a variety of tasks, such as text summarization and mathematical questions.
Due to high computational costs, the current trend is to use prompt instruction tuning to better adjust monolithic, pretrained LLMs for new -- but often individual -- downstream tasks.
MoPs can simultaneously mitigate prompt training "interference" in multi-task, multi-source scenarios.
arXiv Detail & Related papers (2023-10-04T14:11:12Z)
- Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation [2.024620791810963]
This study benchmarks the performance of Prompt Tuning and baselines for multi-label text classification.
It is applied to classifying companies into an investment firm's proprietary industry taxonomy.
We confirm that the model's performance is consistent across both well-known and less-known companies.
arXiv Detail & Related papers (2023-09-21T13:45:32Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet so that the prompt texts draw on a wider vocabulary than the closed-set label words alone, and prompts are thus tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- Multi-Instance Partial-Label Learning: Towards Exploiting Dual Inexact Supervision [53.530957567507365]
In some real-world tasks, each training sample is associated with a candidate label set that contains one ground-truth label and some false positive labels.
In this paper, we formalize such problems as multi-instance partial-label learning (MIPL).
Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems.
arXiv Detail & Related papers (2022-12-18T03:28:51Z)
- PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models [67.3725459417758]
PERFECT is a simple and efficient method for few-shot fine-tuning of pretrained language models (PLMs) without relying on handcrafted prompts.
We show that manually engineered task prompts can be replaced with task-specific adapters that enable sample-efficient fine-tuning.
Experiments on a wide range of few-shot NLP tasks demonstrate that PERFECT, while being simple and efficient, also outperforms existing state-of-the-art few-shot learning methods.
arXiv Detail & Related papers (2022-04-03T22:31:25Z)
- Many-Class Few-Shot Learning on Multi-Granularity Class Hierarchy [57.68486382473194]
We study the many-class few-shot (MCFS) problem in both supervised learning and meta-learning settings.
In this paper, we leverage the class hierarchy as prior knowledge to train a coarse-to-fine classifier.
The model, "memory-augmented hierarchical-classification network (MahiNet)", performs coarse-to-fine classification where each coarse class can cover multiple fine classes.
arXiv Detail & Related papers (2020-06-28T01:11:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.