Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation
- URL: http://arxiv.org/abs/2309.12075v3
- Date: Fri, 12 Apr 2024 12:25:50 GMT
- Title: Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation
- Authors: Valentin Leonhard Buchner, Lele Cao, Jan-Christoph Kalo, Vilhelm von Ehrenheim,
- Abstract summary: This study benchmarks the performance of Prompt Tuning and baselines for multi-label text classification.
It is applied to classifying companies into an investment firm's proprietary industry taxonomy.
We confirm that the model's performance is consistent across both well-known and less-known companies.
- Score: 2.024620791810963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt Tuning is emerging as a scalable and cost-effective method to fine-tune Pretrained Language Models (PLMs), which are often referred to as Large Language Models (LLMs). This study benchmarks the performance and computational efficiency of Prompt Tuning and baselines for multi-label text classification. This is applied to the challenging task of classifying companies into an investment firm's proprietary industry taxonomy, supporting their thematic investment strategy. Text-to-text classification is frequently reported to outperform task-specific classification heads, but has several limitations when applied to a multi-label classification problem where each label consists of multiple tokens: (a) Generated labels may not match any label in the label taxonomy; (b) The fine-tuning process lacks permutation invariance and is sensitive to the order of the provided labels; (c) The model provides binary decisions rather than appropriate confidence scores. Limitation (a) is addressed by applying constrained decoding using Trie Search, which slightly improves classification performance. All limitations (a), (b), and (c) are addressed by replacing the PLM's language head with a classification head, which is referred to as Prompt Tuned Embedding Classification (PTEC). This improves performance significantly, while also reducing computational costs during inference. In our industrial application, the training data is skewed towards well-known companies. We confirm that the model's performance is consistent across both well-known and less-known companies. Our overall results indicate the continuing need to adapt state-of-the-art methods to domain-specific tasks, even in the era of PLMs with strong generalization abilities. We release our codebase and a benchmarking dataset at https://github.com/EQTPartners/PTEC.
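The core idea behind PTEC (prepending trainable soft-prompt embeddings to a frozen PLM and replacing its language head with a classification head that outputs one confidence score per label) can be sketched as follows. This is a minimal illustration only: the backbone, prompt length, pooling strategy, and module names are assumptions rather than the released implementation, which is available in the linked repository.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class PTECClassifier(nn.Module):
    """Prompt Tuned Embedding Classification (minimal sketch).

    A frozen PLM receives trainable soft-prompt embeddings prepended to its
    input; its language head is replaced by a linear classification head that
    emits one logit per label (multi-label, sigmoid confidence scores).
    """

    def __init__(self, plm_name: str, num_labels: int, prompt_length: int = 20):
        super().__init__()
        self.plm = AutoModel.from_pretrained(plm_name)
        for param in self.plm.parameters():  # only the prompt and head are trained
            param.requires_grad = False
        hidden = self.plm.config.hidden_size
        self.soft_prompt = nn.Parameter(torch.randn(prompt_length, hidden) * 0.02)
        self.head = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        batch = input_ids.size(0)
        tok_emb = self.plm.get_input_embeddings()(input_ids)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt, tok_emb], dim=1)
        prompt_mask = torch.ones(
            batch, prompt.size(1), dtype=attention_mask.dtype,
            device=attention_mask.device,
        )
        mask = torch.cat([prompt_mask, attention_mask], dim=1)
        out = self.plm(inputs_embeds=inputs_embeds, attention_mask=mask)
        # Mean-pool the final hidden states and score every label independently.
        m = mask.unsqueeze(-1).float()
        pooled = (out.last_hidden_state * m).sum(dim=1) / m.sum(dim=1)
        return self.head(pooled)  # raw logits, one per taxonomy label


# Usage sketch: multi-label training step with binary cross-entropy per label.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = PTECClassifier("bert-base-uncased", num_labels=50)
batch = tokenizer(
    ["Company X builds battery recycling plants."],
    return_tensors="pt", padding=True, truncation=True,
)
logits = model(batch["input_ids"], batch["attention_mask"])
targets = torch.zeros_like(logits)  # multi-hot ground-truth sector labels
loss = nn.BCEWithLogitsLoss()(logits, targets)
probs = torch.sigmoid(logits)  # per-label confidence scores, addressing limitation (c)
```

Because only the soft prompt and the classification head receive gradients, the approach keeps the training cost of prompt tuning while avoiding the decoding-time limitations of text-to-text classification.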
Related papers
- RulePrompt: Weakly Supervised Text Classification with Prompting PLMs and Self-Iterative Logical Rules [30.239044569301534]
Weakly supervised text classification (WSTC) has attracted increasing attention due to its applicability to classifying large volumes of text.
We propose a prompting PLM-based approach named RulePrompt for the WSTC task, consisting of a rule mining module and a rule-enhanced pseudo label generation module.
Our approach yields interpretable category rules, proving its advantage in disambiguating easily-confused categories.
arXiv Detail & Related papers (2024-03-05T12:50:36Z)
- SemiReward: A General Reward Model for Semi-supervised Learning [58.47299780978101]
Semi-supervised learning (SSL) has witnessed great progress with various improvements in the self-training framework with pseudo labeling.
The main challenge is distinguishing high-quality pseudo labels from those affected by confirmation bias.
We propose a Semi-supervised Reward framework (SemiReward) that predicts reward scores to evaluate pseudo labels and select high-quality ones.
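As a rough, generic illustration of reward-based pseudo-label filtering (not SemiReward's actual architecture or training procedure; the model, feature dimensions, and threshold below are hypothetical), a small reward model can score each sample together with its pseudo label, and only high-scoring pairs are kept for self-training:

```python
import torch
import torch.nn as nn


class RewardModel(nn.Module):
    """Scores a (feature, pseudo-label) pair; higher means more trustworthy."""

    def __init__(self, feat_dim: int, num_classes: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, features, pseudo_labels_onehot):
        x = torch.cat([features, pseudo_labels_onehot], dim=-1)
        return torch.sigmoid(self.net(x)).squeeze(-1)  # reward in [0, 1]


def filter_pseudo_labels(reward_model, features, pseudo_labels, num_classes, threshold=0.9):
    """Keep only samples whose pseudo label receives a reward above the threshold."""
    onehot = torch.nn.functional.one_hot(pseudo_labels, num_classes).float()
    with torch.no_grad():
        rewards = reward_model(features, onehot)
    keep = rewards >= threshold
    return features[keep], pseudo_labels[keep], rewards


# Example: filter a batch of unlabeled features with model-predicted pseudo labels.
feats = torch.randn(8, 64)
plabels = torch.randint(0, 10, (8,))
rm = RewardModel(feat_dim=64, num_classes=10)
kept_feats, kept_labels, scores = filter_pseudo_labels(rm, feats, plabels, num_classes=10)
```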
arXiv Detail & Related papers (2023-10-04T17:56:41Z)
- ProTeCt: Prompt Tuning for Taxonomic Open Set Classification [59.59442518849203]
Few-shot adaptation methods do not fare well in the taxonomic open set (TOS) setting.
We propose Prompt Tuning for Hierarchical Consistency (ProTeCt), a technique that calibrates the hierarchical consistency of model predictions across label set granularities.
arXiv Detail & Related papers (2023-06-04T02:55:25Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the prompt texts beyond closed-set label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- AutoWS: Automated Weak Supervision Framework for Text Classification [1.748907524043535]
We propose a novel framework that increases the efficiency of the weak supervision process while decreasing the dependency on domain experts.
Our method requires a small set of labeled examples per label class and automatically creates a set of labeling functions that assign noisy labels to large amounts of unlabeled data.
arXiv Detail & Related papers (2023-02-07T07:12:05Z)
- CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification [57.62886091828512]
We propose a novel prefix-tuning method, Counterfactual Contrastive Prefix-tuning (CCPrefix), for many-class classification.
An instance-dependent soft prefix, derived from fact-counterfactual pairs in the label space, complements the language verbalizers in many-class classification.
arXiv Detail & Related papers (2022-11-11T03:45:59Z)
- Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler [89.27610526884496]
Weak Labeler Active Cover (WL-AC) robustly leverages lower-quality weak labelers to reduce query complexity while retaining the desired level of accuracy.
We show its effectiveness on the corrupted-MNIST dataset by significantly reducing the number of labels while keeping the same accuracy as in passive learning.
arXiv Detail & Related papers (2022-11-04T02:52:54Z)
- Rank over Class: The Untapped Potential of Ranking in Natural Language Processing [8.637110868126546]
We argue that many tasks which are currently addressed using classification are in fact being shoehorned into a classification mould.
We propose a novel end-to-end ranking approach consisting of a Transformer network responsible for producing representations for a pair of text sequences.
In an experiment on a heavily-skewed sentiment analysis dataset, converting ranking results to classification labels yields an approximately 22% improvement over state-of-the-art text classification.
arXiv Detail & Related papers (2020-09-10T22:18:57Z)
- Unsupervised Person Re-identification via Multi-label Classification [55.65870468861157]
This paper formulates unsupervised person ReID as a multi-label classification task to progressively seek true labels.
Our method starts by assigning each person image with a single-class label, then evolves to multi-label classification by leveraging the updated ReID model for label prediction.
To boost the ReID model training efficiency in multi-label classification, we propose the memory-based multi-label classification loss (MMCL).
arXiv Detail & Related papers (2020-04-20T12:13:43Z)