Inverse is Better! Fast and Accurate Prompt for Few-shot Slot Tagging
- URL: http://arxiv.org/abs/2204.00885v1
- Date: Sat, 2 Apr 2022 15:41:19 GMT
- Title: Inverse is Better! Fast and Accurate Prompt for Few-shot Slot Tagging
- Authors: Yutai Hou, Cheng Chen, Xianzhen Luo, Bohan Li, Wanxiang Che
- Abstract summary: We introduce an inverse paradigm for prompting. Unlike classic prompts, which map tokens to labels, we instead predict slot values given slot types.
We find, somewhat surprisingly, that the proposed method not only predicts faster but also performs significantly better, improving by over 6.1 F1 points in the 10-shot setting.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompting methods have recently achieved impressive success in few-shot learning.
These methods modify input samples with prompt sentence pieces, and decode
label tokens to map samples to corresponding labels. However, such a paradigm
is very inefficient for the task of slot tagging. Since slot tagging samples
are multiple consecutive words in a sentence, the prompting methods have to
enumerate all n-gram token spans to find all possible slots, which greatly
slows down the prediction. To tackle this, we introduce an inverse paradigm for
prompting. Different from the classic prompts mapping tokens to labels, we
reversely predict slot values given slot types. Such inverse prompting
requires only a single prediction turn for each slot type, which greatly speeds
up prediction. In addition, we propose a novel Iterative Prediction Strategy, from
which the model learns to refine predictions by considering the relations
between different slot types. We find, somewhat surprisingly, that the proposed
method not only predicts faster but also performs significantly better
(improving by over 6.1 F1 points in the 10-shot setting) and achieves new
state-of-the-art performance.
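The efficiency gap described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual templates: the prompt wording, the example sentence, and the slot-type names are all hypothetical, and a real system would feed each prompt to a pretrained language model rather than merely counting them.

```python
def classic_prompts(tokens, max_span_len=4):
    """Classic prompting: enumerate every n-gram span and ask the model
    which label the span maps to -- roughly O(n^2) prediction turns."""
    prompts = []
    n = len(tokens)
    for i in range(n):
        for j in range(i + 1, min(i + max_span_len, n) + 1):
            span = " ".join(tokens[i:j])
            prompts.append(f'"{span}" belongs to slot type [MASK]')
    return prompts

def inverse_prompts(tokens, slot_types):
    """Inverse prompting: one prediction turn per slot type, asking the
    model to generate the slot value directly."""
    sentence = " ".join(tokens)
    return [f'{sentence} . "{t}" refers to [MASK]' for t in slot_types]

tokens = "book a flight to new york tomorrow".split()
print(len(classic_prompts(tokens)))                           # 22 span prompts
print(len(inverse_prompts(tokens, ["destination", "date"])))  # 2 prompts
```

Even for this seven-word sentence, span enumeration generates 22 prompts while the inverse paradigm needs only one per slot type, which is where the reported speedup comes from.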
Related papers
- EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating Large Language Models
The vanilla method adds padding tokens to ensure that the number of new tokens remains consistent across samples.
We propose a novel method that handles inconsistent numbers of tokens accepted by different samples without adding padding tokens and without any increase in memory or computing overhead.
arXiv Detail & Related papers (2024-05-13T08:24:21Z)
- Object Recognition as Next Token Prediction
We present an approach to pose object recognition as next token prediction.
The idea is to apply a language decoder that auto-regressively predicts the text tokens from image embeddings to form labels.
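The auto-regressive labeling idea can be sketched as a greedy decoding loop. This is a toy: `toy_logits` is a stand-in for a trained language decoder conditioned on image embeddings, and the three-word vocabulary is hypothetical.

```python
def greedy_decode(next_token_logits, image_emb, vocab, max_len=5, eos="</s>"):
    """Greedily decode a label string token by token, feeding the prefix
    back into the decoder at each step (auto-regressive prediction)."""
    tokens = []
    for _ in range(max_len):
        logits = next_token_logits(tokens, image_emb)
        tok = vocab[logits.index(max(logits))]
        if tok == eos:
            break
        tokens.append(tok)
    return " ".join(tokens)

# Toy stand-in for the decoder: deterministically emits "golden retriever".
vocab = ["golden", "retriever", "</s>"]
def toy_logits(prefix, image_emb):
    nxt = ["golden", "retriever", "</s>"][len(prefix)]
    return [1.0 if v == nxt else 0.0 for v in vocab]

print(greedy_decode(toy_logits, None, vocab))  # golden retriever
```

The point is that the label is not a fixed class index but a text string generated token by token, so the label space is open-ended.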
arXiv Detail & Related papers (2023-12-04T18:58:40Z)
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
We show that it is possible to improve prompt-based learning without additional labeled data.
We propose Embroid, a method which computes multiple representations of a dataset under different embedding functions.
We find that Embroid substantially improves performance over original prompts.
arXiv Detail & Related papers (2023-07-20T17:07:28Z)
- Generative Zero-Shot Prompt Learning for Cross-Domain Slot Filling with Inverse Prompting
Cross-domain slot filling aims to transfer knowledge from the labeled domain to the unlabeled target domain.
We propose a generative zero-shot prompt learning framework for cross-domain slot filling.
Experiments and analysis demonstrate the effectiveness of our proposed framework.
arXiv Detail & Related papers (2023-07-06T07:53:46Z)
- TypeT5: Seq2seq Type Inference using Static Analysis
We present a new type inference method that treats type prediction as a code infilling task.
Our method uses static analysis to construct dynamic contexts for each code element whose type signature is to be predicted by the model.
We also propose an iterative decoding scheme that incorporates previous type predictions in the model's input context.
arXiv Detail & Related papers (2023-03-16T23:48:00Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend prompt texts beyond the closed-set label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- WR-ONE2SET: Towards Well-Calibrated Keyphrase Generation
Keyphrase generation aims to automatically generate short phrases summarizing an input document.
The recently emerged ONE2SET paradigm generates keyphrases as a set and has achieved competitive performance.
We propose WR-ONE2SET which extends ONE2SET with an adaptive instance-level cost Weighting strategy and a target Re-assignment mechanism.
arXiv Detail & Related papers (2022-11-13T09:56:24Z)
- Mask-combine Decoding and Classification Approach for Punctuation Prediction with Real-time Inference Constraints
We unify several existing decoding strategies for punctuation prediction in one framework.
We show that significant improvements can be achieved by optimising these strategies after training a model.
We use our decoding strategy framework for the first comparison of tagging and classification approaches for punctuation prediction in a real-time setting.
arXiv Detail & Related papers (2021-12-15T13:14:36Z)
- One-bit Supervision for Image Classification
One-bit supervision is a novel setting of learning from incomplete annotations.
We propose a multi-stage training paradigm which incorporates negative label suppression into an off-the-shelf semi-supervised learning algorithm.
arXiv Detail & Related papers (2020-09-14T03:06:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.