Slot Induction via Pre-trained Language Model Probing and Multi-level
Contrastive Learning
- URL: http://arxiv.org/abs/2308.04712v1
- Date: Wed, 9 Aug 2023 05:08:57 GMT
- Title: Slot Induction via Pre-trained Language Model Probing and Multi-level
Contrastive Learning
- Authors: Hoang H. Nguyen, Chenwei Zhang, Ye Liu, Philip S. Yu
- Abstract summary: We study the Slot Induction (SI) task, whose objective is to induce slot boundaries without explicit knowledge of token-level slot annotations.
We propose leveraging Unsupervised Pre-trained Language Model (PLM) Probing and a Contrastive Learning mechanism to exploit unsupervised semantic knowledge extracted from the PLM.
Our approach is shown to be effective on the SI task and capable of bridging the gap with token-level supervised models on two NLU benchmark datasets.
- Score: 62.839109775887025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advanced methods in Natural Language Understanding for Task-oriented
Dialogue (TOD) Systems (e.g., intent detection and slot filling) require a
large amount of annotated data to achieve competitive performance. In reality,
token-level annotations (slot labels) are time-consuming and difficult to
acquire. In this work, we study the Slot Induction (SI) task whose objective is
to induce slot boundaries without explicit knowledge of token-level slot
annotations. We propose leveraging Unsupervised Pre-trained Language Model
(PLM) Probing and Contrastive Learning mechanism to exploit (1) unsupervised
semantic knowledge extracted from PLM, and (2) additional sentence-level intent
label signals available from TOD. Our approach is shown to be effective in SI
task and capable of bridging the gaps with token-level supervised models on two
NLU benchmark datasets. When generalized to emerging intents, our SI objectives
also provide enhanced slot label representations, leading to improved
performance on the Slot Filling tasks.
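To make the sentence-level intent signal concrete, below is a minimal sketch of a supervised contrastive loss over utterance embeddings grouped by intent label. It assumes a batch of utterance embeddings and integer intent IDs; the function name and shapes are illustrative, and it covers only the intent-label level, not the paper's full multi-level objective combined with PLM-probing signals.

```python
import torch
import torch.nn.functional as F

def intent_contrastive_loss(embeddings: torch.Tensor,
                            intent_ids: torch.Tensor,
                            temperature: float = 0.1) -> torch.Tensor:
    """Sentence-level contrastive loss: utterances sharing an intent
    label are pulled together; all other utterances act as negatives."""
    z = F.normalize(embeddings, dim=-1)                  # (B, d) unit vectors
    sim = z @ z.t() / temperature                        # (B, B) similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float('-inf'))            # drop self-pairs
    positives = intent_ids.unsqueeze(0).eq(intent_ids.unsqueeze(1)) & ~eye
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)  # log-softmax per row
    pos_count = positives.sum(dim=1).clamp(min=1)        # avoid divide-by-zero
    per_anchor = -(log_prob.masked_fill(~positives, 0.0).sum(dim=1) / pos_count)
    return per_anchor.mean()
```

Anchors with no positive partner in the batch simply contribute zero loss here; how such cases are handled in the paper is not specified by the abstract.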
Related papers
- Unified Unsupervised Salient Object Detection via Knowledge Transfer [29.324193170890542]
Unsupervised salient object detection (USOD) has gained increasing attention due to its annotation-free nature.
In this paper, we propose a unified USOD framework for generic USOD tasks.
arXiv Detail & Related papers (2024-04-23T05:50:02Z)
- Introducing "Forecast Utterance" for Conversational Data Science [2.3894779000840503]
This paper introduces a new concept called Forecast Utterance.
We then focus on the automatic and accurate interpretation of users' prediction goals from these utterances.
Specifically, we frame the task as a slot-filling problem, where each slot corresponds to a specific aspect of the goal prediction task.
We then employ two zero-shot methods for solving the slot-filling task, namely: 1) Entity Extraction (EE), and 2) Question-Answering (QA) techniques.
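The QA framing can be sketched with an off-the-shelf extractive QA model; the checkpoint, utterance, and slot questions below are illustrative assumptions rather than the paper's setup:

```python
from transformers import pipeline

# Any extractive QA checkpoint works for this sketch (hypothetical choice).
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

utterance = "Book a table for two at an Italian place in Boston tonight."
# One hand-written question per slot; slot names are illustrative.
slot_questions = {
    "cuisine": "What type of cuisine is requested?",
    "location": "Where should the restaurant be?",
    "party_size": "How many people is the booking for?",
}

for slot, question in slot_questions.items():
    result = qa(question=question, context=utterance)
    print(f"{slot}: {result['answer']} (score={result['score']:.2f})")
```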
arXiv Detail & Related papers (2023-09-07T17:41:41Z) - Transfer-Free Data-Efficient Multilingual Slot Labeling [82.02076369811402]
Slot labeling is a core component of task-oriented dialogue (ToD) systems.
Current research on multilingual ToD typically assumes that sufficient English-language annotated data are always available.
To instead mitigate the inherent data scarcity issue, we propose a two-stage slot labeling approach (termed TWOSL) which transforms standard multilingual sentence encoders into effective slot labelers.
arXiv Detail & Related papers (2023-05-22T22:47:32Z) - Automated Few-shot Classification with Instruction-Finetuned Language
Models [76.69064714392165]
We show that AuT-Few outperforms state-of-the-art few-shot learning methods.
We also show that AuT-Few is the best-ranking method across datasets on the RAFT few-shot benchmark.
arXiv Detail & Related papers (2023-05-21T21:50:27Z) - Toward Open-domain Slot Filling via Self-supervised Co-training [2.7178968279054936]
Slot filling is one of the critical tasks in modern conversational systems.
We propose a Self-supervised Co-training framework, called SCot, that requires zero in-domain manually labeled training examples.
Our evaluations show that SCot outperforms state-of-the-art models by 45.57% and 37.56% on the SGD and MultiWoZ datasets, respectively.
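As a rough, generic illustration of co-training (not SCot's actual architecture), each model pseudo-labels the unlabeled pool, and its peer trains only on the confident subset:

```python
import torch

def co_training_round(model_a, model_b, unlabeled_x, optim_step, threshold=0.9):
    """One round of generic co-training. `model_a`/`model_b` are assumed to
    map inputs to class logits, and `optim_step` to run a training step on
    (model, inputs, labels) -- both hypothetical APIs for this sketch."""
    with torch.no_grad():
        conf_a, labels_a = model_a(unlabeled_x).softmax(-1).max(-1)
        conf_b, labels_b = model_b(unlabeled_x).softmax(-1).max(-1)
    keep_a, keep_b = conf_a >= threshold, conf_b >= threshold
    # Cross-teach: A's confident predictions supervise B, and vice versa.
    optim_step(model_b, unlabeled_x[keep_a], labels_a[keep_a])
    optim_step(model_a, unlabeled_x[keep_b], labels_b[keep_b])
```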
arXiv Detail & Related papers (2023-03-24T04:51:22Z) - Bi-directional Joint Neural Networks for Intent Classification and Slot
Filling [5.3361357265365035]
We propose a bi-directional joint model for intent classification and slot filling.
Our model achieves state-of-the-art results on intent classification accuracy and slot filling F1, and significantly improves sentence-level semantic frame accuracy.
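A minimal joint baseline conveys the shared-encoder idea: one encoder feeds an utterance-level intent head and a token-level slot head, trained with a summed loss. This is a generic sketch and does not reproduce the paper's bi-directional interaction between the two tasks:

```python
import torch
import torch.nn as nn

class JointNLU(nn.Module):
    """Shared BiLSTM encoder with an intent head (pooled utterance
    representation) and a slot-tagging head (per-token representation)."""
    def __init__(self, vocab_size: int, n_intents: int, n_slots: int, dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.intent_head = nn.Linear(2 * dim, n_intents)
        self.slot_head = nn.Linear(2 * dim, n_slots)

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))       # (B, T, 2*dim)
        intent_logits = self.intent_head(h.mean(dim=1))  # pooled utterance
        slot_logits = self.slot_head(h)                  # per-token tags
        return intent_logits, slot_logits

# Joint training sums the two cross-entropy losses:
# loss = ce(intent_logits, intent_gold) + ce(slot_logits.flatten(0, 1), slot_gold.flatten())
```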
arXiv Detail & Related papers (2022-02-26T06:35:21Z) - Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for
Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning.
Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-08-12T09:14:44Z) - PIN: A Novel Parallel Interactive Network for Spoken Language
Understanding [68.53121591998483]
In existing RNN-based approaches, intent detection (ID) and slot filling (SF) are often modeled jointly to exploit the correlation between the two tasks.
Experiments on two benchmark datasets, SNIPS and ATIS, demonstrate the effectiveness of our approach.
More encouragingly, by using utterance feature embeddings generated by the pre-trained language model BERT, our method achieves state-of-the-art performance among all compared approaches.
arXiv Detail & Related papers (2020-09-28T15:59:31Z) - An Information Bottleneck Approach for Controlling Conciseness in
Rationale Extraction [84.49035467829819]
We show that the trade-off between rationale conciseness and end-task accuracy can be better managed by optimizing a bound on the Information Bottleneck (IB) objective.
Our fully unsupervised approach jointly learns an explainer that predicts sparse binary masks over sentences, and an end-task predictor that considers only the extracted rationale.
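The explainer/predictor pattern can be sketched as follows, assuming precomputed sentence embeddings; the relaxed Bernoulli mask and plain sparsity penalty here stand in for the paper's IB bound:

```python
import torch
import torch.nn as nn

class RationaleExtractor(nn.Module):
    """The explainer scores each sentence, a relaxed Bernoulli sample gives
    a near-binary keep mask, and the predictor sees only the kept sentences."""
    def __init__(self, dim: int, n_classes: int):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)            # explainer: one logit per sentence
        self.predictor = nn.Linear(dim, n_classes)

    def forward(self, sent_embs, temperature=0.5, sparsity_weight=0.1):
        # sent_embs: (batch, n_sentences, dim) precomputed sentence embeddings
        logits = self.scorer(sent_embs).squeeze(-1)        # (B, S)
        u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)  # logistic noise
        mask = torch.sigmoid((logits + u.log() - (1 - u).log()) / temperature)
        pooled = (mask.unsqueeze(-1) * sent_embs).sum(1)
        pooled = pooled / mask.sum(1, keepdim=True).clamp(min=1e-6)
        task_logits = self.predictor(pooled)
        # Sparsity term plays the role of the IB conciseness constraint.
        return task_logits, mask, sparsity_weight * mask.mean()
```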
arXiv Detail & Related papers (2020-05-01T23:26:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.