Can Your Model Tell a Negation from an Implicature? Unravelling
Challenges With Intent Encoders
- URL: http://arxiv.org/abs/2403.04314v1
- Date: Thu, 7 Mar 2024 08:32:17 GMT
- Title: Can Your Model Tell a Negation from an Implicature? Unravelling
Challenges With Intent Encoders
- Authors: Yuwei Zhang, Siffi Singh, Sailik Sengupta, Igor Shalyminov, Hang Su,
Hwanjun Song, Saab Mansour
- Abstract summary: Large Language Models (LLMs) enable instructional embeddings that allow one to adjust semantics over the embedding space using prompts.
Traditional evaluation benchmarks rely solely on task metrics that do not directly measure gaps related to semantic understanding.
We propose an intent semantic toolkit that gives a more holistic view of intent embedding models.
- Score: 24.42199777529863
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conversational systems often rely on embedding models for intent
classification and intent clustering tasks. Large Language Models (LLMs),
which enable instructional embeddings that allow one to adjust semantics over
the embedding space using prompts, are being viewed as a panacea for these
downstream conversational tasks. However, traditional evaluation benchmarks
rely solely on task metrics that do not directly measure gaps related to
semantic understanding. Thus, we propose an intent semantic toolkit that gives
a more holistic view of intent embedding models by considering three tasks:
(1) intent classification, (2) intent clustering, and (3) a novel triplet task.
The triplet task gauges the model's understanding of two semantic concepts
paramount in real-world conversational systems: negation and implicature. We
observe that current embedding models fare poorly in their semantic
understanding of these concepts. To address this, we propose a pre-training
approach that improves the embedding model by leveraging augmentation with
data generated by an auto-regressive model and a contrastive loss term. Our
approach improves the semantic understanding of the intent embedding model on
the aforementioned linguistic dimensions while only slightly affecting its
performance on downstream task metrics.
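To make the two core ideas concrete, below is a minimal sketch of a triplet-style evaluation and a contrastive (triplet-margin) pre-training term of the kind the abstract describes. This is an illustrative assumption, not the authors' implementation: the toy hash-based encoder, the margin value, and the example utterances are all stand-ins chosen only to make the script self-contained and runnable.

```python
# Sketch: triplet evaluation for negation/implicature, plus a generic
# triplet-margin contrastive loss. Encoder, margin, and examples are
# hypothetical placeholders, not the paper's exact setup.
import torch
import torch.nn.functional as F

def triplet_accuracy(encode, triplets):
    """Fraction of triplets where the anchor is closer (cosine) to the
    same-intent positive than to the hard negative (e.g. a negation)."""
    correct = 0
    for anchor, positive, negative in triplets:
        a, p, n = (F.normalize(encode(t), dim=-1)
                   for t in (anchor, positive, negative))
        if (a @ p) > (a @ n):  # cosine similarity via unit-vector dot product
            correct += 1
    return correct / len(triplets)

def contrastive_loss(a, p, n, margin=0.2):
    """Triplet-margin loss on cosine similarity: push the augmented
    negative at least `margin` below the positive."""
    a, p, n = (F.normalize(x, dim=-1) for x in (a, p, n))
    return F.relu(margin - (a * p).sum(-1) + (a * n).sum(-1)).mean()

def toy_encode(text, dim=64):
    # Deterministic random embedding per string; a stand-in so the
    # script runs without downloading a pretrained encoder.
    g = torch.Generator().manual_seed(abs(hash(text)) % (2**31))
    return torch.randn(dim, generator=g)

triplets = [
    # (anchor, same-intent paraphrase, negation with a different intent)
    ("I want to cancel my subscription",
     "Please terminate my plan",
     "I do not want to cancel my subscription"),
]
print("triplet accuracy:", triplet_accuracy(toy_encode, triplets))
print("loss:", contrastive_loss(toy_encode("a"), toy_encode("b"), toy_encode("c")))
```

In this framing, negation and implicature variants act as hard negatives: they are lexically close to the anchor, so an encoder that relies on surface overlap ranks them above a true paraphrase, which is exactly the failure the triplet task is designed to expose.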
Related papers
- Collapsed Language Models Promote Fairness [88.48232731113306]
We find that debiased language models exhibit collapsed alignment between token representations and word embeddings.
We design a principled fine-tuning method that can effectively improve fairness in a wide range of debiasing methods.
arXiv Detail & Related papers (2024-10-06T13:09:48Z)
- Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model [57.78191634042409]
We propose Pseudo-Word HuBERT (PW-HuBERT), a framework that integrates pseudo word-level targets into the training process.
Our experimental results on four spoken language understanding (SLU) benchmarks suggest the superiority of our model in capturing semantic information.
arXiv Detail & Related papers (2024-02-08T16:55:21Z)
- Co-guiding for Multi-intent Spoken Language Understanding [53.30511968323911]
We propose a novel model termed Co-guiding Net, which implements a two-stage framework achieving mutual guidance between the two tasks.
For the first stage, we propose single-task supervised contrastive learning; for the second stage, co-guiding supervised contrastive learning.
Experimental results on multi-intent SLU show that our model outperforms existing models by a large margin.
arXiv Detail & Related papers (2023-11-22T08:06:22Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- A Simple Meta-learning Paradigm for Zero-shot Intent Classification with Mixture Attention Mechanism [17.228616743739412]
We propose a simple yet effective meta-learning paradigm for zero-shot intent classification.
To learn better semantic representations for utterances, we introduce a new mixture attention mechanism.
To strengthen the model's ability to transfer from seen classes to unseen classes, we reformulate zero-shot intent classification with a meta-learning strategy.
arXiv Detail & Related papers (2022-06-05T13:37:51Z)
- Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning [57.4036085386653]
We show that prompt-based models for sentence pair classification tasks still suffer from a common pitfall of adopting inference heuristics based on lexical overlap.
We then show that adding a regularization that preserves pretraining weights is effective in mitigating this destructive tendency of few-shot finetuning.
arXiv Detail & Related papers (2021-09-09T10:10:29Z)
- A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z)
- Generalized Zero-shot Intent Detection via Commonsense Knowledge [5.398580049917152]
We propose RIDE: an intent detection model that leverages commonsense knowledge in an unsupervised fashion to overcome the issue of training data scarcity.
RIDE computes robust and generalizable relationship meta-features that capture deep semantic relationships between utterances and intent labels.
Our extensive experimental analysis on three widely used intent detection benchmarks shows that relationship meta-features significantly increase the accuracy of detecting both seen and unseen intents.
arXiv Detail & Related papers (2021-02-04T23:36:41Z)
- Example-Driven Intent Prediction with Observers [15.615065041164629]
We focus on the intent classification problem, which aims to identify user intents given utterances addressed to the dialog system.
We propose two approaches for improving the generalizability of utterance classification models: (1) observers and (2) example-driven training.
arXiv Detail & Related papers (2020-10-17T01:03:06Z)
- Toward Interpretability of Dual-Encoder Models for Dialogue Response Suggestions [18.117115200484708]
We present an attentive dual encoder model that includes an attention mechanism on top of the extracted word-level features from two encoders.
We design a novel regularization loss to minimize the mutual information between unimportant words and desired labels.
Experiments demonstrate the effectiveness of the proposed model in terms of better Recall@1 accuracy and visualized interpretability.
arXiv Detail & Related papers (2020-03-02T21:26:06Z)