TaDSE: Template-aware Dialogue Sentence Embeddings
- URL: http://arxiv.org/abs/2305.14299v1
- Date: Tue, 23 May 2023 17:40:41 GMT
- Title: TaDSE: Template-aware Dialogue Sentence Embeddings
- Authors: Minsik Oh, Jiwei Li, Guoyin Wang
- Abstract summary: General sentence embedding methods are usually sentence-level self-supervised frameworks and cannot utilize token-level extra knowledge.
TaDSE augments each sentence with its corresponding template and then conducts pairwise contrastive learning over both sentence and template.
Experimental results show that TaDSE achieves significant improvements over previous SOTA methods, along with a consistent performance improvement margin on the Intent Classification task.
- Score: 27.076663644996966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning high-quality sentence embeddings from dialogues has drawn increasing
attention, as it is essential for solving a variety of dialogue-oriented tasks
at low annotation cost. However, directly annotating and gathering utterance
relationships in conversations is difficult, while token-level annotations,
e.g., entities, slots, and templates, are much easier to obtain. General sentence
embedding methods are usually sentence-level self-supervised frameworks and
cannot utilize token-level extra knowledge. In this paper, we introduce
Template-aware Dialogue Sentence Embedding (TaDSE), a novel augmentation method
that utilizes template information to effectively learn utterance
representations via a self-supervised contrastive learning framework. TaDSE
augments each sentence with its corresponding template and then conducts
pairwise contrastive learning over both sentence and template. We further
strengthen the effect with a synthetically augmented dataset that reinforces the
utterance-template relation, for which entity detection (slot-filling) is a
preliminary step. We evaluate TaDSE on five downstream benchmark
datasets. The experimental results show that TaDSE achieves significant
improvements over previous SOTA methods, along with a consistent performance
improvement margin on the Intent Classification task. We further introduce a
novel analytic instrument, the Semantic Compression method, for which we discover
a correlation with uniformity and alignment. Our code will be released soon.
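To make the training objective described in the abstract more concrete, below is a minimal sketch of pairwise contrastive learning over utterance-template pairs, together with the alignment and uniformity metrics (Wang & Isola, 2020) mentioned in the analysis. The function names (`info_nce`, `tadse_style_loss`), the SimCSE-style dropout positives, and the loss weighting `lam` are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch under assumed details: in-batch InfoNCE over utterance and
# template embeddings, plus alignment/uniformity diagnostics.
import torch
import torch.nn.functional as F


def info_nce(a: torch.Tensor, b: torch.Tensor, tau: float = 0.05) -> torch.Tensor:
    """In-batch InfoNCE: row i of `a` should match row i of `b`."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / tau                          # (batch, batch) cosine similarities
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)


def tadse_style_loss(utt: torch.Tensor, utt_drop: torch.Tensor,
                     tpl: torch.Tensor, tpl_drop: torch.Tensor,
                     lam: float = 1.0) -> torch.Tensor:
    """Pairwise contrastive objective over both sentences and templates.

    Assumed structure: a dropout positive for each view plus a cross-view term
    tying an utterance to its own template; the exact weighting in the paper
    may differ.
    """
    loss_utt = info_nce(utt, utt_drop)    # utterance vs. its dropout copy
    loss_tpl = info_nce(tpl, tpl_drop)    # template vs. its dropout copy
    loss_cross = info_nce(utt, tpl)       # utterance vs. its own template
    return loss_utt + loss_tpl + lam * loss_cross


def alignment(x: torch.Tensor, x_pos: torch.Tensor) -> torch.Tensor:
    """Alignment (Wang & Isola, 2020): mean squared distance of positive pairs."""
    x, x_pos = F.normalize(x, dim=-1), F.normalize(x_pos, dim=-1)
    return (x - x_pos).norm(dim=-1).pow(2).mean()


def uniformity(x: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Uniformity (Wang & Isola, 2020): log mean Gaussian potential over the batch."""
    x = F.normalize(x, dim=-1)
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()
```

In this sketch the positives are formed in-batch, so negatives come for free from the other rows of the similarity matrix; the alignment/uniformity functions are the standard diagnostics the paper correlates with its Semantic Compression analysis.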
Related papers
- Factual Dialogue Summarization via Learning from Large Language Models [35.63037083806503]
Large language model (LLM)-based automatic text summarization models generate more factually consistent summaries.
We employ zero-shot learning to extract symbolic knowledge from LLMs, generating factually consistent (positive) and inconsistent (negative) summaries.
Our approach achieves better factual consistency while maintaining coherence, fluency, and relevance, as confirmed by various automatic evaluation metrics.
arXiv Detail & Related papers (2024-06-20T20:03:37Z) - Scalable Learning of Latent Language Structure With Logical Offline
Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z) - Stabilized In-Context Learning with Pre-trained Language Models for Few-Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z) - DialAug: Mixing up Dialogue Contexts in Contrastive Learning for Robust
Conversational Modeling [3.3578533367912025]
We propose a framework that incorporates augmented versions of a dialogue context into the learning objective.
We show that our proposed augmentation method outperforms previous data augmentation approaches.
arXiv Detail & Related papers (2022-04-15T23:39:41Z) - Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue [22.103162555263143]
We introduce contrastive learning and multi-task learning to jointly model the problem.
Our proposed model achieves state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2022-03-22T10:13:27Z) - DialogueCSE: Dialogue-based Contrastive Learning of Sentence Embeddings [33.89889949577356]
We propose DialogueCSE, a dialogue-based contrastive learning approach to tackle this issue.
We evaluate our model on three multi-turn dialogue datasets: the Microsoft Dialogue Corpus, the Jing Dong Dialogue Corpus, and the E-commerce Dialogue Corpus.
arXiv Detail & Related papers (2021-09-26T13:25:41Z) - Syntax-Enhanced Pre-trained Model [49.1659635460369]
We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa.
Existing methods utilize the syntax of text either in the pre-training stage or in the fine-tuning stage, and thus suffer from a discrepancy between the two stages.
We present a model that utilizes the syntax of text in both pre-training and fine-tuning stages.
arXiv Detail & Related papers (2020-12-28T06:48:04Z) - SLM: Learning a Discourse Language Representation with Sentence
Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z) - Introducing Syntactic Structures into Target Opinion Word Extraction
with Deep Learning [89.64620296557177]
We propose to incorporate the syntactic structures of the sentences into the deep learning models for targeted opinion word extraction.
We also introduce a novel regularization technique to improve the performance of the deep learning models.
The proposed model is extensively analyzed and achieves the state-of-the-art performance on four benchmark datasets.
arXiv Detail & Related papers (2020-10-26T07:13:17Z) - Exploiting Structured Knowledge in Text via Graph-Guided Representation
Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)