Injecting linguistic knowledge into BERT for Dialogue State Tracking
- URL: http://arxiv.org/abs/2311.15623v3
- Date: Wed, 3 Jul 2024 02:59:32 GMT
- Title: Injecting linguistic knowledge into BERT for Dialogue State Tracking
- Authors: Xiaohan Feng, Xixin Wu, Helen Meng
- Abstract summary: This paper proposes a method that extracts linguistic knowledge via an unsupervised framework.
We then utilize this knowledge to augment BERT's performance and interpretability in Dialogue State Tracking (DST) tasks.
We benchmark this framework on various DST tasks and observe a notable improvement in accuracy.
- Score: 60.42231674887294
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Dialogue State Tracking (DST) models often employ intricate neural network architectures, necessitating substantial training data, and their inference process lacks transparency. This paper proposes a method that extracts linguistic knowledge via an unsupervised framework and subsequently utilizes this knowledge to augment BERT's performance and interpretability in DST tasks. The knowledge extraction procedure is computationally economical and does not require annotations or additional training data. The injection of the extracted knowledge can be achieved by the addition of simple neural modules. We employ the Convex Polytopic Model (CPM) as a feature extraction tool for DST tasks and illustrate that the acquired features correlate with syntactic and semantic patterns in the dialogues. This correlation facilitates a comprehensive understanding of the linguistic features influencing the DST model's decision-making process. We benchmark this framework on various DST tasks and observe a notable improvement in accuracy.
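The abstract describes the injection step only as "the addition of simple neural modules," so the following PyTorch sketch illustrates the general idea rather than the authors' implementation: project externally extracted CPM features into BERT's hidden space and gate them into the token representations before slot classification. The class names, the gating design, and the `cpm_dim` of 50 are assumptions.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class CPMFusion(nn.Module):
    """Hypothetical injection module: gates utterance-level CPM features
    into BERT's token representations."""
    def __init__(self, hidden_size=768, cpm_dim=50):
        super().__init__()
        self.proj = nn.Linear(cpm_dim, hidden_size)         # map CPM features into BERT's space
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, bert_hidden, cpm_feats):
        # bert_hidden: (batch, seq_len, hidden); cpm_feats: (batch, cpm_dim)
        injected = self.proj(cpm_feats).unsqueeze(1).expand_as(bert_hidden)
        g = torch.sigmoid(self.gate(torch.cat([bert_hidden, injected], dim=-1)))
        return g * bert_hidden + (1 - g) * injected         # per-token convex mix

class DSTWithCPM(nn.Module):
    """BERT encoder + hypothetical CPM fusion + slot-value classifier."""
    def __init__(self, num_slot_values, cpm_dim=50):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.fusion = CPMFusion(self.bert.config.hidden_size, cpm_dim)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_slot_values)

    def forward(self, input_ids, attention_mask, cpm_feats):
        h = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        h = self.fusion(h, cpm_feats)
        return self.classifier(h[:, 0])                     # predict slot value from [CLS]
```

A learned gate lets the model decide, per token, how much of the injected linguistic signal to mix in, so the pretrained representations remain usable when the extra features are uninformative.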
Related papers
- Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which combines multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z)
- From Dialogue to Diagram: Task and Relationship Extraction from Natural Language for Accelerated Business Process Prototyping [0.0]
This paper introduces a contemporary solution; central to the approach is the use of dependency parsing and Named Entity Recognition (NER).
We utilize Subject-Verb-Object (SVO) constructs for identifying action relationships and integrate semantic analysis tools, including WordNet, for enriched contextual understanding (a minimal SVO-extraction sketch follows this entry).
The system adeptly handles data transformation and visualization, converting verbose extracted information into BPMN (Business Process Model and Notation) diagrams.
arXiv Detail & Related papers (2023-12-16T12:35:28Z)
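The entry above builds on dependency parsing, NER, and SVO constructs. A minimal, generic SVO extractor in spaCy (not the paper's actual pipeline; the dependency labels and the small English model are assumptions) might look like this:

```python
import spacy

# Assumes the model was installed via: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def extract_svo(text):
    """Collect (subject, verb, object) triples from dependency parses."""
    triples = []
    for sent in nlp(text).sents:
        for token in sent:
            if token.pos_ == "VERB":
                subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
                objects = [c for c in token.children if c.dep_ in ("dobj", "obj", "attr")]
                for s in subjects:
                    for o in objects:
                        triples.append((s.text, token.lemma_, o.text))
    return triples

print(extract_svo("The clerk reviews the invoice and the manager approves the payment."))
# e.g. [('clerk', 'review', 'invoice'), ('manager', 'approve', 'payment')]
```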
- Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weakly supervised data as a prompt for each sample (a brief retrieve-then-prompt sketch follows this entry).
arXiv Detail & Related papers (2022-10-19T16:40:28Z)
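The RAP abstract describes retrieving schema-aware references and prepending them as prompts. A heavily simplified sketch of that retrieve-then-prompt pattern, using TF-IDF similarity and an invented two-item reference pool (the actual retriever and prompt format are not specified in the summary above):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical reference pool: schema snippets paired with annotated examples.
references = [
    "schema (person, works_for, organization) | 'Alice joined Acme' -> (Alice, works_for, Acme)",
    "schema (organization, located_in, place) | 'Acme is based in Oslo' -> (Acme, located_in, Oslo)",
]
vectorizer = TfidfVectorizer().fit(references)

def build_rap_prompt(sample, k=1):
    """Retrieve the k most similar references and prepend them to the input."""
    sims = cosine_similarity(vectorizer.transform([sample]),
                             vectorizer.transform(references))[0]
    top = [references[i] for i in sims.argsort()[::-1][:k]]
    return "\n".join(top) + "\n" + sample

print(build_rap_prompt("Bob was hired by Globex"))
```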
- A Study on Prompt-based Few-Shot Learning Methods for Belief State Tracking in Task-oriented Dialog Systems [10.024834304960846]
We tackle the Dialogue Belief State Tracking problem of task-oriented conversational systems.
Recent approaches to this problem leveraging Transformer-based models have yielded great results.
We explore prompt-based few-shot learning for Dialogue Belief State Tracking.
arXiv Detail & Related papers (2022-04-18T05:29:54Z)
- Prompt Learning for Few-Shot Dialogue State Tracking [75.50701890035154]
This paper focuses on how to learn a dialogue state tracking (DST) model efficiently with limited labeled data.
We design a prompt learning framework for few-shot DST, which consists of two main components: a value-based prompt and an inverse prompt mechanism (a toy masked-LM prompt sketch follows this entry).
Experiments show that our model can generate unseen slots and outperforms existing state-of-the-art few-shot methods.
arXiv Detail & Related papers (2022-01-15T07:37:33Z)
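Both prompt-based DST entries above cast belief state tracking as filling a templated prompt. A toy illustration with a masked-LM pipeline follows; the template wording is invented, and this handles only single-token values, far short of the papers' full frameworks (which add, for example, an inverse prompt mechanism):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def track_slot(dialogue, slot):
    """Score candidate slot values by casting DST as masked-token prediction."""
    prompt = f"{dialogue} belief state : {slot} is [MASK] ."
    return [(p["token_str"], round(p["score"], 3)) for p in fill(prompt, top_k=5)]

print(track_slot("I need a cheap hotel in the north of town.", "hotel price range"))
```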
- On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations [11.558645364193486]
In this paper, we analyze the evolution of the embedded syntax trees along the fine-tuning process of BERT for six different tasks.
Experimental results show that the encoded information is forgotten (PoS tagging), reinforced (dependency and constituency parsing) or preserved (semantics-related tasks) in different ways along the fine-tuning process depending on the task.
arXiv Detail & Related papers (2021-01-27T15:41:09Z)
- Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing.
We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
arXiv Detail & Related papers (2020-06-22T21:56:14Z)
- Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT [29.04485839262945]
We propose a parameter-free probing technique for analyzing pre-trained language models (e.g., BERT).
Our method does not require direct supervision from the probing tasks, nor do we introduce additional parameters to the probing process.
Our experiments on BERT show that syntactic trees recovered from BERT using our method are significantly better than linguistically-uninformed baselines (a naive implementation sketch follows this entry).
arXiv Detail & Related papers (2020-04-30T14:02:29Z)
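The perturbed-masking idea can be restated as: mask token i and record its representation, then additionally mask each other token j and measure how far i's representation moves; large shifts indicate that j matters to i. A naive sketch with Hugging Face transformers (it costs on the order of n^2 forward passes, and extracting trees from the resulting matrix is omitted):

```python
import torch
from transformers import BertModel, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval()

def impact_matrix(sentence):
    """mat[i, j]: how much token i's representation moves when token j
    is masked in addition to token i (two-pass perturbed masking)."""
    ids = tok(sentence, return_tensors="pt")["input_ids"][0]
    n = len(ids)
    mat = torch.zeros(n, n)
    with torch.no_grad():
        for i in range(1, n - 1):                      # skip [CLS] and [SEP]
            one = ids.clone(); one[i] = tok.mask_token_id
            h_one = model(one.unsqueeze(0)).last_hidden_state[0, i]
            for j in range(1, n - 1):
                if i == j:
                    continue
                two = one.clone(); two[j] = tok.mask_token_id
                h_two = model(two.unsqueeze(0)).last_hidden_state[0, i]
                mat[i, j] = torch.dist(h_one, h_two)   # Euclidean distance
    return mat, tok.convert_ids_to_tokens(ids)
```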
- A Dependency Syntactic Knowledge Augmented Interactive Architecture for End-to-End Aspect-based Sentiment Analysis [73.74885246830611]
We propose a novel dependency syntactic knowledge augmented interactive architecture with multi-task learning for end-to-end ABSA.
This model is capable of fully exploiting the syntactic knowledge (dependency relations and types) by leveraging a well-designed Dependency Relation Embedded Graph Convolutional Network (DreGcn).
Extensive experimental results on three benchmark datasets demonstrate the effectiveness of our approach (a minimal GCN-layer sketch follows this entry).
arXiv Detail & Related papers (2020-04-04T14:59:32Z)
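DreGcn additionally embeds dependency relation types; the sketch below is only a bare, untyped graph-convolution layer over a dependency adjacency matrix, to show the message-passing step such architectures build on:

```python
import torch
import torch.nn as nn

class DepGCNLayer(nn.Module):
    """One graph-convolution layer over a dependency-parse adjacency matrix."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (batch, seq_len, dim); adj: (batch, seq_len, seq_len), 1 where an arc exists
        deg = adj.sum(-1, keepdim=True).clamp(min=1)   # normalize by node degree
        return torch.relu(self.linear(adj @ h) / deg)

# Toy usage: 4 tokens with self-loops plus one arc from a tiny parse.
h = torch.randn(1, 4, 16)
adj = torch.eye(4).unsqueeze(0)
adj[0, 1, 0] = adj[0, 0, 1] = 1.0
out = DepGCNLayer(16)(h, adj)
print(out.shape)  # torch.Size([1, 4, 16])
```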
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.