A Short Survey of Pre-trained Language Models for Conversational AI - A New Age in NLP
- URL: http://arxiv.org/abs/2104.10810v1
- Date: Thu, 22 Apr 2021 01:00:56 GMT
- Title: A Short Survey of Pre-trained Language Models for Conversational AI - A New Age in NLP
- Authors: Munazza Zaib and Quan Z. Sheng and Wei Emma Zhang
- Abstract summary: Recently introduced pre-trained language models have the potential to address the issue of data scarcity.
These models have been shown to capture different facets of language such as hierarchical relations, long-term dependencies, and sentiment.
This paper intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems.
- Score: 17.10418053437171
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building a dialogue system that can communicate naturally with humans is a
challenging yet interesting problem of agent-based computing. The rapid growth
in this area is usually hindered by the long-standing problem of data scarcity
as these systems are expected to learn syntax, grammar, decision making, and
reasoning from insufficient amounts of task-specific data. The recently
introduced pre-trained language models have the potential to address the issue
of data scarcity and bring considerable advantages by generating contextualized
word embeddings. These models are considered the counterpart of ImageNet in
NLP and have been shown to capture different facets of language such as
hierarchical relations, long-term dependencies, and sentiment. In this short
survey paper, we
discuss the recent progress made in the field of pre-trained language models.
We also discuss how the strengths of these language models can be
leveraged in designing more engaging and more eloquent conversational agents.
This paper, therefore, intends to establish whether these pre-trained models
can overcome the challenges pertinent to dialogue systems, and how their
architecture could be exploited to overcome these challenges. Open
challenges in the field of dialogue systems are also discussed.
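As a concrete illustration of the contextualized word embeddings the abstract refers to, here is a minimal sketch, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is prescribed by the paper): the same surface word receives a different vector depending on its sentence context.

```python
# Minimal sketch: contextualized word embeddings from a pre-trained LM.
# Assumes Hugging Face `transformers` and `bert-base-uncased`; the survey
# does not prescribe a specific library or checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "She sat on the bank of the river.",    # "bank" = riverside
    "He deposited the check at the bank.",  # "bank" = institution
]
batch = tokenizer(sentences, return_tensors="pt", padding=True)
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (2, seq_len, 768)

# Locate the "bank" token in each sentence and compare its two vectors.
vecs = []
for i in range(2):
    tokens = tokenizer.convert_ids_to_tokens(batch["input_ids"][i].tolist())
    vecs.append(hidden[i, tokens.index("bank")])

sim = torch.cosine_similarity(vecs[0], vecs[1], dim=0)
print(f"cosine similarity of the two 'bank' embeddings: {sim:.3f}")
# A static (word2vec-style) embedding would give exactly 1.0 here;
# a contextual model yields a noticeably lower similarity.
```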
Related papers
- Data Augmentation for Conversational AI [17.48107304359591]
Data augmentation (DA) is an effective approach to alleviate the data scarcity problem in conversational systems.
This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems.
arXiv Detail & Related papers (2023-09-09T09:56:35Z)
- Neural Conversation Models and How to Rein Them in: A Survey of Failures and Fixes [17.489075240435348]
Recent conditional language models are able to continue any kind of text source in an often seemingly fluent way.
From a linguistic perspective, however, the demands of contributing to a conversation are high.
Recent approaches try to tame the underlying language models at various intervention points.
arXiv Detail & Related papers (2023-08-11T12:07:45Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that can see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
Models that learn to bridge the gap between such modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompting capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene, or manipulating a robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
- Opportunities and Challenges in Neural Dialog Tutoring [54.07241332881601]
We rigorously analyze various generative language models on two dialog tutoring datasets for language learning.
We find that although current approaches can model tutoring in constrained learning scenarios, they perform poorly in less constrained scenarios.
Our human quality evaluation shows that both models and ground-truth annotations exhibit low performance in terms of equitable tutoring.
arXiv Detail & Related papers (2023-01-24T11:00:17Z)
- Robustness Testing of Language Understanding in Dialog Systems [33.30143655553583]
We conduct comprehensive evaluation and analysis with respect to the robustness of natural language understanding models.
We introduce three important aspects related to language understanding in real-world dialog systems, namely, language variety, speech characteristics, and noise perturbation.
We propose a model-agnostic toolkit LAUG to approximate natural perturbation for testing the robustness issues in dialog systems.
arXiv Detail & Related papers (2020-12-30T18:18:47Z)
- Neurosymbolic AI for Situated Language Understanding [13.249453757295083]
We argue that computational situated grounding provides a solution to some of these learning challenges.
Our model reincorporates some ideas of classic AI into a framework of neurosymbolic intelligence.
We discuss how situated grounding provides diverse data and multiple levels of modeling for a variety of AI learning challenges.
arXiv Detail & Related papers (2020-12-05T05:03:28Z)
- Knowledge Injection into Dialogue Generation via Language Models [85.65843021510521]
InjK is a two-stage approach to inject knowledge into a dialogue generation model.
First, we train a large-scale language model and query it as textual knowledge.
Second, we frame a dialogue generation model to sequentially generate textual knowledge and a corresponding response (a minimal sketch of this two-stage pattern appears after this list).
arXiv Detail & Related papers (2020-04-30T07:31:24Z)
- Low-Resource Knowledge-Grounded Dialogue Generation [74.09352261943913]
We consider knowledge-grounded dialogue generation under a natural assumption that only limited training examples are available.
We devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model.
With only 1/8 of the training data, our model achieves state-of-the-art performance and generalizes well to out-of-domain knowledge.
arXiv Detail & Related papers (2020-02-24T16:20:32Z)
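As referenced in the knowledge-injection entry above, the two-stage pattern of first querying a language model for textual knowledge and then conditioning response generation on that knowledge can be sketched as follows. This is a hypothetical illustration, not the authors' InjK implementation: the gpt2 checkpoint, the Hugging Face pipeline usage, and the prompt formats are all assumptions made here for concreteness.

```python
# Hypothetical two-stage knowledge-injected dialogue generation:
# (1) query a pre-trained LM for textual knowledge, then
# (2) generate a response conditioned on context + knowledge.
# NOT the InjK implementation; checkpoint and prompts are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

def respond(dialogue_context: str) -> str:
    # Stage 1: elicit textual knowledge relevant to the dialogue context.
    knowledge_prompt = f"Dialogue: {dialogue_context}\nRelevant fact:"
    out = generator(knowledge_prompt, max_new_tokens=30, do_sample=False)
    knowledge = out[0]["generated_text"][len(knowledge_prompt):].strip()

    # Stage 2: generate a response conditioned on context and knowledge.
    response_prompt = (
        f"Dialogue: {dialogue_context}\nFact: {knowledge}\nResponse:"
    )
    out = generator(response_prompt, max_new_tokens=40, do_sample=False)
    return out[0]["generated_text"][len(response_prompt):].strip()

print(respond("User: Who wrote Hamlet?"))
```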
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.