Data Augmentation for Conversational AI
- URL: http://arxiv.org/abs/2309.04739v2
- Date: Sat, 2 Mar 2024 23:14:47 GMT
- Title: Data Augmentation for Conversational AI
- Authors: Heydar Soudani, Evangelos Kanoulas and Faegheh Hasibi
- Abstract summary: Data augmentation (DA) is an effective approach to alleviate the data scarcity problem in conversational systems.
This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems.
- Score: 17.48107304359591
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advancements in conversational systems have revolutionized information
access, surpassing the limitations of single queries. However, developing
dialogue systems requires a large amount of training data, which is a challenge
in low-resource domains and languages. Traditional data collection methods like
crowd-sourcing are labor-intensive and time-consuming, making them ineffective
in this context. Data augmentation (DA) is an effective approach to alleviate
the data scarcity problem in conversational systems. This tutorial provides a
comprehensive and up-to-date overview of DA approaches in the context of
conversational systems. It highlights recent advances in conversation
augmentation, open domain and task-oriented conversation generation, and
different paradigms of evaluating these models. We also discuss current
challenges and future directions to help researchers and practitioners
further advance the field.
Related papers
- WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain.
These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech.
Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
- A Survey on Recent Advances in Conversational Data Generation [14.237954885530396]
We offer a systematic and comprehensive review of multi-turn conversational data generation.
We focus on three types of dialogue systems: open domain, task-oriented, and information-seeking.
We examine the evaluation metrics and methods for assessing synthetic conversational data.
arXiv Detail & Related papers (2024-05-12T10:11:12Z)
- AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z)
- Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation [53.87485260058957]
We study video-grounded dialogue generation, where a response is generated based on the dialogue context and the associated video.
The primary challenges of this task lie in (1) the difficulty of integrating video data into pre-trained language models (PLMs)
We propose a multi-agent reinforcement learning method to collaboratively perform reasoning on different modalities.
arXiv Detail & Related papers (2022-10-22T14:45:29Z)
- Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning [35.67318830455459]
We develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversational skill at scale.
Our work pairs the succinct embedding of the conversation state generated using SOTA (supervised) language models with RL techniques that are particularly suited to a dynamic action space.
arXiv Detail & Related papers (2022-07-25T16:12:33Z)
- End-to-end Spoken Conversational Question Answering: Task, Dataset and Model [92.18621726802726]
In spoken question answering, the systems are designed to answer questions from contiguous text spans within the related speech transcripts.
We propose a new Spoken Conversational Question Answering task (SCQA), aiming at enabling the systems to model complex dialogue flows.
Our main objective is to build the system to deal with conversational questions based on the audio recordings, and to explore the plausibility of providing more cues from different modalities with systems in information gathering.
arXiv Detail & Related papers (2022-04-29T17:56:59Z)
- Automatic Evaluation and Moderation of Open-domain Dialogue Systems [59.305712262126264]
A long-standing challenge for researchers is the lack of effective automatic evaluation metrics.
This paper describes the data, baselines, and results obtained for Track 5 of the Dialogue System Technology Challenge 10 (DSTC10).
arXiv Detail & Related papers (2021-11-03T10:08:05Z)
- Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z)
- A Short Survey of Pre-trained Language Models for Conversational AI-A NewAge in NLP [17.10418053437171]
Recently introduced pre-trained language models have the potential to address the issue of data scarcity.
These models have been shown to capture different facets of language such as hierarchical relations, long-term dependency, and sentiment.
This paper intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems.
arXiv Detail & Related papers (2021-04-22T01:00:56Z)
- A Simple But Effective Approach to n-shot Task-Oriented Dialogue Augmentation [32.43362825854633]
We introduce a framework that creates synthetic task-oriented dialogues in a fully automatic manner.
Our framework uses the simple idea that each turn-pair in a task-oriented dialogue has a certain function.
We observe significant improvements in the fine-tuning scenarios in several domains.
arXiv Detail & Related papers (2021-02-27T18:55:12Z)