Related papers: Data Augmentation for Conversational AI

Data Augmentation for Conversational AI

URL: http://arxiv.org/abs/2309.04739v2
Date: Sat, 2 Mar 2024 23:14:47 GMT
Title: Data Augmentation for Conversational AI
Authors: Heydar Soudani, Evangelos Kanoulas and Faegheh Hasibi
Abstract summary: Data augmentation (DA) is an affective approach to alleviate the data scarcity problem in conversational systems. This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems.
Score: 17.48107304359591
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Advancements in conversational systems have revolutionized information access, surpassing the limitations of single queries. However, developing dialogue systems requires a large amount of training data, which is a challenge in low-resource domains and languages. Traditional data collection methods like crowd-sourcing are labor-intensive and time-consuming, making them ineffective in this context. Data augmentation (DA) is an affective approach to alleviate the data scarcity problem in conversational systems. This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems. It highlights recent advances in conversation augmentation, open domain and task-oriented conversation generation, and different paradigms of evaluating these models. We also discuss current challenges and future directions in order to help researchers and practitioners to further advance the field in this area.

Related papers

From Reviews to Dialogues: Active Synthesis for Zero-Shot LLM-based Conversational Recommender System [49.57258257916805]
Large Language Models (LLMs) demonstrate strong zero-shot recommendation capabilities. Practical applications often favor smaller, internally managed recommender models due to scalability, interpretability, and data privacy constraints. We propose an active data augmentation framework that synthesizes conversational training data by leveraging black-box LLMs guided by active learning techniques.
arXiv Detail & Related papers (2025-04-21T23:05:47Z)
WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain. These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech. Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
A Survey on Recent Advances in Conversational Data Generation [14.237954885530396]
We offer a systematic and comprehensive review of multi-turn conversational data generation. We focus on three types of dialogue systems: open domain, task-oriented, and information-seeking. We examine the evaluation metrics and methods for assessing synthetic conversational data.
arXiv Detail & Related papers (2024-05-12T10:11:12Z)
AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation. Specifically, we formulate the conversation generation problem as a language modeling task. We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z)
Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation [53.87485260058957]
We study video-grounded dialogue generation, where a response is generated based on the dialogue context and the associated video. The primary challenges of this task lie in (1) the difficulty of integrating video data into pre-trained language models (PLMs) We propose a multi-agent reinforcement learning method to collaboratively perform reasoning on different modalities.
arXiv Detail & Related papers (2022-10-22T14:45:29Z)
Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning [35.67318830455459]
We develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversational skill at scale. Our work pairs the succinct embedding of the conversation state generated using SOTA (supervised) language models with RL techniques that are particularly suited to a dynamic action space.
arXiv Detail & Related papers (2022-07-25T16:12:33Z)
End-to-end Spoken Conversational Question Answering: Task, Dataset and Model [92.18621726802726]
In spoken question answering, the systems are designed to answer questions from contiguous text spans within the related speech transcripts. We propose a new Spoken Conversational Question Answering task (SCQA), aiming at enabling the systems to model complex dialogue flows. Our main objective is to build the system to deal with conversational questions based on the audio recordings, and to explore the plausibility of providing more cues from different modalities with systems in information gathering.
arXiv Detail & Related papers (2022-04-29T17:56:59Z)
Automatic Evaluation and Moderation of Open-domain Dialogue Systems [59.305712262126264]
A long standing challenge that bothers the researchers is the lack of effective automatic evaluation metrics. This paper describes the data, baselines and results obtained for the Track 5 at the Dialogue System Technology Challenge 10 (DSTC10)
arXiv Detail & Related papers (2021-11-03T10:08:05Z)
Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents. We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z)
A Short Survey of Pre-trained Language Models for Conversational AI-A NewAge in NLP [17.10418053437171]
Recently introduced pre-trained language models have the potential to address the issue of data scarcity. These models have demonstrated to capture different facets of language such as hierarchical relations, long-term dependency, and sentiment. This paper intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems.
arXiv Detail & Related papers (2021-04-22T01:00:56Z)
A Simple But Effective Approach to n-shot Task-Oriented Dialogue Augmentation [32.43362825854633]
We introduce a framework that creates synthetic task-oriented dialogues in a fully automatic manner. Our framework uses the simple idea that each turn-pair in a task-oriented dialogue has a certain function. We observe significant improvements in the fine-tuning scenarios in several domains.
arXiv Detail & Related papers (2021-02-27T18:55:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.