Related papers: Extracting Training Dialogue Data from Large Language Model based Task Bots

URL: http://arxiv.org/abs/2603.01550v1
Date: Mon, 02 Mar 2026 07:25:04 GMT
Title: Extracting Training Dialogue Data from Large Language Model based Task Bots
Authors: Shuo Zhang, Junzhou Zhao, Junji Hou, Pinghui Wang, Chenxu Wang, Jing Tao,
Abstract summary: Large Language Models (LLMs) have been widely adopted to enhance Task-Oriented Dialogue Systems (TODS). LLMs function as soft knowledge bases that compress extensive training data into rich knowledge representations. LLMs can inadvertently memorize training dialogue data containing identifiable information such as phone numbers.
Score: 23.896561852220984
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have been widely adopted to enhance Task-Oriented Dialogue Systems (TODS) by modeling complex language patterns and delivering contextually appropriate responses. However, this integration introduces significant privacy risks, as LLMs, functioning as soft knowledge bases that compress extensive training data into rich knowledge representations, can inadvertently memorize training dialogue data containing not only identifiable information such as phone numbers but also entire dialogue-level events like complete travel schedules. Despite the critical nature of this privacy concern, how LLM memorization is inherited in developing task bots remains unexplored. In this work, we address this gap through a systematic quantitative study that involves evaluating existing training data extraction attacks, analyzing key characteristics of task-oriented dialogue modeling that render existing methods ineffective, and proposing novel attack techniques tailored for LLM-based TODS that enhance both response sampling and membership inference. Experimental results demonstrate the effectiveness of our proposed data extraction attack. Our method can extract thousands of training labels of dialogue states with best-case precision exceeding 70%. Furthermore, we provide an in-depth analysis of training data memorization in LLM-based TODS by identifying and quantifying key influencing factors and discussing targeted mitigation strategies.

Related papers

On the Effectiveness of Membership Inference in Targeted Data Extraction from Large Language Models [3.1988753364712115]
Large Language Models (LLMs) are prone to memorizing training data, which poses serious privacy risks. In this study, we integrate multiple MIA techniques into the data extraction pipeline to systematically benchmark their effectiveness.
arXiv Detail & Related papers (2025-12-15T14:05:49Z)
The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation [97.0658685969199]
Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This paper synthesizes recent studies and investigates the landscape of memorization, the factors influencing it, and methods for its detection and mitigation.
arXiv Detail & Related papers (2025-07-08T01:30:46Z)
Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models [79.90523648823522]
Multi-stage continual learning can lead to catastrophic forgetting. This paper evaluates three mitigation strategies: model merging, discounting the LoRA scaling factor, and experience replay. Results show that experience replay is the most effective, with further gains achieved by combining it with other methods.
arXiv Detail & Related papers (2025-05-23T05:50:14Z)
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation [88.78166077081912]
We introduce a multimodal unlearning benchmark, UnLOK-VQA, and an attack-and-defense framework to evaluate methods for deleting specific multimodal knowledge from MLLMs. Our results show multimodal attacks outperform text- or image-only ones, and that the most effective defense removes answer information from internal model states.
arXiv Detail & Related papers (2025-05-01T01:54:00Z)
From Reviews to Dialogues: Active Synthesis for Zero-Shot LLM-based Conversational Recommender System [49.57258257916805]
Large Language Models (LLMs) demonstrate strong zero-shot recommendation capabilities. Practical applications often favor smaller, internally managed recommender models due to scalability, interpretability, and data privacy constraints. We propose an active data augmentation framework that synthesizes conversational training data by leveraging black-box LLMs guided by active learning techniques.
arXiv Detail & Related papers (2025-04-21T23:05:47Z)
Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement [51.601916604301685]
Large language models (LLMs) generate content that can undermine trust in online discourse. Current methods often focus on binary classification, failing to address the complexities of real-world scenarios like human-LLM collaboration. To move beyond binary classification and address these challenges, we propose a new paradigm for detecting LLM-generated content.
arXiv Detail & Related papers (2024-10-18T08:14:10Z)
Undesirable Memorization in Large Language Models: A Survey [5.659933808910005]
Memorization refers to a model's tendency to store and reproduce phrases from its training data. This paper provides a taxonomy of the literature on LLM memorization, exploring it across three dimensions: granularity, retrievability, and desirability. We conclude our survey by identifying potential research topics for the near future, including methods to balance privacy and performance.
arXiv Detail & Related papers (2024-10-03T16:34:46Z)
A Universal Prompting Strategy for Extracting Process Model Information from Natural Language Text using Large Language Models [0.8899670429041453]
We show that generative large language models (LLMs) can solve NLP tasks with very high quality without the need for extensive data. Based on a novel prompting strategy, we show that LLMs are able to outperform state-of-the-art machine learning approaches.
arXiv Detail & Related papers (2024-07-26T06:39:35Z)
C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations. Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
Quantifying and Analyzing Entity-level Memorization in Large Language Models [4.59914731734176]
Large language models (LLMs) have been proven capable of memorizing their training data. Privacy risks arising from memorization have attracted increasing attention. We propose a fine-grained, entity-level definition to quantify memorization with conditions and metrics closer to real-world scenarios.
arXiv Detail & Related papers (2023-08-30T03:06:47Z)
Mitigating Approximate Memorization in Language Models via Dissimilarity Learned Policy [0.0]
Large Language Models (LLMs) are trained on large amounts of data. LLMs have been shown to memorize parts of their training data and emit them verbatim when an adversary prompts them appropriately.
arXiv Detail & Related papers (2023-05-02T15:53:28Z)
Data Augmentation with Paraphrase Generation and Entity Extraction for Multimodal Dialogue System [9.912419882236918]
We are working towards a multimodal dialogue system for younger kids learning basic math concepts. This work explores the potential benefits of data augmentation with paraphrase generation for the Natural Language Understanding module of the Spoken Dialogue Systems pipeline. We have shown that paraphrasing with model-in-the-loop (MITL) strategies using small seed data is a promising approach yielding improved performance results for the Intent Recognition task.
arXiv Detail & Related papers (2022-05-09T02:21:20Z)
Low-Resource Knowledge-Grounded Dialogue Generation [74.09352261943913]
We consider knowledge-grounded dialogue generation under a natural assumption that only limited training examples are available. We devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model. With only 1/8 of the training data, our model can achieve state-of-the-art performance and generalize well to out-of-domain knowledge.
arXiv Detail & Related papers (2020-02-24T16:20:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.