Privacy-Preserving Domain Adaptation of Semantic Parsers
- URL: http://arxiv.org/abs/2212.10520v3
- Date: Thu, 8 Jun 2023 18:03:43 GMT
- Title: Privacy-Preserving Domain Adaptation of Semantic Parsers
- Authors: Fatemehsadat Mireshghallah, Yu Su, Tatsunori Hashimoto, Jason Eisner,
Richard Shin
- Abstract summary: We propose a two-stage Differentially Private (DP) generation method which first generates latent semantic parses, and then generates utterances based on the parses.
Our proposed approach improves MAUVE by 2.5$\times$ and parse tree function type overlap by 1.3$\times$ relative to current approaches for private synthetic data generation.
- Score: 44.266262213139534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Task-oriented dialogue systems often assist users with personal or
confidential matters. For this reason, the developers of such a system are
generally prohibited from observing actual usage. So how can they know where
the system is failing and needs more training data or new functionality? In
this work, we study ways in which realistic user utterances can be generated
synthetically, to help increase the linguistic and functional coverage of the
system, without compromising the privacy of actual users. To this end, we
propose a two-stage Differentially Private (DP) generation method which first
generates latent semantic parses, and then generates utterances based on the
parses. Our proposed approach improves MAUVE by 2.5$\times$ and parse tree
function type overlap by 1.3$\times$ relative to current approaches for private
synthetic data generation, improving both on fluency and semantic coverage. We
further validate our approach on a realistic domain adaptation task of adding
new functionality from private user data to a semantic parser, and show overall
gains of 8.5 percentage points in accuracy with the new feature.
Related papers
- IDT: Dual-Task Adversarial Attacks for Privacy Protection [8.312362092693377]
Methods to protect privacy can involve using representations inside models that are demonstrated not to detect sensitive attributes.
We propose IDT, a method that analyses predictions made by auxiliary and interpretable models to identify which tokens are important to change.
We evaluate different datasets for NLP suitable for different tasks.
arXiv Detail & Related papers (2024-06-28T04:14:35Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - Learn What You Need in Personalized Federated Learning [53.83081622573734]
$\textit{Learn2pFed}$ is a novel algorithm-unrolling-based personalized federated learning framework.
We show that $\textit{Learn2pFed}$ significantly outperforms previous personalized federated learning methods.
arXiv Detail & Related papers (2024-01-16T12:45:15Z) - When approximate design for fast homomorphic computation provides
differential privacy guarantees [0.08399688944263842]
Differential privacy (DP) and cryptographic primitives are popular countermeasures against privacy attacks.
In this paper, we design SHIELD, a probabilistic approximation algorithm for the argmax operator.
Even if SHIELD could have other applications, we here focus on one setting and seamlessly integrate it in the SPEED collaborative training framework.
arXiv Detail & Related papers (2023-04-06T09:38:01Z) - FedPC: Federated Learning for Language Generation with Personal and
Context Preference Embeddings [10.235620939242505]
Federated learning is a training paradigm that learns from multiple distributed users without aggregating data on a centralized server.
We propose a new direction for personalization research within federated learning, leveraging both personal embeddings and shared context embeddings.
We present an approach to predict these "preference" embeddings, enabling personalization without backpropagation.
arXiv Detail & Related papers (2022-10-07T18:01:19Z) - Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy, however, when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z) - On The Ingredients of an Effective Zero-shot Semantic Parser [95.01623036661468]
We analyze zero-shot learning by paraphrasing training examples of canonical utterances and programs from a grammar.
We propose bridging these gaps using improved grammars, stronger paraphrasers, and efficient learning methods.
Our model achieves strong performance on two semantic parsing benchmarks (Scholar, Geo) with zero labeled data.
arXiv Detail & Related papers (2021-10-15T21:41:16Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - An Imitation Game for Learning Semantic Parsers from User Interaction [43.66945504686796]
We suggest an alternative, human-in-the-loop methodology for learning semantic annotations directly from users.
A semantic parser should be introspective and prompt for user demonstration when uncertain.
In doing so it also gets to imitate the user behavior and continue improving itself autonomously.
arXiv Detail & Related papers (2020-05-02T03:30:49Z)