Towards Differential Relational Privacy and its use in Question
Answering
- URL: http://arxiv.org/abs/2203.16701v1
- Date: Wed, 30 Mar 2022 22:59:24 GMT
- Title: Towards Differential Relational Privacy and its use in Question
Answering
- Authors: Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang,
Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan,
Stefano Soatto
- Abstract summary: Memorization of the relation between entities in a dataset can lead to privacy issues when using a trained question answering model.
We quantify this phenomenon and provide a possible definition of Differential Relational Privacy (DrP).
We illustrate these concepts in experiments with large-scale models for Question Answering.
- Score: 109.4452196071872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Memorization of the relation between entities in a dataset can lead to
privacy issues when using a trained model for question answering. We introduce
Relational Memorization (RM) to understand, quantify and control this
phenomenon. While bounding general memorization can have detrimental effects on
the performance of a trained model, bounding RM does not prevent effective
learning. The difference is most pronounced when the data distribution is
long-tailed, with many queries having only a few training examples: impeding
general memorization prevents effective learning, while impeding only
relational memorization still allows learning general properties of the
underlying concepts. We formalize the notion of Relational Privacy (RP) and,
inspired by Differential Privacy (DP), we provide a possible definition of
Differential Relational Privacy (DrP). These notions can be used to describe
and compute bounds on the amount of RM in a trained model. We illustrate
Relational Privacy concepts in experiments with large-scale models for Question
Answering.
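For context, the record-level guarantee that DrP is modeled on can be written out explicitly; the second display below is only one plausible way to phrase a relational analogue, since the paper's exact definition of DrP is not reproduced in this abstract.

```latex
% Standard \epsilon-differential privacy: for a training algorithm A,
% any two datasets D, D' differing in a single record, and any set S
% of possible output models,
\Pr[\mathcal{A}(D) \in S] \;\le\; e^{\epsilon}\,\Pr[\mathcal{A}(D') \in S].

% A plausible relational analogue (an assumption, not the paper's
% verbatim definition): require the same inequality for datasets
% D \sim_{\mathrm{rel}} D' that differ only in the relation between one
% pair of entities, while both entities remain present in each dataset:
\Pr[\mathcal{A}(D) \in S] \;\le\; e^{\epsilon}\,\Pr[\mathcal{A}(D') \in S]
\qquad \text{for all } D \sim_{\mathrm{rel}} D'.
```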
Related papers
- Rethinking LLM Memorization through the Lens of Adversarial Compression [93.13830893086681]
Large language models (LLMs) trained on web-scale datasets raise substantial concerns regarding permissible data usage.
One major question is whether these models "memorize" all their training data or whether they integrate many data sources in a way more akin to how a human would learn and synthesize information.
We propose the Adversarial Compression Ratio (ACR) as a metric for assessing memorization in LLMs.
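A minimal sketch of the compression idea behind ACR, assuming a string counts as memorized when a prompt shorter than the string reliably elicits it; `candidate_prompts` and the `elicits` callable are hypothetical stand-ins for the paper's actual adversarial prompt search.

```python
def adversarial_compression_ratio(target_tokens, candidate_prompts, elicits):
    """Illustrative ACR: length of the target divided by the length of the
    shortest prompt that makes the model reproduce it. elicits(prompt) is a
    hypothetical callable returning True if the model emits target_tokens."""
    shortest = min(
        (len(p) for p in candidate_prompts if elicits(p)),
        default=None,
    )
    if shortest is None:
        return 0.0  # nothing elicits the target; treat as not memorized
    return len(target_tokens) / shortest

# A ratio above 1 means the target is elicitable by a prompt shorter than
# itself, which the paper reads as evidence of memorization.
```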
arXiv Detail & Related papers (2024-04-23T15:49:37Z)
Unveiling Privacy, Memorization, and Input Curvature Links [11.290935303784208]
Memorization is closely related to several concepts such as generalization, noisy learning, and privacy.
Recent research has shown evidence linking input loss curvature (measured by the trace of the loss Hessian w.r.t. inputs) and memorization.
We extend our analysis to establish theoretical links between differential privacy, memorization, and input loss curvature.
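A minimal PyTorch sketch of how input loss curvature can be estimated; Hutchinson's trace estimator is a standard choice for approximating the trace of the input Hessian, though not necessarily the exact procedure used in the paper, and `model`/`loss_fn` are placeholders.

```python
import torch

def input_curvature(model, loss_fn, x, y, n_probes=8):
    """Hutchinson estimate of tr(d^2 loss / dx^2), the input-loss curvature
    that the paper links to memorization. Sketch only: model and loss_fn
    stand in for whatever network and loss are being probed."""
    x = x.detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    trace_est = 0.0
    for _ in range(n_probes):
        v = torch.randn_like(x)  # random probe; Rademacher also works
        # Hessian-vector product via double backprop
        hvp = torch.autograd.grad(grad, x, grad_outputs=v, retain_graph=True)[0]
        trace_est += (v * hvp).sum().item()  # v^T H v
    return trace_est / n_probes
```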
arXiv Detail & Related papers (2024-02-28T22:02:10Z)
SoK: Memorisation in machine learning [5.563171090433323]
Quantifying the impact of individual data samples on machine learning models is an open research problem.
In this work we unify a broad range of previous definitions and perspectives on memorisation in ML.
We discuss their interplay with model generalisation and the implications of these phenomena for data privacy.
arXiv Detail & Related papers (2023-11-06T12:59:18Z)
Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a major factor undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent behaviour issue by improving PLM awareness.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies with open-source and our own fine-tuned LMs across various tasks indicate that the degree of memorization differs strongly across fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z)
Bounding Information Leakage in Machine Learning [26.64770573405079]
This paper investigates fundamental bounds on information leakage.
We identify and bound the success rate of the worst-case membership inference attack.
We derive bounds on the mutual information between the sensitive attributes and model parameters.
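To make the bounded quantity concrete, here is the classic loss-threshold membership inference baseline (in the style of Yeom et al.); the paper bounds the worst case over all attacks, of which this simple attack is only an illustration, not the paper's construction.

```python
import numpy as np

def loss_threshold_mia(train_losses, test_losses, threshold):
    """Baseline membership inference attack: predict 'member' when the
    per-example loss falls below a threshold. Information-leakage bounds
    cap how well even the worst-case such attack can perform."""
    train_hits = np.mean(np.asarray(train_losses) < threshold)  # true positive rate
    test_hits = np.mean(np.asarray(test_losses) < threshold)    # false positive rate
    return train_hits - test_hits  # membership advantage
```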
arXiv Detail & Related papers (2021-05-09T08:49:14Z)
Learning with Instance Bundles for Reading Comprehension [61.823444215188296]
We introduce new supervision techniques that compare question-answer scores across multiple related instances.
Specifically, we normalize these scores across various neighborhoods of closely contrasting questions and/or answers.
We empirically demonstrate the effectiveness of training with instance bundles on two datasets.
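One plausible reading of normalizing scores across a neighborhood is a softmax over the bundle of contrasting instances, sketched below; the paper's exact neighborhood construction and normalization may differ.

```python
import torch
import torch.nn.functional as F

def bundle_nll(scores, gold_index):
    """Normalize raw question-answer scores over a bundle of closely
    contrasting instances and take the NLL of the correct pairing.
    scores is a 1-D tensor with one entry per (question, answer) instance
    in the bundle; gold_index marks the correct pairing."""
    log_probs = F.log_softmax(scores, dim=0)  # instances compete within the bundle
    return -log_probs[gold_index]
```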
arXiv Detail & Related papers (2021-04-18T06:17:54Z)
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting [100.75479161884935]
We propose a novel training paradigm called Remembering for the Right Reasons (RRR).
RRR stores visual model explanations for each example in the buffer and ensures the model has "the right reasons" for its predictions.
We demonstrate how RRR can be easily added to any memory or regularization-based approach and results in reduced forgetting.
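A minimal sketch of the RRR idea; using plain input-gradient saliency as the stored explanation is an assumption here, standing in for whichever explanation method the method is paired with.

```python
import torch

def rrr_penalty(model, loss_fn, x, y, stored_saliency):
    """Sketch of RRR: recompute an input-gradient saliency map for a
    replayed example and penalize its distance from the saliency stored
    when the example was first learned, so the model keeps making its
    prediction for 'the right reasons'."""
    x = x.detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    # create_graph=True keeps the penalty differentiable for training
    (saliency,) = torch.autograd.grad(loss, x, create_graph=True)
    return (saliency - stored_saliency).pow(2).mean()
```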
arXiv Detail & Related papers (2020-10-04T10:05:27Z)
Understanding Unintended Memorization in Federated Learning [5.32880378510767]
We show that different components of Federated Learning play an important role in reducing unintended memorization.
We also show that training with a strong user-level differential privacy guarantee results in models that exhibit the least amount of unintended memorization.
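For reference, a minimal sketch of user-level differentially private aggregation in the DP-FedAvg style (clip each user's whole update, average, add Gaussian noise calibrated to the clip norm); the hyperparameters and the exact mechanism in the paper may differ.

```python
import numpy as np

def dp_fedavg_aggregate(user_updates, clip_norm, noise_multiplier, rng=None):
    """Sketch of user-level DP aggregation: clipping bounds any single
    user's influence on the average, and Gaussian noise masks it."""
    rng = rng or np.random.default_rng()
    clipped = []
    for u in user_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))
    mean = np.mean(clipped, axis=0)
    # noise std for the average of n clipped user updates
    sigma = noise_multiplier * clip_norm / len(user_updates)
    return mean + rng.normal(0.0, sigma, size=mean.shape)
```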
arXiv Detail & Related papers (2020-06-12T22:10:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.