Privacy-preserving Deep Learning based Record Linkage
- URL: http://arxiv.org/abs/2211.02161v1
- Date: Thu, 3 Nov 2022 22:10:12 GMT
- Title: Privacy-preserving Deep Learning based Record Linkage
- Authors: Thilina Ranbaduge, Dinusha Vatsalan, Ming Ding
- Abstract summary: We propose the first deep learning-based multi-party privacy-preserving record linkage protocol.
In our approach, each database owner first trains a local deep learning model, which is then uploaded to a secure environment.
The global model is then used by a linkage unit to distinguish unlabelled record pairs as matches and non-matches.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning-based linkage of records across different databases is becoming
increasingly useful in data integration and mining applications to discover new
insights from multiple sources of data. However, due to privacy and
confidentiality concerns, organisations often are not willing or allowed to
share their sensitive data with any external parties, thus making it
challenging to build/train deep learning models for record linkage across
different organisations' databases. To overcome this limitation, we propose the
first deep learning-based multi-party privacy-preserving record linkage (PPRL)
protocol that can be used to link sensitive databases held by multiple
different organisations. In our approach, each database owner first trains a
local deep learning model, which is then uploaded to a secure environment and
securely aggregated to create a global model. The global model is then used by
a linkage unit to distinguish unlabelled record pairs as matches and
non-matches. We utilise differential privacy to achieve provable privacy
protection against re-identification attacks. We evaluate the linkage quality
and scalability of our approach using several large real-world databases,
showing that it can achieve high linkage quality while providing sufficient
privacy protection against existing attacks.
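
The workflow in the abstract (local training, aggregation into a global model, differentially private noise, then pair classification by a linkage unit) can be illustrated with a minimal sketch. This is not the authors' protocol. It assumes each local model is a flat weight vector, replaces secure aggregation with plain averaging, and uses a logistic scorer over per-attribute similarity values in place of a deep network. The function names and the `clip_norm`, `noise_std`, `bias`, and `threshold` parameters are hypothetical, and the noise level is a placeholder rather than one calibrated to a specific privacy budget.

```python
import math
import random


def clip_and_noise(weights, clip_norm, noise_std, rng):
    """Clip a weight vector to an L2 norm bound, then add Gaussian noise.

    Assumption: this stands in for the differential privacy step; the
    noise_std here is illustrative and not calibrated to any (epsilon, delta).
    """
    norm = math.sqrt(sum(w * w for w in weights))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [w * scale + rng.gauss(0.0, noise_std) for w in weights]


def aggregate(local_models):
    """Average the owners' weight vectors element-wise into a global model.

    Assumption: plain averaging replaces the secure aggregation step.
    """
    n = len(local_models)
    return [sum(ws) / n for ws in zip(*local_models)]


def classify_pair(global_weights, bias, similarities, threshold=0.5):
    """Label a record pair as a match (True) or non-match (False).

    The pair is represented by hypothetical per-attribute similarity scores.
    """
    z = bias + sum(w * s for w, s in zip(global_weights, similarities))
    prob = 1.0 / (1.0 + math.exp(-z))
    return prob >= threshold


rng = random.Random(0)

# Hypothetical weights that three database owners learned locally.
local_models = [[2.0, 1.5, 3.0], [1.8, 1.7, 2.6], [2.2, 1.3, 3.4]]

# Each owner perturbs its model before sharing it.
noisy_models = [
    clip_and_noise(m, clip_norm=5.0, noise_std=0.01, rng=rng)
    for m in local_models
]
global_model = aggregate(noisy_models)

# The linkage unit scores unlabelled pairs with the global model.
likely_match = classify_pair(global_model, bias=-3.0, similarities=[0.9, 0.8, 0.95])
likely_non_match = classify_pair(global_model, bias=-3.0, similarities=[0.1, 0.2, 0.1])
```

Averaging independently noised models reduces the noise variance in the global model by a factor equal to the number of owners, which is one reason aggregation and perturbation are often combined in this kind of design.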
Related papers
- Investigating Privacy Attacks in the Gray-Box Setting to Enhance Collaborative Learning Schemes [7.651569149118461]
We study privacy attacks in the gray-box setting, where the attacker has only limited access to the model.
We deploy SmartNNCrypt, a framework that tailors homomorphic encryption to protect the portions of the model posing higher privacy risks.
arXiv Detail & Related papers (2024-09-25T18:49:21Z)
- Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Text anonymization is crucial for sharing sensitive data while maintaining privacy.
Existing techniques face emerging challenges from the re-identification attack capability of large language models.
This paper proposes a framework composed of three LLM-based components -- a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- Federated Transfer Learning with Differential Privacy [21.50525027559563]
We formulate the notion of federated differential privacy, which offers privacy guarantees for each data set without assuming a trusted central server.
We show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy.
arXiv Detail & Related papers (2024-03-17T21:04:48Z)
- FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning [54.26614091429253]
Federated instruction tuning (FedIT) is a promising solution that consolidates collaborative training across multiple data owners.
FedIT encounters limitations such as scarcity of instructional data and risk of exposure to training data extraction attacks.
We propose FewFedPIT, designed to simultaneously enhance privacy protection and model performance of federated few-shot learning.
arXiv Detail & Related papers (2024-03-10T08:41:22Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- SPEED: Secure, PrivatE, and Efficient Deep learning [2.283665431721732]
We introduce a deep learning framework able to deal with strong privacy constraints.
Based on collaborative learning, differential privacy and homomorphic encryption, the proposed approach advances the state of the art.
arXiv Detail & Related papers (2020-06-16T19:31:52Z)
- Decentralised Learning from Independent Multi-Domain Labels for Person Re-Identification [69.29602103582782]
Deep learning has been successful for many computer vision tasks due to the availability of shared and centralised large-scale training data.
However, increasing awareness of privacy concerns poses new challenges to deep learning, especially for person re-identification (Re-ID).
We propose a novel paradigm called Federated Person Re-Identification (FedReID) to construct a generalisable global model (a central server) by simultaneously learning with multiple privacy-preserved local models (local clients).
This client-server collaborative learning process is iteratively performed under privacy control, enabling FedReID to realise decentralised learning without sharing distributed data nor collecting any
arXiv Detail & Related papers (2020-06-07T13:32:33Z)
- Secure Sum Outperforms Homomorphic Encryption in (Current) Collaborative Deep Learning [7.690774882108066]
We discuss methods for training neural networks on the joint data of different data owners that keep each party's input confidential.
We show that a less complex and computationally less expensive secure sum protocol exhibits superior properties in terms of both collusion-resistance and runtime.
arXiv Detail & Related papers (2020-06-02T23:03:32Z)
- Federating Recommendations Using Differentially Private Prototypes [16.29544153550663]
We propose a new federated approach to learning global and local private models for recommendation without collecting raw data.
By requiring only two rounds of communication, we both reduce the communication costs and avoid the excessive privacy loss.
We show local adaptation of the global model allows our method to outperform centralized matrix-factorization-based recommender system models.
arXiv Detail & Related papers (2020-03-01T22:21:31Z)