TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework
for Deep Learning with Anonymized Intermediate Representations
- URL: http://arxiv.org/abs/2005.11480v7
- Date: Tue, 25 Aug 2020 01:36:06 GMT
- Authors: Ang Li, Yixiao Duan, Huanrui Yang, Yiran Chen, Jianlei Yang
- Abstract summary: We present TIPRDC, a task-independent privacy-respecting data crowdsourcing framework with anonymized intermediate representation.
The goal of this framework is to learn a feature extractor that hides private information in the intermediate representations while maximally retaining the original information embedded in the raw data, so the data collector can accomplish unknown learning tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of deep learning partially benefits from the availability of
various large-scale datasets. These datasets are often crowdsourced from
individual users and contain private information like gender, age, etc. The
emerging privacy concerns from users on data sharing hinder the generation or
use of crowdsourcing datasets and lead to hunger of training data for new deep
learning applications. One naïve solution is to pre-process the raw data
to extract features at the user-side, and then only the extracted features will
be sent to the data collector. Unfortunately, attackers can still exploit these
extracted features to train an adversary classifier to infer private
attributes. Some prior works leveraged game theory to protect private
attributes. However, these defenses are designed for known primary learning
tasks, so the extracted features work poorly for unknown learning tasks. To tackle
the case where the learning task may be unknown or changing, we present TIPRDC,
a task-independent privacy-respecting data crowdsourcing framework with
anonymized intermediate representation. The goal of this framework is to learn
a feature extractor that can hide the privacy information from the intermediate
representations; while maximally retaining the original information embedded in
the raw data for the data collector to accomplish unknown learning tasks. We
design a hybrid training method to learn the anonymized intermediate
representation: (1) an adversarial training process for hiding private
information from features; (2) maximally retain original information using a
neural-network-based mutual information estimator.
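The hybrid objective can be sketched numerically. In the paper, the mutual information estimator is a trained neural network (MINE-style); in the minimal sketch below the statistics-network outputs are fixed toy numbers, and the names `dv_bound`, `hybrid_objective`, and the weight `lam` are illustrative assumptions, not the paper's actual implementation.

```python
import math

def dv_bound(t_joint, t_marginal):
    """Donsker-Varadhan lower bound on mutual information:
    I(X; Z) >= E_P[T] - log E_Q[exp(T)],
    where t_joint holds statistics-network outputs on joint samples and
    t_marginal holds outputs on samples from the product of marginals."""
    e_p = sum(t_joint) / len(t_joint)
    e_q = sum(math.exp(t) for t in t_marginal) / len(t_marginal)
    return e_p - math.log(e_q)

def hybrid_objective(adv_loss, mi_estimate, lam=0.5):
    """Sketch of the feature extractor's training objective: maximize the
    adversary's loss (hide private attributes) while maximizing the mutual
    information estimate between raw data and features (retain utility).
    Returned as a single quantity to maximize."""
    return lam * adv_loss + (1.0 - lam) * mi_estimate

# Toy statistics-network outputs (hypothetical numbers for illustration).
mi_hat = dv_bound([1.2, 0.8, 1.0], [0.1, -0.2, 0.3])
obj = hybrid_objective(adv_loss=0.7, mi_estimate=mi_hat, lam=0.5)
```

In training, both terms would be differentiated through the feature extractor, so the two pressures (hiding private attributes, keeping raw-data information) are balanced by the single weight.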
Related papers
- Federated Face Forgery Detection Learning with Personalized Representation [63.90408023506508]
Deep generator technology can produce high-quality fake videos that are indistinguishable, posing a serious social threat.
Traditional forgery detection methods train directly on centralized data.
The paper proposes a novel federated face forgery detection learning with personalized representation.
arXiv Detail & Related papers (2024-06-17T02:20:30Z)
- Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z)
- Free Lunch for Privacy Preserving Distributed Graph Learning [1.8292714902548342]
We present a novel privacy-respecting framework for distributed graph learning and graph-based machine learning.
This framework aims to learn features as well as distances without requiring actual features while preserving the original structural properties of the raw data.
arXiv Detail & Related papers (2023-05-18T10:41:21Z)
- Privacy-Preserving Machine Learning for Collaborative Data Sharing via Auto-encoder Latent Space Embeddings [57.45332961252628]
Privacy-preserving machine learning in data-sharing processes is an ever-critical task.
This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data.
arXiv Detail & Related papers (2022-11-10T17:36:58Z)
- Differentially Private Language Models for Secure Data Sharing [19.918137395199224]
In this paper, we show how to train a generative language model in a differentially private manner and consequently sample data from it.
Using natural language prompts and a new prompt-mismatch loss, we are able to create highly accurate and fluent textual datasets.
We perform thorough experiments indicating that our synthetic datasets do not leak information from our original data and are of high language quality.
arXiv Detail & Related papers (2022-10-25T11:12:56Z)
- SPEED: Secure, PrivatE, and Efficient Deep learning [2.283665431721732]
We introduce a deep learning framework able to deal with strong privacy constraints.
Based on collaborative learning, differential privacy, and homomorphic encryption, the proposed approach advances the state of the art.
arXiv Detail & Related papers (2020-06-16T19:31:52Z)
- Privacy Adversarial Network: Representation Learning for Mobile Data Privacy [33.75500773909694]
A growing number of cloud-based intelligent services for mobile users require user data to be sent to the provider.
Prior works either obfuscate the data, e.g. add noise and remove identity information, or send representations extracted from the data, e.g. anonymized features.
This work departs from prior works in methodology: we leverage adversarial learning to strike a better balance between privacy and utility.
arXiv Detail & Related papers (2020-06-08T09:42:04Z)
- Decentralised Learning from Independent Multi-Domain Labels for Person Re-Identification [69.29602103582782]
Deep learning has been successful for many computer vision tasks due to the availability of shared and centralised large-scale training data.
However, increasing awareness of privacy concerns poses new challenges to deep learning, especially for person re-identification (Re-ID).
We propose a novel paradigm called Federated Person Re-Identification (FedReID) to construct a generalisable global model (a central server) by simultaneously learning with multiple privacy-preserved local models (local clients).
This client-server collaborative learning process is iteratively performed under privacy control, enabling FedReID to realise decentralised learning without sharing distributed data or collecting any
arXiv Detail & Related papers (2020-06-07T13:32:33Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To reduce the size of the created dataset, we propose to apply a dataset distillation strategy to compress it into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.