Privacy-Preserving Online Content Moderation: A Federated Learning Use
Case
- URL: http://arxiv.org/abs/2209.11843v1
- Date: Fri, 23 Sep 2022 20:12:18 GMT
- Title: Privacy-Preserving Online Content Moderation: A Federated Learning Use
Case
- Authors: Pantelitsa Leonidou, Nicolas Kourtellis, Nikos Salamanos, Michael
Sirivianos
- Abstract summary: Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices.
We propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP).
We show that the proposed FL framework can be close to the centralized approach - for both the DP and non-DP FL versions.
- Score: 3.1925030748447747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Users are daily exposed to a large volume of harmful content on various
social network platforms. One solution is developing online moderation tools
using Machine Learning techniques. However, the processing of user data by
online platforms requires compliance with privacy policies. Federated Learning
(FL) is an ML paradigm where the training is performed locally on the users'
devices. Although the FL framework complies, in theory, with the GDPR policies,
privacy leaks can still occur. For instance, an attacker accessing the final
trained model can successfully perform unwanted inference of the data belonging
to the users who participated in the training process. In this paper, we
propose a privacy-preserving FL framework for online content moderation that
incorporates Differential Privacy (DP). To demonstrate the feasibility of our
approach, we focus on detecting harmful content on Twitter - but the overall
concept can be generalized to other types of misbehavior. We simulate a text
classifier - in FL fashion - which can detect tweets with harmful content. We
show that the performance of the proposed FL framework can be close to the
centralized approach - for both the DP and non-DP FL versions. Moreover, it has
a high performance even if a small number of clients (each with a small number
of data points) are available for the FL training. When reducing the number of
clients (from 50 to 10) or the data points per client (from 1K to 0.1K), the
classifier can still achieve ~81% AUC. Furthermore, we extend the evaluation to
four other Twitter datasets that capture different types of user misbehavior
and still obtain a promising performance (61% - 80% AUC). Finally, we explore
the overhead on the users' devices during the FL training phase and show that
the local training does not introduce excessive CPU utilization and memory
consumption overhead.
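The abstract does not spell out the exact DP mechanism, but a common way to realize DP in federated training is FedAvg-style aggregation with per-client update clipping and Gaussian noise. The sketch below simulates one such setup in plain NumPy under that assumption; the function names (local_train, dp_fedavg_round), the parameters (clip_norm, noise_multiplier), and the toy linear "classifier" are illustrative stand-ins, not the paper's implementation.

```python
# Minimal sketch of DP federated averaging (an assumed mechanism, not the
# paper's exact implementation). The "model" is a flat weight vector and
# local_train is a stand-in for on-device training of the tweet classifier.
import numpy as np

rng = np.random.default_rng(0)

def local_train(global_weights, client_data, lr=0.1):
    # Hypothetical local step: one least-squares gradient update on the
    # client's private data; the real system would train a text classifier.
    X, y = client_data
    grad = X.T @ (X @ global_weights - y) / len(y)
    return global_weights - lr * grad

def dp_fedavg_round(global_weights, clients, clip_norm=1.0, noise_multiplier=1.0):
    # One round: each client computes an update locally, the server clips it
    # to bound per-client influence, averages the updates, and adds Gaussian
    # noise calibrated to the clipping norm (this yields the DP guarantee).
    deltas = []
    for data in clients:
        delta = local_train(global_weights, data) - global_weights
        norm = np.linalg.norm(delta)
        deltas.append(delta * min(1.0, clip_norm / (norm + 1e-12)))
    avg_delta = np.mean(deltas, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clients),
                       size=global_weights.shape)
    return global_weights + avg_delta + noise

# Toy usage mirroring the paper's small-scale setting: 10 clients with
# 100 synthetic samples each, 20 features, 30 federated rounds.
dim, n_clients, n_points = 20, 10, 100
true_w = rng.normal(size=dim)
clients = []
for _ in range(n_clients):
    X = rng.normal(size=(n_points, dim))
    clients.append((X, X @ true_w + rng.normal(scale=0.1, size=n_points)))

w = np.zeros(dim)
for _ in range(30):
    w = dp_fedavg_round(w, clients)
print("distance to true weights:", round(float(np.linalg.norm(w - true_w)), 3))
```

A larger noise_multiplier gives a stronger privacy guarantee at the cost of utility, which is the trade-off explored when comparing the DP and non-DP FL versions against the centralized baseline.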
Related papers
- Efficient Federated Unlearning under Plausible Deniability [1.795561427808824]
Machine unlearning addresses this by modifying the ML model's parameters so as to forget the influence of a specific data point on its weights.
Recent literature has highlighted that the contribution from data point(s) can be forged with some other data points in the dataset with probability close to one.
This paper introduces an efficient way to achieve federated unlearning, by employing a privacy model which allows the FL server to plausibly deny the client's participation.
arXiv Detail & Related papers (2024-10-13T18:08:24Z)
- SoK: Challenges and Opportunities in Federated Unlearning [32.0365189539138]
This SoK paper aims to take a deep look at the federated unlearning literature, with the goal of identifying research trends and challenges in this emerging field.
arXiv Detail & Related papers (2024-03-04T19:35:08Z)
- Federated Unlearning for Human Activity Recognition [11.287645073129108]
We propose a lightweight machine unlearning method for refining the FL HAR model by selectively removing a portion of a client's training data.
Our method achieves unlearning accuracy comparable to retraining methods, resulting in speedups ranging from hundreds to thousands.
arXiv Detail & Related papers (2024-01-17T15:51:36Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- Federated Learning with Noisy User Feedback [26.798303045807508]
Federated learning (FL) has emerged as a method for training ML models on edge devices using sensitive user data.
We propose a strategy for training FL models using positive and negative user feedback.
We show that our method improves substantially over a self-training baseline, achieving performance closer to models trained with full supervision.
arXiv Detail & Related papers (2022-05-06T09:14:24Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Do Gradient Inversion Attacks Make Federated Learning Unsafe? [70.0231254112197]
Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.
Recent works on the inversion of deep neural networks from model gradients raised concerns about the security of FL in preventing the leakage of training data.
In this work, we show that the attacks presented in the literature are impractical in real FL use cases and provide a new baseline attack.
arXiv Detail & Related papers (2022-02-14T18:33:12Z)
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings [56.93025161787725]
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for SER systems trained using FL.
arXiv Detail & Related papers (2021-12-26T16:50:42Z)
- Federated Robustness Propagation: Sharing Adversarial Robustness in Federated Learning [98.05061014090913]
Federated learning (FL) emerges as a popular distributed learning schema that learns from a set of participating users without requiring raw data to be shared.
While adversarial training (AT) provides a sound solution for centralized learning, extending its usage to FL users has imposed significant challenges.
We show that existing FL techniques cannot effectively propagate adversarial robustness among non-iid users.
We propose a simple yet effective propagation approach that transfers robustness through carefully designed batch-normalization statistics.
arXiv Detail & Related papers (2021-06-18T15:52:33Z)
- WAFFLe: Weight Anonymized Factorization for Federated Learning [88.44939168851721]
In domains where data are sensitive or private, there is great value in methods that can learn in a distributed manner without the data ever leaving the local devices.
We propose Weight Anonymized Factorization for Federated Learning (WAFFLe), an approach that combines the Indian Buffet Process with a shared dictionary of weight factors for neural networks.
arXiv Detail & Related papers (2020-08-13T04:26:31Z)