Privacy-Preserving Online Content Moderation: A Federated Learning Use
Case
- URL: http://arxiv.org/abs/2209.11843v1
- Date: Fri, 23 Sep 2022 20:12:18 GMT
- Title: Privacy-Preserving Online Content Moderation: A Federated Learning Use
Case
- Authors: Pantelitsa Leonidou, Nicolas Kourtellis, Nikos Salamanos, Michael
Sirivianos
- Abstract summary: Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices.
We propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP).
We show that the proposed FL framework can be close to the centralized approach - for both the DP and non-DP FL versions.
- Score: 3.1925030748447747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Users are daily exposed to a large volume of harmful content on various
social network platforms. One solution is developing online moderation tools
using Machine Learning techniques. However, the processing of user data by
online platforms requires compliance with privacy policies. Federated Learning
(FL) is an ML paradigm where the training is performed locally on the users'
devices. Although the FL framework complies, in theory, with the GDPR policies,
privacy leaks can still occur. For instance, an attacker accessing the final
trained model can successfully perform unwanted inference of the data belonging
to the users who participated in the training process. In this paper, we
propose a privacy-preserving FL framework for online content moderation that
incorporates Differential Privacy (DP). To demonstrate the feasibility of our
approach, we focus on detecting harmful content on Twitter - but the overall
concept can be generalized to other types of misbehavior. We simulate a text
classifier - in FL fashion - which can detect tweets with harmful content. We
show that the performance of the proposed FL framework can be close to the
centralized approach - for both the DP and non-DP FL versions. Moreover, it has
a high performance even if a small number of clients (each with a small number
of data points) are available for the FL training. When reducing the number of
clients (from 50 to 10) or the data points per client (from 1K to 0.1K), the
classifier can still achieve ~81% AUC. Furthermore, we extend the evaluation to
four other Twitter datasets that capture different types of user misbehavior
and still obtain a promising performance (61% - 80% AUC). Finally, we explore
the overhead on the users' devices during the FL training phase and show that
the local training does not introduce excessive CPU utilization and memory
consumption overhead.
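The abstract does not spell out the exact DP mechanism, but a common way to realize DP in federated training is FedAvg-style aggregation with per-client update clipping and Gaussian noise. The sketch below simulates one such setup in plain NumPy under that assumption; the function names (local_train, dp_fedavg_round), the parameters (clip_norm, noise_multiplier), and the toy linear "classifier" are illustrative stand-ins, not the paper's implementation.

```python
# Minimal sketch of DP federated averaging (an assumed mechanism, not the
# paper's exact implementation). The "model" is a flat weight vector and
# local_train is a stand-in for on-device training of the tweet classifier.
import numpy as np

rng = np.random.default_rng(0)

def local_train(global_weights, client_data, lr=0.1):
    # Hypothetical local step: one least-squares gradient update on the
    # client's private data; the real system would train a text classifier.
    X, y = client_data
    grad = X.T @ (X @ global_weights - y) / len(y)
    return global_weights - lr * grad

def dp_fedavg_round(global_weights, clients, clip_norm=1.0, noise_multiplier=1.0):
    # One round: each client computes an update locally, the server clips it
    # to bound per-client influence, averages the updates, and adds Gaussian
    # noise calibrated to the clipping norm (this yields the DP guarantee).
    deltas = []
    for data in clients:
        delta = local_train(global_weights, data) - global_weights
        norm = np.linalg.norm(delta)
        deltas.append(delta * min(1.0, clip_norm / (norm + 1e-12)))
    avg_delta = np.mean(deltas, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clients),
                       size=global_weights.shape)
    return global_weights + avg_delta + noise

# Toy usage mirroring the paper's small-scale setting: 10 clients with
# 100 synthetic samples each, 20 features, 30 federated rounds.
dim, n_clients, n_points = 20, 10, 100
true_w = rng.normal(size=dim)
clients = []
for _ in range(n_clients):
    X = rng.normal(size=(n_points, dim))
    clients.append((X, X @ true_w + rng.normal(scale=0.1, size=n_points)))

w = np.zeros(dim)
for _ in range(30):
    w = dp_fedavg_round(w, clients)
print("distance to true weights:", round(float(np.linalg.norm(w - true_w)), 3))
```

A larger noise_multiplier gives a stronger privacy guarantee at the cost of utility, which is the trade-off explored when comparing the DP and non-DP FL versions against the centralized baseline.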
Related papers
- Efficient Federated Unlearning under Plausible Deniability [1.795561427808824]
Machine unlearning addresses this by modifying the ML model's parameters so as to forget the influence of a specific data point on its weights.
Recent literature has highlighted that the contribution from data point(s) can be forged with some other data points in the dataset with probability close to one.
This paper introduces an efficient way to achieve federated unlearning, by employing a privacy model which allows the FL server to plausibly deny the client's participation.
arXiv Detail & Related papers (2024-10-13T18:08:24Z)
- SoK: Challenges and Opportunities in Federated Unlearning [32.0365189539138]
This SoK paper aims to take a deep look at the federated unlearning literature, with the goal of identifying research trends and challenges in this emerging field.
arXiv Detail & Related papers (2024-03-04T19:35:08Z)
- Federated Unlearning for Human Activity Recognition [11.287645073129108]
We propose a lightweight machine unlearning method for refining the FL HAR model by selectively removing a portion of a client's training data.
Our method achieves unlearning accuracy comparable to retraining methods, resulting in speedups ranging from hundreds to thousands.
arXiv Detail & Related papers (2024-01-17T15:51:36Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- Federated Learning with Noisy User Feedback [26.798303045807508]
Federated learning (FL) has emerged as a method for training ML models on edge devices using sensitive user data.
We propose a strategy for training FL models using positive and negative user feedback.
We show that our method improves substantially over a self-training baseline, achieving performance closer to models trained with full supervision.
arXiv Detail & Related papers (2022-05-06T09:14:24Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Do Gradient Inversion Attacks Make Federated Learning Unsafe? [70.0231254112197]
Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.
Recent works on the inversion of deep neural networks from model gradients raised concerns about the security of FL in preventing the leakage of training data.
In this work, we show that the attacks presented in the literature are impractical in real FL use cases and provide a new baseline attack.
arXiv Detail & Related papers (2022-02-14T18:33:12Z)
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings [56.93025161787725]
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for SER systems trained using FL.
arXiv Detail & Related papers (2021-12-26T16:50:42Z)
- Federated Robustness Propagation: Sharing Adversarial Robustness in Federated Learning [98.05061014090913]
Federated learning (FL) emerges as a popular distributed learning schema that learns from a set of participating users without requiring raw data to be shared.
While adversarial training (AT) provides a sound solution for centralized learning, extending its usage to FL users has imposed significant challenges.
We show that existing FL techniques cannot effectively propagate adversarial robustness among non-iid users.
We propose a simple yet effective propagation approach that transfers robustness through carefully designed batch-normalization statistics.
arXiv Detail & Related papers (2021-06-18T15:52:33Z)
- WAFFLe: Weight Anonymized Factorization for Federated Learning [88.44939168851721]
In domains where data are sensitive or private, there is great value in methods that can learn in a distributed manner without the data ever leaving the local devices.
We propose Weight Anonymized Factorization for Federated Learning (WAFFLe), an approach that combines the Indian Buffet Process with a shared dictionary of weight factors for neural networks.
arXiv Detail & Related papers (2020-08-13T04:26:31Z)