Predictive Response Optimization: Using Reinforcement Learning to Fight Online Social Network Abuse
- URL: http://arxiv.org/abs/2502.17693v1
- Date: Mon, 24 Feb 2025 22:30:14 GMT
- Title: Predictive Response Optimization: Using Reinforcement Learning to Fight Online Social Network Abuse
- Authors: Garrett Wilson, Geoffrey Goh, Yan Jiang, Ajay Gupta, Jiaxuan Wang, David Freeman, Francesco Dinuzzo
- Abstract summary: We argue that detection as described in previous work is not the goal of those who are fighting OSN abuse. Rather, we believe the goal to be selecting actions that optimize a tradeoff between harm caused by abuse and impact on benign users.
- Score: 8.156427899556252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting phishing, spam, fake accounts, data scraping, and other malicious activity in online social networks (OSNs) is a problem that has been studied for well over a decade, with a number of important results. Nearly all existing works on abuse detection have as their goal producing the best possible binary classifier; i.e., one that labels unseen examples as "benign" or "malicious" with high precision and recall. However, no prior published work considers what comes next: what does the service actually do after it detects abuse? In this paper, we argue that detection as described in previous work is not the goal of those who are fighting OSN abuse. Rather, we believe the goal to be selecting actions (e.g., ban the user, block the request, show a CAPTCHA, or "collect more evidence") that optimize a tradeoff between harm caused by abuse and impact on benign users. With this framing, we see that enlarging the set of possible actions allows us to move the Pareto frontier in a way that is unattainable by simply tuning the threshold of a binary classifier. To demonstrate the potential of our approach, we present Predictive Response Optimization (PRO), a system based on reinforcement learning that utilizes available contextual information to predict future abuse and user-experience metrics conditioned on each possible action, and select actions that optimize a multi-dimensional tradeoff between abuse/harm and impact on user experience. We deployed versions of PRO targeted at stopping automated activity on Instagram and Facebook. In both cases our experiments showed that PRO outperforms a baseline classification system, reducing abuse volume by 59% and 4.5% (respectively) with no negative impact to users. We also present several case studies that demonstrate how PRO can quickly and automatically adapt to changes in business constraints, system behavior, and/or adversarial tactics.
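The action-selection idea in the abstract can be illustrated with a minimal sketch. This is not the PRO implementation: the `Prediction` type, the `select_action` helper, the `ux_weight` parameter, and all numbers below are hypothetical, and a simple weighted sum stands in for the multi-dimensional tradeoff the paper describes. The sketch only shows how predicting outcomes per action and then choosing the best one lets an intermediate action such as a CAPTCHA win where an allow/block choice could not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical predicted outcomes if a given action is taken."""
    abuse: float    # predicted future abuse (lower is better)
    ux_cost: float  # predicted negative impact on user experience (lower is better)


def select_action(predictions, ux_weight):
    """Return the action with the lowest weighted sum of predicted abuse and UX cost.

    The weighted sum is an assumed stand-in for the tradeoff; `ux_weight`
    expresses how much one unit of UX cost matters relative to one unit of abuse.
    """
    return min(
        predictions,
        key=lambda action: predictions[action].abuse
        + ux_weight * predictions[action].ux_cost,
    )


# Illustrative numbers only; in a real system these would come from learned models.
predictions = {
    "allow": Prediction(abuse=0.80, ux_cost=0.00),
    "captcha": Prediction(abuse=0.30, ux_cost=0.20),
    "block": Prediction(abuse=0.05, ux_cost=0.90),
}

# A binary classifier can only choose between allowing and blocking.
binary_only = {name: predictions[name] for name in ("allow", "block")}

print(select_action(binary_only, ux_weight=1.0))
print(select_action(predictions, ux_weight=1.0))
```

With these assumed numbers, the binary action set settles on one of its two extremes, while the enlarged set can pick the CAPTCHA, which has a lower combined cost than either. Changing `ux_weight` changes the preferred action without retraining anything, which loosely mirrors the abstract's point about adapting to changed business constraints.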
Related papers
- PARIS: A Practical, Adaptive Trace-Fetching and Real-Time Malicious Behavior Detection System [6.068607290592521]
We propose an adaptive trace-fetching, lightweight, real-time malicious behavior detection system.
Specifically, we monitor malicious behavior with Event Tracing for Windows (ETW) and learn to selectively collect maliciousness-related APIs or call stacks.
As a result, we can monitor a wider range of APIs and detect more intricate attack behavior.
arXiv Detail & Related papers (2024-11-02T14:52:04Z)
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
- Beyond Trial-and-Error: Predicting User Abandonment After a Moderation Intervention [0.6918368994425961]
We propose and tackle the novel task of predicting the effect of a moderation intervention on Reddit.
We use a dataset of 13.8M posts to compute a set of 142 features, which convey information about the activity, toxicity, relations, and writing style of the users.
Our results demonstrate the feasibility of predicting the effects of a moderation intervention, paving the way for a new research direction in predictive content moderation.
arXiv Detail & Related papers (2024-04-23T08:52:41Z)
- Using Motion Forecasting for Behavior-Based Virtual Reality (VR) Authentication [8.552737863305213]
We present the first approach that predicts future user behavior using Transformer-based forecasting and uses the forecasted trajectory to perform user authentication.
Our approach reduces the authentication equal error rate (EER) by 23.85% on average and by up to 36.14%.
arXiv Detail & Related papers (2024-01-30T00:43:41Z)
- SeGA: Preference-Aware Self-Contrastive Learning with Prompts for Anomalous User Detection on Twitter [14.483830120541894]
We propose SeGA, preference-aware self-contrastive learning for anomalous user detection.
SeGA uses large language models to summarize user preferences via posts.
We empirically validate the effectiveness of the model design and pre-training strategies.
arXiv Detail & Related papers (2023-12-17T05:35:28Z)
- Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existent social network, along with graph-based propagation to capture social dynamics.
Our method surpasses existing state-of-the-art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z)
- Online Corrupted User Detection and Regret Minimization [49.536254494829436]
In real-world online web systems, multiple users usually arrive sequentially into the system.
We present an important online learning problem named LOCUD to learn and utilize unknown user relations from disrupted behaviors.
We devise a novel online detection algorithm OCCUD based on RCLUB-WCU's inferred user relations.
arXiv Detail & Related papers (2023-10-07T10:20:26Z)
- User-Centered Security in Natural Language Processing [0.7106986689736825]
This dissertation proposes a framework of user-centered security in Natural Language Processing (NLP).
It focuses on two security domains within NLP with great public interest.
arXiv Detail & Related papers (2023-01-10T22:34:19Z)
- Evaluating Machine Unlearning via Epistemic Uncertainty [78.27542864367821]
This work presents an evaluation of Machine Unlearning algorithms based on uncertainty.
To the best of our knowledge, this is the first definition of a general evaluation.
arXiv Detail & Related papers (2022-08-23T09:37:31Z)
- Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction [97.99938802797377]
Click-through rate (CTR) prediction, whose goal is to predict the probability of the user to click on an item, has become increasingly significant in recommender systems.
Recent deep learning models with the ability to automatically extract the user interest from his/her behaviors have achieved great success.
We propose a novel approach under the framework of the wrapper method, which is named Meta-Wrapper.
arXiv Detail & Related papers (2022-06-28T03:28:15Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the scores of the victim model.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.