One Size Fits All? A Modular Adaptive Sanitization Kit (MASK) for Customizable Privacy-Preserving Phone Scam Detection
- URL: http://arxiv.org/abs/2510.18493v1
- Date: Tue, 21 Oct 2025 10:30:36 GMT
- Title: One Size Fits All? A Modular Adaptive Sanitization Kit (MASK) for Customizable Privacy-Preserving Phone Scam Detection
- Authors: Kangzhong Wang, Zitong Shen, Youqian Zhang, Michael MK Cheung, Xiapu Luo, Grace Ngai, Eugene Yujun Fu,
- Abstract summary: Phone scams remain a pervasive threat to both personal safety and financial security worldwide.<n>Recent advances in large language models (LLMs) have demonstrated strong potential in detecting fraudulent behavior by analyzing transcribed phone conversations.<n>LLMs introduce notable privacy risks, as such conversations frequently contain sensitive personal information that may be exposed to third-party service providers during processing.<n>We propose MASK, a trainable and framework that enables dynamic privacy adjustment based on individual preferences.
- Score: 32.85893058597231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Phone scams remain a pervasive threat to both personal safety and financial security worldwide. Recent advances in large language models (LLMs) have demonstrated strong potential in detecting fraudulent behavior by analyzing transcribed phone conversations. However, these capabilities introduce notable privacy risks, as such conversations frequently contain sensitive personal information that may be exposed to third-party service providers during processing. In this work, we explore how to harness LLMs for phone scam detection while preserving user privacy. We propose MASK (Modular Adaptive Sanitization Kit), a trainable and extensible framework that enables dynamic privacy adjustment based on individual preferences. MASK provides a pluggable architecture that accommodates diverse sanitization methods - from traditional keyword-based techniques for high-privacy users to sophisticated neural approaches for those prioritizing accuracy. We also discuss potential modeling approaches and loss function designs for future development, enabling the creation of truly personalized, privacy-aware LLM-based detection systems that balance user trust and detection effectiveness, even beyond phone scam context.
Related papers
- VoxPrivacy: A Benchmark for Evaluating Interactional Privacy of Speech Language Models [25.266028200777317]
Speech Language Models (SLMs) are expected to distinguish between users to manage information flow appropriately.<n>Current SLM benchmarks test dialogue ability but overlook speaker identity.<n>We introduce VoxPrivacy, the first benchmark designed to evaluate interactional privacy in SLMs.
arXiv Detail & Related papers (2026-01-27T06:22:14Z) - Evaluating Language Model Reasoning about Confidential Information [95.64687778185703]
We study whether language models exhibit contextual robustness, or the capability to adhere to context-dependent safety specifications.<n>We develop a benchmark (PasswordEval) that measures whether language models can correctly determine when a user request is authorized.<n>We find that current open- and closed-source models struggle with this seemingly simple task, and that, perhaps surprisingly, reasoning capabilities do not generally improve performance.
arXiv Detail & Related papers (2025-08-27T15:39:46Z) - Guarding Your Conversations: Privacy Gatekeepers for Secure Interactions with Cloud-Based AI Models [0.34998703934432673]
We propose the concept of an "LLM gatekeeper", a lightweight, locally run model that filters out sensitive information from user queries before they are sent to the potentially untrustworthy, though highly capable, cloud-based LLM.<n>Through experiments with human subjects, we demonstrate that this dual-model approach introduces minimal overhead while significantly enhancing user privacy, without compromising the quality of LLM responses.
arXiv Detail & Related papers (2025-08-22T19:49:03Z) - A Framework to Prevent Biometric Data Leakage in the Immersive Technologies Domain [11.378592130605929]
The psychography data (such as voice command features, facial dynamics, etc.) is sensitive data and it should not be leaked out of the device without users consent.<n>We develop a simple technical framework to mitigate sensitive data (or biometric data) privacy leaks in immersive technology domain.
arXiv Detail & Related papers (2025-05-07T04:35:32Z) - Communication-Efficient and Privacy-Adaptable Mechanism for Federated Learning [54.20871516148981]
We introduce the Communication-Efficient and Privacy-Adaptable Mechanism (CEPAM)<n>CEPAM achieves communication efficiency and privacy protection simultaneously.<n>We theoretically analyze the privacy guarantee of CEPAM and investigate the trade-offs among user privacy and accuracy of CEPAM.
arXiv Detail & Related papers (2025-01-21T11:16:05Z) - PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.<n>We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.<n>State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z) - Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions [12.451936012379319]
Large Language Models (LLMs) represent a significant advancement in artificial intelligence, finding applications across various domains.<n>Their reliance on massive internet-sourced datasets for training brings notable privacy issues.<n>Certain application-specific scenarios may require fine-tuning these models on private data.
arXiv Detail & Related papers (2024-08-10T05:41:19Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP motivated by applications where it necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - Can Language Models be Instructed to Protect Personal Information? [30.187731765653428]
We introduce PrivQA -- a benchmark to assess the privacy/utility trade-off when a model is instructed to protect specific categories of personal information in a simulated scenario.
We find that adversaries can easily circumvent these protections with simple jailbreaking methods through textual and/or image inputs.
We believe PrivQA has the potential to support the development of new models with improved privacy protections, as well as the adversarial robustness of these protections.
arXiv Detail & Related papers (2023-10-03T17:30:33Z) - OPOM: Customized Invisible Cloak towards Face Privacy Protection [58.07786010689529]
We investigate the face privacy protection from a technology standpoint based on a new type of customized cloak.
We propose a new method, named one person one mask (OPOM), to generate person-specific (class-wise) universal masks.
The effectiveness of the proposed method is evaluated on both common and celebrity datasets.
arXiv Detail & Related papers (2022-05-24T11:29:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.