Related papers: Making Translators Privacy-aware on the User's Side

Making Translators Privacy-aware on the User's Side

URL: http://arxiv.org/abs/2312.04068v1
Date: Thu, 7 Dec 2023 06:23:17 GMT
Title: Making Translators Privacy-aware on the User's Side
Authors: Ryoma Sato
Abstract summary: We propose PRISM to enable users of machine translation systems to preserve the privacy of data on their own initiative. PRISM adds privacy features without significantly compromising translation accuracy.
Score: 17.912507269030577
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose PRISM to enable users of machine translation systems to preserve the privacy of data on their own initiative. There is a growing demand to apply machine translation systems to data that require privacy protection. While several machine translation engines claim to prioritize privacy, the extent and specifics of such protection are largely ambiguous. First, there is often a lack of clarity on how and to what degree the data is protected. Even if service providers believe they have sufficient safeguards in place, sophisticated adversaries might still extract sensitive information. Second, vulnerabilities may exist outside of these protective measures, such as within communication channels, potentially leading to data leakage. As a result, users are hesitant to utilize machine translation engines for data demanding high levels of privacy protection, thereby missing out on their benefits. PRISM resolves this problem. Instead of relying on the translation service to keep data safe, PRISM provides the means to protect data on the user's side. This approach ensures that even machine translation engines with inadequate privacy measures can be used securely. For platforms already equipped with privacy safeguards, PRISM acts as an additional protection layer, reinforcing their security furthermore. PRISM adds these privacy features without significantly compromising translation accuracy. Our experiments demonstrate the effectiveness of PRISM using real-world translators, T5 and ChatGPT (GPT-3.5-turbo), and the datasets with two languages. PRISM effectively balances privacy protection with translation accuracy.

Related papers

Token-Level Privacy in Large Language Models [7.4143291213663955]
We introduce dchi-stencil, a novel token-level privacy-preserving mechanism that integrates contextual and semantic information. By incorporating both semantic and contextual nuances, dchi-stencil achieves a robust balance between privacy and utility. This work highlights the potential of dchi-stencil to set a new standard for privacy-preserving NLP in modern, high-risk applications.
arXiv Detail & Related papers (2025-03-05T16:27:25Z)
Activity Recognition on Avatar-Anonymized Datasets with Masked Differential Privacy [64.32494202656801]
Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence. We present anonymization pipeline that replaces sensitive human subjects in video datasets with synthetic avatars within context. We also proposeMaskDP to protect non-anonymized but privacy sensitive background information.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
Privacy-Preserving Data Management using Blockchains [0.0]
Data providers need to control and update existing privacy preferences due to changing data usage. This paper proposes a blockchain-based methodology for preserving data providers private and sensitive data.
arXiv Detail & Related papers (2024-08-21T01:10:39Z)
Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit. We study user-level DP motivated by applications where it necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human [56.46355425175232]
We suggest sanitizing sensitive text using two common strategies used by humans.<n>We curate the first corpus, coined NAP2, through both crowdsourcing and the use of large language models.<n>Compared to the prior works on anonymization, the human-inspired approaches result in more natural rewrites.
arXiv Detail & Related papers (2024-06-06T05:07:44Z)
PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration [18.11846784025521]
PrivacyRestore is a plug-and-play method to protect the privacy of user inputs during inference. We create three datasets, covering medical and legal domains, to evaluate the effectiveness of PrivacyRestore.
arXiv Detail & Related papers (2024-06-03T14:57:39Z)
InferDPT: Privacy-Preserving Inference for Black-box Large Language Model [66.07752875835506]
InferDPT is the first practical framework for the privacy-preserving Inference of black-box LLMs. RANTEXT is a novel differential privacy mechanism integrated into the perturbation module of InferDPT.
arXiv Detail & Related papers (2023-10-18T18:00:11Z)
PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind) Our work offers a theoretical analysis for model design and benchmarks various techniques. In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
Can Language Models be Instructed to Protect Personal Information? [30.187731765653428]
We introduce PrivQA -- a benchmark to assess the privacy/utility trade-off when a model is instructed to protect specific categories of personal information in a simulated scenario. We find that adversaries can easily circumvent these protections with simple jailbreaking methods through textual and/or image inputs. We believe PrivQA has the potential to support the development of new models with improved privacy protections, as well as the adversarial robustness of these protections.
arXiv Detail & Related papers (2023-10-03T17:30:33Z)
Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving. We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy. We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z)
Privacy Explanations - A Means to End-User Trust [64.7066037969487]
We looked into how explainability might help to tackle this problem. We created privacy explanations that aim to help to clarify to end users why and for what purposes specific data is required. Our findings reveal that privacy explanations can be an important step towards increasing trust in software systems.
arXiv Detail & Related papers (2022-10-18T09:30:37Z)
An Example of Privacy and Data Protection Best Practices for Biometrics Data Processing in Border Control: Lesson Learned from SMILE [0.9442139459221784]
Misuse of data, compromising the privacy of individuals and/or authorized processing of data may be irreversible. This is partly due to the lack of methods and guidance for the integration of data protection and privacy by design in the system development process. We present an example of privacy and data protection best practices to provide more guidance for data controllers and developers.
arXiv Detail & Related papers (2022-01-10T15:34:43Z)
Privacy in Open Search: A Review of Challenges and Solutions [0.6445605125467572]
Information retrieval (IR) is prone to privacy threats, such as attacks and unintended disclosures of documents and search history. This work aims at highlighting and discussing open challenges for privacy in the recent literature of IR, focusing on tasks featuring user-generated text data.
arXiv Detail & Related papers (2021-10-20T18:38:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.