LLM-as-a-Judge for Privacy Evaluation? Exploring the Alignment of Human and LLM Perceptions of Privacy in Textual Data
- URL: http://arxiv.org/abs/2508.12158v1
- Date: Sat, 16 Aug 2025 20:49:41 GMT
- Title: LLM-as-a-Judge for Privacy Evaluation? Exploring the Alignment of Human and LLM Perceptions of Privacy in Textual Data
- Authors: Stephen Meisenbacher, Alexandra Klymenko, Florian Matthes
- Abstract summary: Despite advances in the field of privacy-preserving Natural Language Processing (NLP), the accurate evaluation of privacy remains a challenge. Using LLMs as privacy evaluators presents a promising approach, a strategy inspired by the success of LLM-as-a-Judge in other subfields of NLP. Our findings pave the way for exploring the feasibility of LLMs as privacy evaluators.
- Score: 47.76073133338117
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite advances in the field of privacy-preserving Natural Language Processing (NLP), a significant challenge remains the accurate evaluation of privacy. As a potential solution, using LLMs as a privacy evaluator presents a promising approach – a strategy inspired by its success in other subfields of NLP. In particular, the so-called LLM-as-a-Judge paradigm has achieved impressive results on a variety of natural language evaluation tasks, demonstrating high agreement rates with human annotators. Recognizing that privacy is both subjective and difficult to define, we investigate whether LLM-as-a-Judge can also be leveraged to evaluate the privacy sensitivity of textual data. Furthermore, we measure how closely LLM evaluations align with human perceptions of privacy in text. Resulting from a study involving 10 datasets, 13 LLMs, and 677 human survey participants, we confirm that privacy is indeed a difficult concept to measure empirically, exhibited by generally low inter-human agreement rates. Nevertheless, we find that LLMs can accurately model a global human privacy perspective, and through an analysis of human and LLM reasoning patterns, we discuss the merits and limitations of LLM-as-a-Judge for privacy evaluation in textual data. Our findings pave the way for exploring the feasibility of LLMs as privacy evaluators, addressing a core challenge in solving pressing privacy issues with innovative technical solutions.
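The abstract describes prompting LLMs to judge the privacy sensitivity of text and then measuring how well those judgments align with human ratings. Below is a minimal Python sketch of what such an LLM-as-a-Judge setup could look like; the prompt wording, the `query_llm` stub, and the simple alignment metric are illustrative assumptions, not the authors' actual protocol.

```python
from statistics import mean

# Hypothetical judging prompt: ask the model for a 1-5 privacy sensitivity rating.
PROMPT = (
    "On a scale from 1 (not sensitive) to 5 (highly sensitive), rate the "
    "privacy sensitivity of the following text. Answer with a single digit.\n\n"
    "Text: {text}"
)

def query_llm(prompt: str) -> str:
    # Stand-in for a real LLM API call; swap in your provider's client here.
    # Returning the scale midpoint keeps the sketch runnable end to end.
    return "3"

def judge_privacy(text: str) -> int:
    """Ask the LLM to act as a privacy judge and return a 1-5 rating."""
    reply = query_llm(PROMPT.format(text=text))
    digits = [c for c in reply if c.isdigit()]
    return int(digits[0]) if digits else 3  # fall back to the scale midpoint

def alignment(llm_scores, human_scores):
    """Crude alignment score: 1 minus the normalized mean absolute difference."""
    diffs = [abs(l - h) for l, h in zip(llm_scores, human_scores)]
    return 1.0 - mean(diffs) / 4.0  # 4 = maximum distance on a 1-5 scale

if __name__ == "__main__":
    texts = [
        "My social security number is 123-45-6789.",
        "The weather in Munich was sunny today.",
    ]
    human = [5, 1]  # e.g., rounded averages of hypothetical survey ratings
    llm = [judge_privacy(t) for t in texts]
    print("LLM ratings:", llm, "alignment with humans:", alignment(llm, human))
```

In the paper itself, agreement is studied across 10 datasets, 13 LLMs, and 677 human participants with proper inter-annotator statistics; the single-number metric above only illustrates the shape of such a comparison.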
Related papers
- When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing [61.80513991207956]
This work focuses on the challenge of how to restore surrogate-driven protected data in diverse MLLM scenarios. We first bridge this research gap by contributing the SPPE (Surrogate Privacy Protected Editable) dataset. We introduce a unified approach that reliably reconstructs private content while preserving the fidelity of MLLM-generated edits.
arXiv Detail & Related papers (2025-12-08T04:59:03Z) - User Perceptions of Privacy and Helpfulness in LLM Responses to Privacy-Sensitive Scenarios [10.12906605142667]
We show how users perceive the privacy-preservation quality and helpfulness of large language model responses to privacy-sensitive scenarios. Our results suggest the need to conduct user-centered studies on measuring LLMs' ability to help users while preserving privacy.
arXiv Detail & Related papers (2025-10-23T16:38:26Z) - PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance [44.287734754038254]
We present PrivaCI-Bench, a contextual privacy evaluation benchmark for generative large language models (LLMs). We evaluate the latest LLMs, including the recent reasoner models QwQ-32B and Deepseek R1. Our experimental results suggest that though LLMs can effectively capture key CI parameters inside a given context, they still require further advancements for privacy compliance.
arXiv Detail & Related papers (2025-02-24T10:49:34Z) - Multi-P$^2$A: A Multi-perspective Benchmark on Privacy Assessment for Large Vision-Language Models [65.2761254581209]
Based on Multi-P$^2$A, we evaluate the privacy preservation capabilities of 21 open-source and 2 closed-source Large Vision-Language Models (LVLMs). Our results reveal that current LVLMs generally pose a high risk of facilitating privacy breaches.
arXiv Detail & Related papers (2024-12-27T07:33:39Z) - Investigating Privacy Bias in Training Data of Language Models [1.3167450470598043]
A privacy bias refers to the skew in the appropriateness of information flows within a given context. This skew may either align with existing expectations or signal a symptom of systemic issues. We present a novel approach to assess privacy biases using a contextual integrity-based methodology.
arXiv Detail & Related papers (2024-09-05T17:50:31Z) - PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z) - GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory [44.297102658873726]
Existing research studies privacy by exploring various privacy attacks, defenses, and evaluations within narrowly predefined patterns.
We introduce a novel framework, GoldCoin, designed to efficiently ground LLMs in privacy laws for the judicial assessment of privacy violations.
Our framework leverages the theory of contextual integrity as a bridge, creating numerous synthetic scenarios grounded in relevant privacy statutes.
arXiv Detail & Related papers (2024-06-17T02:27:32Z) - No Free Lunch Theorem for Privacy-Preserving LLM Inference [30.554456047738295]
This study develops a framework for privacy-preserving inference with Large Language Models (LLMs). It lays down a solid theoretical basis for examining the interplay between privacy preservation and utility.
arXiv Detail & Related papers (2024-05-31T08:22:53Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)