On the Privacy-Utility Tradeoff in Peer-Review Data Analysis
- URL: http://arxiv.org/abs/2006.16385v1
- Date: Mon, 29 Jun 2020 21:08:21 GMT
- Title: On the Privacy-Utility Tradeoff in Peer-Review Data Analysis
- Authors: Wenxin Ding, Nihar B. Shah, Weina Wang
- Abstract summary: A major impediment to research on improving peer review is the unavailability of peer-review data.
We propose a framework for privacy-preserving release of certain conference peer-review data.
- Score: 34.0435377376779
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A major impediment to research on improving peer review is the unavailability
of peer-review data, since any release of such data must grapple with the
sensitivity of the peer review data in terms of protecting identities of
reviewers from authors. We posit the need to develop techniques to release
peer-review data in a privacy-preserving manner. To address this problem, in
this paper we propose a framework for privacy-preserving release of certain
conference peer-review data -- distributions of ratings, miscalibration, and
subjectivity -- with an emphasis on the accuracy (or utility) of the released
data. The crux of the framework lies in recognizing that a part of the data
pertaining to the reviews is already available in public, and we use this
information to post-process the data released by any privacy mechanism in a
manner that improves the accuracy (utility) of the data while retaining the
privacy guarantees. Our framework works with any privacy-preserving mechanism
that operates via releasing perturbed data. We present several positive and
negative theoretical results, including a polynomial-time algorithm for
improving on the privacy-utility tradeoff.
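The key idea — post-processing a privacy mechanism's noisy output with publicly available information — can be illustrated with a minimal sketch. Since post-processing the output of a differentially private mechanism cannot weaken its guarantee, publicly known facts can be used to clean up the release. The Laplace mechanism, the sensitivity value, and the assumption that the total number of reviews is public are all illustrative choices here, not the authors' actual algorithm:

```python
import numpy as np

def release_histogram(hist, total, epsilon, rng=None):
    """Release a ratings histogram under epsilon-DP (Laplace mechanism),
    then post-process using the publicly known review count `total`.
    Post-processing preserves the differential privacy guarantee."""
    rng = np.random.default_rng() if rng is None else rng
    # Changing one review moves it between two bins: L1 sensitivity 2.
    noisy = hist + rng.laplace(scale=2.0 / epsilon, size=len(hist))
    # Public knowledge: counts are nonnegative and sum to `total`.
    projected = np.clip(noisy, 0.0, None)
    if projected.sum() > 0:
        projected = projected * (total / projected.sum())
    return projected

# Hypothetical counts of 1-5 star ratings; the true total is assumed public.
hist = np.array([5.0, 12.0, 30.0, 18.0, 7.0])
released = release_histogram(hist, total=hist.sum(), epsilon=1.0)
```

The projection step is one simple instance of utility-improving post-processing; the paper's framework applies to any mechanism that releases perturbed data.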
Related papers
- Synthetic Data: Revisiting the Privacy-Utility Trade-off [4.832355454351479]
An article stated that synthetic data does not provide a better trade-off between privacy and utility than traditional anonymization techniques.
The article also claims to have identified a breach in the differential privacy guarantees provided by PATEGAN and PrivBayes.
We analyzed the implementation of the privacy game described in the article and found that it operated in a highly specialized and constrained environment.
arXiv Detail & Related papers (2024-07-09T14:48:43Z)
- A Summary of Privacy-Preserving Data Publishing in the Local Setting [0.6749750044497732]
Statistical Disclosure Control aims to minimize the risk of exposing confidential information by de-identifying it.
We outline the current privacy-preserving techniques employed in microdata de-identification, delve into privacy measures tailored for various disclosure scenarios, and assess metrics for information loss and predictive performance.
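As a concrete instance of a local-setting mechanism of the kind surveyed above, classic randomized response releases a single private bit under epsilon-local differential privacy. This is a textbook example, not an algorithm from that paper:

```python
import math
import random

def randomized_response(bit, epsilon, rng=random):
    """Epsilon-local-DP release of one private bit (bit in {0, 1}).
    With probability e^eps / (e^eps + 1) report the truth; otherwise flip.
    Standard randomized response, shown for illustration only."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit
```

Smaller epsilon means more flipping and hence stronger local privacy but noisier aggregates — the same privacy-utility tension the survey examines.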
arXiv Detail & Related papers (2023-12-19T04:23:23Z)
- $\alpha$-Mutual Information: A Tunable Privacy Measure for Privacy Protection in Data Sharing [4.475091558538915]
This paper adopts Arimoto's $\alpha$-Mutual Information as a tunable privacy measure.
We formulate a general distortion-based mechanism that manipulates the original data to offer privacy protection.
arXiv Detail & Related papers (2023-10-27T16:26:14Z)
- Auditing and Generating Synthetic Data with Controllable Trust Trade-offs [54.262044436203965]
We introduce a holistic auditing framework that comprehensively evaluates synthetic datasets and AI models.
It focuses on preventing bias and discrimination, ensuring fidelity to the source data, and assessing utility, robustness, and privacy preservation.
We demonstrate the framework's effectiveness by auditing various generative models across diverse use cases.
arXiv Detail & Related papers (2023-04-21T09:03:18Z)
- A Randomized Approach for Tight Privacy Accounting [63.67296945525791]
We propose a new differential privacy paradigm called estimate-verify-release (EVR).
The EVR paradigm first estimates the privacy parameter of a mechanism, then verifies whether it meets this guarantee, and finally releases the query output.
Our empirical evaluation shows the newly proposed EVR paradigm improves the utility-privacy tradeoff for privacy-preserving machine learning.
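The EVR loop can be sketched abstractly. The mechanism class, the method names, and the trivial "estimate" below (a toy Laplace count whose privacy parameter is known analytically) are illustrative stand-ins, not the paper's actual estimator or verifier:

```python
import numpy as np

class LaplaceCount:
    """Toy mechanism: a noisy count whose privacy loss is known exactly."""
    def __init__(self, epsilon, rng):
        self.epsilon, self.rng = epsilon, rng

    def estimate_epsilon(self):
        # A real EVR system would *estimate* this (e.g., by Monte Carlo);
        # for Laplace noise with scale 1/epsilon it is exactly epsilon.
        return self.epsilon

    def answer(self, data):
        return len(data) + self.rng.laplace(scale=1.0 / self.epsilon)

def evr_release(mech, data, eps_target):
    eps_est = mech.estimate_epsilon()        # 1. estimate the privacy parameter
    if eps_est > eps_target:                 # 2. verify it meets the target
        raise RuntimeError("refuse to release: privacy target not met")
    return mech.answer(data)                 # 3. release the query output
```

The point of the pattern is that the output is only released after the estimated guarantee has been checked, which tightens privacy accounting compared with pessimistic upfront bounds.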
arXiv Detail & Related papers (2023-04-17T00:38:01Z)
- Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy ($f$-DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
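In the $f$-DP formalism, privacy is characterized by a trade-off function between the type-I and type-II errors of a hypothesis test distinguishing neighboring datasets. As a standard worked example (the Gaussian case, not the discrete-valued mechanisms of that paper), the trade-off function of $\mu$-Gaussian DP is:

```python
from statistics import NormalDist

def gdp_tradeoff(mu: float, alpha: float) -> float:
    """Trade-off function f(alpha) of mu-Gaussian differential privacy:
    the smallest type-II error of any test with type-I error alpha that
    distinguishes N(0, 1) from N(mu, 1). Standard f-DP example; valid
    for alpha in the open interval (0, 1)."""
    nd = NormalDist()
    return nd.cdf(nd.inv_cdf(1.0 - alpha) - mu)
```

At $\mu = 0$ the function reduces to $f(\alpha) = 1 - \alpha$ (perfect privacy: the test does no better than random guessing), and the curve drops as $\mu$ grows, reflecting increasing privacy loss.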
arXiv Detail & Related papers (2023-02-19T16:58:53Z)
- No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy" [75.98836424725437]
New methods designed to preserve data privacy require careful scrutiny.
Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a "privacy-preserving" method is attacked.
arXiv Detail & Related papers (2022-09-29T17:50:23Z)
- DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
In the first phase, splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z)
- Yes-Yes-Yes: Donation-based Peer Reviewing Data Collection for ACL Rolling Review and Beyond [58.71736531356398]
We present an in-depth discussion of peer reviewing data, outline the ethical and legal desiderata for peer reviewing data collection, and propose the first continuous, donation-based data collection workflow.
We report on the ongoing implementation of this workflow at the ACL Rolling Review and deliver the first insights obtained with the newly collected data.
arXiv Detail & Related papers (2022-01-27T11:02:43Z)
- Causally Constrained Data Synthesis for Private Data Release [36.80484740314504]
Releasing synthetic data that reflects certain statistical properties of the original data preserves the privacy of the original data.
Prior works utilize differentially private data release mechanisms to provide formal privacy guarantees.
We propose incorporating causal information into the training process to favorably modify the aforementioned trade-off.
arXiv Detail & Related papers (2021-05-27T13:46:57Z)
- A Critical Overview of Privacy-Preserving Approaches for Collaborative Forecasting [0.0]
Cooperation between different data owners may lead to an improvement in forecast quality.
Due to business competitive factors and personal data protection questions, said data owners might be unwilling to share their data.
This paper analyses the state-of-the-art and unveils several shortcomings of existing methods in guaranteeing data privacy.
arXiv Detail & Related papers (2020-04-20T20:21:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.