Metric-Fair Classifier Derandomization
- URL: http://arxiv.org/abs/2206.07826v2
- Date: Fri, 17 Jun 2022 04:08:13 GMT
- Title: Metric-Fair Classifier Derandomization
- Authors: Jimmy Wu, Yatong Chen, Yang Liu
- Abstract summary: We study the problem of classifier derandomization in machine learning.
We show that the prior derandomization approach is almost maximally metric-unfair.
We devise a derandomization procedure that provides an appealing tradeoff between these two.
- Score: 6.269732593554894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of classifier derandomization in machine learning: given
a stochastic binary classifier $f: X \to [0,1]$, sample a deterministic
classifier $\hat{f}: X \to \{0,1\}$ that approximates the output of $f$ in
aggregate over any data distribution. Recent work revealed how to efficiently
derandomize a stochastic classifier with strong output approximation
guarantees, but at the cost of individual fairness -- that is, if $f$ treated
similar inputs similarly, $\hat{f}$ did not. In this paper, we initiate a
systematic study of classifier derandomization with metric fairness guarantees.
We show that the prior derandomization approach is almost maximally
metric-unfair, and that a simple ``random threshold'' derandomization achieves
optimal fairness preservation but with weaker output approximation. We then
devise a derandomization procedure that provides an appealing tradeoff between
these two: if $f$ is $\alpha$-metric fair according to a metric $d$ with a
locality-sensitive hash (LSH) family, then our derandomized $\hat{f}$ is, with
high probability, $O(\alpha)$-metric fair and a close approximation of $f$. We
also prove generic results applicable to all (fair and unfair) classifier
derandomization procedures, including a bias-variance decomposition and
reductions between various notions of metric fairness.
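The abstract describes two derandomization strategies at a high level; the following minimal Python sketch may make them concrete. It is a hedged illustration, not the authors' reference implementation: `SimHash` is an assumed random-hyperplane LSH for angular similarity (the paper only requires that the metric $d$ admit *some* LSH family), and assigning one shared threshold per hash bucket is one plausible reading of the LSH-based procedure.

```python
# Hedged sketch of two derandomization strategies for a stochastic binary
# classifier f: X -> [0, 1]. Not the paper's reference code; `SimHash` and
# the per-bucket thresholding are illustrative assumptions.
import numpy as np

def random_threshold_derandomize(f, seed=0):
    """Draw one global threshold t ~ U[0,1]; predict 1[f(x) >= t].

    Every input is compared against the same t, so inputs with similar
    f-values receive similar labels -- which is why this scheme preserves
    metric fairness well, at the cost of a looser output approximation
    (per the abstract)."""
    t = np.random.default_rng(seed).uniform()
    return lambda x: int(f(x) >= t)

class SimHash:
    """Random-hyperplane LSH: vectors at small angles collide often."""
    def __init__(self, dim, n_bits=16, seed=0):
        self.planes = np.random.default_rng(seed).normal(size=(n_bits, dim))

    def __call__(self, x):
        return tuple((self.planes @ np.asarray(x)) > 0)

def lsh_derandomize(f, lsh, seed=0):
    """Per-bucket thresholds: all points in an LSH bucket share one U[0,1]
    draw, so d-similar inputs (which usually land in the same bucket) are
    labeled by the same deterministic rule."""
    rng = np.random.default_rng(seed)
    thresholds = {}  # lazily assigned, one uniform draw per bucket
    def f_hat(x):
        b = lsh(x)
        if b not in thresholds:
            thresholds[b] = rng.uniform()
        return int(f(x) >= thresholds[b])
    return f_hat

# Toy usage: a "stochastic" linear classifier on R^5.
f = lambda x: 1.0 / (1.0 + np.exp(-np.sum(x)))
f_hat = lsh_derandomize(f, SimHash(dim=5))
print(f_hat(np.ones(5)), f_hat(1.001 * np.ones(5)))  # similar inputs, same label
```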
Related papers
- Rényi divergence-based uniformity guarantees for $k$-universal hash functions [59.90381090395222]
Universal hash functions map the output of a source to random strings over a finite alphabet.
We show that it is possible to distill random bits that are nearly uniform, as measured by min-entropy.
arXiv Detail & Related papers (2024-10-21T19:37:35Z) - How to Shrink Confidence Sets for Many Equivalent Discrete Distributions? [17.52590726972374]
We exploit permutation-equivalence in machine learning problems.
We show that the size of confidence sets shrinks at rates of $O(1/\sqrt{n_k})$ and $O(1/\max_{k \in K} n_k)$.
arXiv Detail & Related papers (2024-07-22T14:19:19Z) - SPLITZ: Certifiable Robustness via Split Lipschitz Randomized Smoothing [8.471466670802817]
There are two approaches to provide certifiable robustness to adversarial examples.
We propose SPLITZ, a practical and novel approach.
We show that SPLITZ consistently improves upon existing state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-03T05:13:28Z) - Near-Optimal Mean Estimation with Unknown, Heteroskedastic Variances [15.990720051907864]
The Subset-of-Signals model serves as a benchmark for heteroskedastic mean estimation.
Our algorithm resolves this open question up to logarithmic factors.
Even for $d=2$, our techniques enable rates comparable to knowing the variance of each sample.
arXiv Detail & Related papers (2023-12-05T01:13:10Z) - Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D, using the $k$-means++ implementation from sklearn as a black box, that validate our theoretical results (see the usage sketch after this list).
arXiv Detail & Related papers (2023-02-20T23:29:43Z) - Stochastic Approximation Approaches to Group Distributionally Robust
Optimization [96.26317627118912]
We study group distributionally robust optimization (GDRO).
We use online learning techniques to reduce the number of samples required in each round from $m$ to $1$, keeping the same sample complexity.
We also give a novel formulation of weighted GDRO, which allows us to derive distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Random quantum circuits transform local noise into global white noise [118.18170052022323]
We study the distribution over measurement outcomes of noisy random quantum circuits in the low-fidelity regime.
For local noise that is sufficiently weak and unital, correlations (measured by the linear cross-entropy benchmark) between the output distribution $p_{\text{noisy}}$ of a generic noisy circuit instance and the output distribution of the corresponding noiseless instance shrink exponentially.
If the noise is incoherent, the output distribution approaches the uniform distribution $p_{\text{unif}}$ at precisely the same rate.
arXiv Detail & Related papers (2021-11-29T19:26:28Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random features technique, called Ternary Random Features (TRF).
The computation of the proposed random features requires no multiplication and a factor of $b$ fewer bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - On the robustness of randomized classifiers to adversarial examples [11.359085303200981]
We introduce a new notion of robustness for randomized classifiers, enforcing local Lipschitzness using probability metrics.
We show that our results are applicable to a wide range of machine learning models under mild hypotheses.
All the robust models we trained can simultaneously achieve state-of-the-art accuracy.
arXiv Detail & Related papers (2021-02-22T10:16:58Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z) - Randomized Smoothing of All Shapes and Sizes [29.40896576138737]
We show that, for an appropriate notion of "optimal", the optimal smoothing distributions for "nice" norms have level sets given by the norm's *Wulff crystal*.
We show fundamental limits to current randomized smoothing techniques via the theory of *Banach space cotypes*.
arXiv Detail & Related papers (2020-02-19T11:41:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.