Fundamental Limits and Tradeoffs in Invariant Representation Learning
- URL: http://arxiv.org/abs/2012.10713v1
- Date: Sat, 19 Dec 2020 15:24:04 GMT
- Title: Fundamental Limits and Tradeoffs in Invariant Representation Learning
- Authors: Han Zhao, Chen Dan, Bryon Aragam, Tommi S. Jaakkola, Geoffrey J. Gordon, Pradeep Ravikumar
- Abstract summary: Many machine learning applications involve learning representations that achieve two competing goals.
A common minimax game-theoretic formulation captures a fundamental tradeoff between accuracy and invariance.
We provide an information-theoretic analysis of this general and important problem under both classification and regression settings.
- Score: 99.2368462915979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many machine learning applications involve learning representations that
achieve two competing goals: To maximize information or accuracy with respect
to a subset of features (e.g.\ for prediction) while simultaneously maximizing
invariance or independence with respect to another, potentially overlapping,
subset of features (e.g.\ for fairness, privacy, etc.). Typical examples include
privacy-preserving learning, domain adaptation, and algorithmic fairness, just
to name a few. In fact, all of the above problems admit a common minimax
game-theoretic formulation, whose equilibrium represents a fundamental tradeoff
between accuracy and invariance. Despite abundant applications in the
aforementioned domains, theoretical understanding of the limits and tradeoffs
of invariant representations is severely lacking.
In this paper, we provide an information-theoretic analysis of this general
and important problem under both classification and regression settings. In
both cases, we analyze the inherent tradeoffs between accuracy and invariance
by providing a geometric characterization of the feasible region in the
information plane, where we connect the geometric properties of this feasible
region to the fundamental limitations of the tradeoff problem. In the
regression setting, we also derive a tight lower bound on the Lagrangian
objective that quantifies the tradeoff between accuracy and invariance. This
lower bound leads to a better understanding of the tradeoff via the spectral
properties of the joint distribution. In both cases, our results shed new light
on this fundamental problem by providing insights on the interplay between
accuracy and invariance. These results deepen our understanding of this
fundamental problem and may be useful in guiding the design of adversarial
representation learning algorithms.
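The minimax formulation referenced in the abstract can be sketched as follows. The exact losses and hypothesis classes vary by application, so the symbols here (encoder $g$, predictor $h$, adversary $h'$, target $Y$, protected or nuisance attribute $A$, tradeoff weight $\lambda$) are illustrative assumptions rather than the paper's precise notation:

```latex
\min_{g,\,h}\ \max_{h'}\quad
\underbrace{\mathbb{E}\!\left[\ell\!\left(h(g(X)),\,Y\right)\right]}_{\text{prediction error on target } Y}
\;-\;\lambda\,
\underbrace{\mathbb{E}\!\left[\ell'\!\left(h'(g(X)),\,A\right)\right]}_{\text{adversary's error on attribute } A}
```

Intuitively, the encoder $g$ is trained so that the representation $g(X)$ supports accurate prediction of $Y$ while the best adversary $h'$ cannot recover $A$ from it; the equilibrium value of this game is what the paper's information-theoretic bounds characterize.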
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Generalization bounds for regression and classification on adaptive covering input domains [1.4141453107129398]
We focus on the generalization bound, which serves as an upper limit for the generalization error.
In the case of classification tasks, we treat the target function as a one-hot, piece-wise constant function and employ the 0/1 loss for error measurement.
arXiv Detail & Related papers (2024-07-29T05:40:08Z) - Domain Adaptation with Cauchy-Schwarz Divergence [39.36943882475589]
We introduce the Cauchy-Schwarz (CS) divergence to the problem of unsupervised domain adaptation (UDA).
The CS divergence offers a theoretically tighter generalization error bound than the popular Kullback-Leibler divergence.
We show how the CS divergence can be conveniently used in both distance metric- or adversarial training-based UDA frameworks.
arXiv Detail & Related papers (2024-05-30T12:01:12Z) - Demystifying Disagreement-on-the-Line in High Dimensions [34.103373453782744]
We develop a theoretical foundation for analyzing disagreement in high-dimensional random features regression.
Experiments on CIFAR-10-C, Tiny ImageNet-C, and Camelyon17 are consistent with our theory and support the universality of the theoretical findings.
arXiv Detail & Related papers (2023-01-31T02:31:18Z) - Leveraging Relational Information for Learning Weakly Disentangled Representations [11.460692362624533]
Disentanglement is a difficult property to enforce in neural representations.
We present an alternative view over learning (weakly) disentangled representations.
arXiv Detail & Related papers (2022-05-20T09:58:51Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - On the Fundamental Trade-offs in Learning Invariant Representations [7.868449549351487]
We identify and determine two fundamental trade-offs between utility and semantic dependence induced by the statistical dependencies between the data and its corresponding target and semantic attributes.
We numerically quantify the trade-offs on representative problems and compare to the solutions achieved by baseline representation learning algorithms.
arXiv Detail & Related papers (2021-09-08T01:26:46Z) - Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
arXiv Detail & Related papers (2021-06-28T00:38:54Z) - Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z) - Precise Tradeoffs in Adversarial Training for Linear Regression [55.764306209771405]
We provide a precise and comprehensive understanding of the role of adversarial training in the context of linear regression with Gaussian features.
We precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach.
Our theory for adversarial training algorithms also facilitates the rigorous study of how a variety of factors (size and quality of training data, model overparametrization etc.) affect the tradeoff between these two competing accuracies.
arXiv Detail & Related papers (2020-02-24T19:01:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.