The Curious Case of Arbitrariness in Machine Learning
- URL: http://arxiv.org/abs/2501.14959v1
- Date: Fri, 24 Jan 2025 22:45:09 GMT
- Title: The Curious Case of Arbitrariness in Machine Learning
- Authors: Prakhar Ganesh, Afaf Taik, Golnoosh Farnadi
- Abstract summary: Algorithmic modelling relies on limited information in data to extrapolate outcomes for unseen scenarios, often embedding an element of arbitrariness in its decisions.
A perspective on this arbitrariness that has recently gained interest is multiplicity: the study of arbitrariness across a set of "good models".
We systemize the literature on multiplicity by: (a) formalizing the terminology around model design choices and their contribution to arbitrariness, (b) expanding the definition of multiplicity to incorporate underrepresented forms beyond just predictions and explanations, (c) clarifying the distinction between multiplicity and other traditional lenses of arbitrariness, and (d) distilling the benefits and potential risks of multiplicity.
- Score: 4.932130498861987
- License:
- Abstract: Algorithmic modelling relies on limited information in data to extrapolate outcomes for unseen scenarios, often embedding an element of arbitrariness in its decisions. A perspective on this arbitrariness that has recently gained interest is multiplicity, the study of arbitrariness across a set of "good models", i.e., those likely to be deployed in practice. In this work, we systemize the literature on multiplicity by: (a) formalizing the terminology around model design choices and their contribution to arbitrariness, (b) expanding the definition of multiplicity to incorporate underrepresented forms beyond just predictions and explanations, (c) clarifying the distinction between multiplicity and other traditional lenses of arbitrariness, i.e., uncertainty and variance, and (d) distilling the benefits and potential risks of multiplicity into overarching trends, situating it within the broader landscape of responsible AI. We conclude by identifying open research questions and highlighting emerging trends in this young but rapidly growing area of research.
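To make the notion of multiplicity concrete, the sketch below trains several models that differ only in one arbitrary design choice (the random seed), keeps those whose test accuracy falls within a small tolerance of the best (a stand-in for the set of "good models"), and reports how often the kept models disagree on individual predictions. This is a minimal illustration, not code from the paper: the dataset, model family, tolerance `eps`, and disagreement metric are assumptions made for the demo.

```python
# Minimal sketch (not from the paper): estimating predictive multiplicity
# across a set of near-optimal "good models" that differ only in random seed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Train several models that differ only in an arbitrary choice: the seed.
models = [RandomForestClassifier(n_estimators=100, random_state=s).fit(X_tr, y_tr)
          for s in range(10)]
accs = np.array([m.score(X_te, y_te) for m in models])

# Keep the models within `eps` of the best accuracy (a proxy "good model" set).
eps = 0.01
good = [m for m, a in zip(models, accs) if a >= accs.max() - eps]

# Ambiguity: fraction of test points on which the "good models" disagree.
preds = np.stack([m.predict(X_te) for m in good])   # shape: (n_models, n_samples)
ambiguity = np.mean(preds.min(axis=0) != preds.max(axis=0))
print(f"{len(good)} good models, accuracy spread {accs.max() - accs.min():.3f}, "
      f"ambiguity = {ambiguity:.3f}")
```

Even when the kept models have nearly identical aggregate accuracy, they will typically disagree on some individual predictions, which is exactly the kind of arbitrariness the multiplicity literature studies.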
Related papers
- DEMAU: Decompose, Explore, Model and Analyse Uncertainties [0.8287206589886881]
DEMAU is an open-source educational, exploratory, and analytical tool that lets users visualize and explore several types of uncertainty for classification models in machine learning.
arXiv Detail & Related papers (2024-09-12T14:57:28Z)
- Finding Patterns in Ambiguity: Interpretable Stress Testing in the Decision Boundary [3.66599628097164]
We propose a novel approach to enhance the interpretability of deep binary classifiers.
We select representative samples from the decision boundary and apply post-model explanation algorithms.
Our work contributes to the responsible development and deployment of reliable machine learning systems.
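A rough sketch of this kind of boundary-focused sampling, under assumed choices of dataset, classifier, pool size, and cluster count (this is not the authors' pipeline): rank samples by how close their predicted probability is to 0.5, keep the most ambiguous ones, and cluster them into a few representatives that could then be handed to any post-hoc explanation algorithm.

```python
# Minimal sketch (not the authors' method): picking representative samples
# near a binary classifier's decision boundary for closer inspection.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_moons(n_samples=1000, noise=0.3, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

# Ambiguous region: the 100 samples whose predicted probability is closest to 0.5.
proba = clf.predict_proba(X)[:, 1]
order = np.argsort(np.abs(proba - 0.5))
boundary = X[order[:100]]

# Cluster the boundary region and keep one representative per cluster;
# these prototypes could then be passed to any post-hoc explainer.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(boundary)
reps = km.cluster_centers_
print(f"{len(boundary)} boundary samples reduced to {len(reps)} representatives:\n{reps}")
```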
arXiv Detail & Related papers (2024-08-12T17:14:41Z)
- Continual Learning of Nonlinear Independent Representations [17.65617189829692]
We show that model identifiability progresses from a subspace level to a component-wise level as the number of distributions increases.
Our method achieves performance comparable to nonlinear ICA methods trained jointly on multiple offline distributions.
arXiv Detail & Related papers (2024-08-11T14:33:37Z)
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods score and select data independently, in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
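For intuition only, here is a generic, reference-model-free way to pick a diverse batch: greedy farthest-point selection over feature vectors. This is not the DivBS objective itself; the feature matrix, batch size, and distance metric are assumptions for the sketch.

```python
# Minimal sketch (not the DivBS algorithm): greedy farthest-point selection
# as a generic, reference-model-free way to pick a diverse batch of samples.
import numpy as np

def select_diverse_batch(features: np.ndarray, batch_size: int, seed: int = 0) -> np.ndarray:
    """Greedily pick indices so each new sample is far from those already chosen."""
    rng = np.random.default_rng(seed)
    n = len(features)
    chosen = [int(rng.integers(n))]                      # arbitrary first pick
    dists = np.linalg.norm(features - features[chosen[0]], axis=1)
    while len(chosen) < batch_size:
        nxt = int(np.argmax(dists))                      # farthest from the current set
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(features - features[nxt], axis=1))
    return np.array(chosen)

# Example: select 32 diverse samples out of 10,000 random feature vectors.
feats = np.random.default_rng(1).normal(size=(10_000, 128))
batch = select_diverse_batch(feats, batch_size=32)
print(batch[:10])
```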
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- The Cost of Arbitrariness for Individuals: Examining the Legal and Technical Challenges of Model Multiplicity [4.514832807541816]
This paper explores various individual concerns stemming from multiplicity, including the effects of arbitrariness beyond final predictions.
It provides both an empirical examination of these concerns and a comprehensive analysis from a legal standpoint, addressing how these issues are treated under anti-discrimination law in Canada.
We conclude by discussing the technical challenges that the current landscape of model multiplicity poses for meeting legal requirements, and the gap between current law and the implications of arbitrariness in model selection.
arXiv Detail & Related papers (2024-05-28T21:54:03Z)
- Revealing Multimodal Contrastive Representation Learning through Latent Partial Causal Models [85.67870425656368]
We introduce a unified causal model specifically designed for multimodal data.
We show that multimodal contrastive representation learning excels at identifying latent coupled variables.
Experiments demonstrate the robustness of our findings, even when the assumptions are violated.
arXiv Detail & Related papers (2024-02-09T07:18:06Z)
- Multi-Target Multiplicity: Flexibility and Fairness in Target Specification under Resource Constraints [76.84999501420938]
We introduce a conceptual and computational framework for assessing how the choice of target affects individuals' outcomes.
We show that the level of multiplicity that stems from target variable choice can be greater than that stemming from nearly-optimal models of a single target.
arXiv Detail & Related papers (2023-06-23T18:57:14Z)
- Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
Under rigorous theoretical guarantees, our approach enables the IB to capture the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
A lack of interpretability, robustness, and out-of-distribution generalization is becoming a key challenge for existing visual models.
Inspired by the strong inference ability of human-level agents, researchers have devoted great effort in recent years to developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussions, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
- Variational Inference for Deep Probabilistic Canonical Correlation Analysis [49.36636239154184]
We propose a deep probabilistic multi-view model that is composed of a linear multi-view layer and deep generative networks as observation models.
An efficient variational inference procedure is developed that approximates the posterior distributions of the latent probabilistic multi-view layer.
A generalization to models with an arbitrary number of views is also proposed.
arXiv Detail & Related papers (2020-03-09T17:51:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.