A Validity Perspective on Evaluating the Justified Use of Data-driven
Decision-making Algorithms
- URL: http://arxiv.org/abs/2206.14983v2
- Date: Tue, 14 Feb 2023 15:24:44 GMT
- Title: A Validity Perspective on Evaluating the Justified Use of Data-driven
Decision-making Algorithms
- Authors: Amanda Coston, Anna Kawakami, Haiyi Zhu, Ken Holstein, and Hoda
Heidari
- Abstract summary: We apply the lens of validity to re-examine challenges in problem formulation and data issues that jeopardize the justifiability of using predictive algorithms.
We demonstrate how these validity considerations could distill into a series of high-level questions intended to promote and document reflections on the legitimacy of the predictive task and the suitability of the data.
- Score: 14.96024118861361
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research increasingly brings to question the appropriateness of using
predictive tools in complex, real-world tasks. While a growing body of work has
explored ways to improve value alignment in these tools, comparatively less
work has centered concerns around the fundamental justifiability of using these
tools. This work seeks to center validity considerations in deliberations
around whether and how to build data-driven algorithms in high-stakes domains.
Toward this end, we translate key concepts from validity theory to predictive
algorithms. We apply the lens of validity to re-examine common challenges in
problem formulation and data issues that jeopardize the justifiability of using
predictive algorithms and connect these challenges to the social science
discourse around validity. Our interdisciplinary exposition clarifies how these
concepts apply to algorithmic decision making contexts. We demonstrate how
these validity considerations could distill into a series of high-level
questions intended to promote and document reflections on the legitimacy of the
predictive task and the suitability of the data.
Related papers
- Towards Explainable Automated Data Quality Enhancement without Domain Knowledge [0.0]
We propose a comprehensive framework designed to automatically assess and rectify data quality issues in any given dataset.
Our primary objective is to address three fundamental types of defects: absence, redundancy, and incoherence.
We adopt a hybrid approach that integrates statistical methods with machine learning algorithms.
arXiv Detail & Related papers (2024-09-16T10:08:05Z)
- Absolute Ranking: An Essential Normalization for Benchmarking Optimization Algorithms [0.0]
Evaluating performance across optimization algorithms on many problems presents a complex challenge due to the diversity of numerical scales involved.
This paper extensively explores the problem, making a compelling case to underscore the issue and conducting a thorough analysis of its root causes.
Building on this research, this paper introduces a new mathematical model called "absolute ranking" and a sampling-based computational method.
arXiv Detail & Related papers (2024-09-06T00:55:03Z)
- Interpretable Clustering: A Survey [1.5641228378135836]
Clustering algorithms are increasingly being applied in high-stakes domains such as healthcare, finance, and autonomous systems.
The need for transparent and interpretable clustering outcomes has become a critical concern.
This paper provides a comprehensive and structured review of the current state of explainable clustering algorithms.
arXiv Detail & Related papers (2024-09-01T15:09:51Z)
- Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation.
Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- A Dataset for the Validation of Truth Inference Algorithms Suitable for Online Deployment [76.04306818209753]
We introduce a substantial crowdsourcing annotation dataset collected from a real-world crowdsourcing platform.
This dataset comprises approximately two thousand workers, one million tasks, and six million annotations.
We evaluate the effectiveness of several representative truth inference algorithms on this dataset.
arXiv Detail & Related papers (2024-03-10T16:00:41Z)
- On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
- Joint Communication and Computation Framework for Goal-Oriented Semantic Communication with Distortion Rate Resilience [13.36706909571975]
We use the rate-distortion theory to analyze distortions induced by communication and semantic compression.
We can preemptively estimate the empirical accuracy of AI tasks, making the goal-oriented semantic communication problem feasible.
arXiv Detail & Related papers (2023-09-26T00:26:29Z)
- A Human-Centered Review of Algorithms in Decision-Making in Higher Education [16.578096382702597]
We reviewed an extensive corpus of papers proposing algorithms for decision-making in higher education.
We found that the models are trending towards deep learning, and increased use of student personal data and protected attributes.
Despite the associated decrease in interpretability and explainability, current development predominantly fails to incorporate human-centered lenses.
arXiv Detail & Related papers (2023-02-12T02:30:50Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.