Trustworthy AI Must Account for Interactions
- URL: http://arxiv.org/abs/2504.07170v2
- Date: Mon, 03 Nov 2025 01:42:55 GMT
- Title: Trustworthy AI Must Account for Interactions
- Authors: Jesse C. Cresswell
- Abstract summary: Research on Trustworthy AI must account for interactions between aspects and adopt a holistic view across all relevant axes at once. We provide guidance on how practitioners can work towards integrated trust, examples of how interactions affect the financial industry, and alternative views.
- Score: 11.322831855349422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trustworthy AI encompasses many aspirational aspects for aligning AI systems with human values, including fairness, privacy, robustness, explainability, and uncertainty quantification. Ultimately the goal of Trustworthy AI research is to achieve all aspects simultaneously. However, efforts to enhance one aspect often introduce unintended trade-offs that negatively impact others. In this position paper, we review notable approaches to these five aspects and systematically consider every pair, detailing the negative interactions that can arise. For example, applying differential privacy to model training can amplify biases, undermining fairness. Drawing on these findings, we take the position that current research practices of improving one or two aspects in isolation are insufficient. Instead, research on Trustworthy AI must account for interactions between aspects and adopt a holistic view across all relevant axes at once. To illustrate our perspective, we provide guidance on how practitioners can work towards integrated trust, examples of how interactions affect the financial industry, and alternative views.
Related papers
- Bridging the Gap: Integrating Ethics and Environmental Sustainability in AI Research and Practice [57.94036023167952]
We argue that the efforts aiming to study AI's ethical ramifications should be made in tandem with those evaluating its impacts on the environment. We propose best practices to better integrate AI ethics and sustainability in AI research and practice.
arXiv Detail & Related papers (2025-04-01T13:53:11Z)
- REVAL: A Comprehension Evaluation on Reliability and Values of Large Vision-Language Models [59.445672459851274]
REVAL is a comprehensive benchmark designed to evaluate the REliability and VALue of Large Vision-Language Models. REVAL encompasses over 144K image-text Visual Question Answering (VQA) samples, structured into two primary sections: Reliability and Values. We evaluate 26 models, including mainstream open-source LVLMs and prominent closed-source models like GPT-4o and Gemini-1.5-Pro.
arXiv Detail & Related papers (2025-03-20T07:54:35Z)
- A Tutorial On Intersectionality in Fair Rankings [1.4883782513177093]
Biases can lead to discriminatory outcomes in a data-driven world. Efforts towards responsible data science and responsible artificial intelligence aim to mitigate these biases.
arXiv Detail & Related papers (2025-02-07T21:14:21Z)
- PRISM: Perspective Reasoning for Integrated Synthesis and Mediation as a Multi-Perspective Framework for AI Alignment [0.0]
Perspective Reasoning for Integrated Synthesis and Mediation (PRISM) is a framework for addressing persistent challenges in AI alignment. PRISM organizes moral concerns into seven "basis worldviews", each hypothesized to capture a distinct dimension of human moral cognition. We briefly outline future directions, including real-world deployments and formal verifications, while maintaining the core focus on multi-perspective synthesis and conflict mediation.
arXiv Detail & Related papers (2025-02-05T02:13:57Z)
- On the Fairness, Diversity and Reliability of Text-to-Image Generative Models [68.62012304574012]
Multimodal generative models have sparked critical discussions on their reliability, fairness and potential for misuse. We propose an evaluation framework to assess model reliability by analyzing responses to global and local perturbations in the embedding space. Our method lays the groundwork for detecting unreliable, bias-injected models and tracing the provenance of embedded biases.
arXiv Detail & Related papers (2024-11-21T09:46:55Z)
- Navigating Conflicting Views: Harnessing Trust for Learning [5.776290041122041]
We develop a computational trust-based discounting method to enhance the existing trustworthy framework. We evaluate our method on six real-world datasets, using Top-1 Accuracy, AUC-ROC for Uncertainty-Aware Prediction, Fleiss' Kappa, and a new metric called Multi-View Agreement with Ground Truth.
arXiv Detail & Related papers (2024-06-03T03:22:18Z)
- Human-in-the-loop Fairness: Integrating Stakeholder Feedback to Incorporate Fairness Perspectives in Responsible AI [4.0247545547103325]
Fairness is a growing concern for high-risk decision-making using Artificial Intelligence (AI).
There is no universally accepted fairness measure, fairness is context-dependent, and there might be conflicting perspectives on what is considered fair.
Our work follows an approach where stakeholders can give feedback on specific decision instances and their outcomes with respect to their fairness.
arXiv Detail & Related papers (2023-12-13T11:17:29Z)
- A Systematic Review on Fostering Appropriate Trust in Human-AI Interaction [19.137907393497848]
Appropriate Trust in Artificial Intelligence (AI) systems has rapidly become an important area of focus for both researchers and practitioners.
Various approaches have been used to achieve it, such as confidence scores, explanations, trustworthiness cues, or uncertainty communication.
This paper presents a systematic review to identify current practices in building appropriate trust, different ways to measure it, types of tasks used, and potential challenges associated with it.
arXiv Detail & Related papers (2023-11-08T12:19:58Z)
- Holistic Survey of Privacy and Fairness in Machine Learning [10.399352534861292]
Privacy and fairness are crucial pillars of responsible Artificial Intelligence (AI) and trustworthy Machine Learning (ML).
Despite significant interest, there remains an immediate demand for more in-depth research to unravel how these two objectives can be simultaneously integrated into ML models.
We provide a thorough review of privacy and fairness in ML, including supervised, unsupervised, semi-supervised, and reinforcement learning.
arXiv Detail & Related papers (2023-07-28T23:39:29Z)
- Why not both? Complementing explanations with uncertainty, and the role of self-confidence in Human-AI collaboration [12.47276164048813]
We conduct an empirical study to identify how uncertainty estimates and model explanations affect users' reliance, understanding, and trust towards a model.
We also discuss how the latter may distort the outcome of an analysis based on agreement and switching percentages.
arXiv Detail & Related papers (2023-04-27T12:24:33Z)
- Auditing and Generating Synthetic Data with Controllable Trust Trade-offs [54.262044436203965]
We introduce a holistic auditing framework that comprehensively evaluates synthetic datasets and AI models.
It focuses on preventing bias and discrimination, ensuring fidelity to the source data, and assessing utility, robustness, and privacy preservation.
We demonstrate the framework's effectiveness by auditing various generative models across diverse use cases.
arXiv Detail & Related papers (2023-04-21T09:03:18Z)
- Factoring the Matrix of Domination: A Critical Review and Reimagination of Intersectionality in AI Fairness [55.037030060643126]
Intersectionality is a critical framework that allows us to examine how social inequalities persist.
We argue that adopting intersectionality as an analytical framework is pivotal to effectively operationalizing fairness.
arXiv Detail & Related papers (2023-03-16T21:02:09Z)
- Modeling Multiple Views via Implicitly Preserving Global Consistency and Local Complementarity [61.05259660910437]
We propose a global consistency and complementarity network (CoCoNet) to learn representations from multiple views.
On the global stage, we reckon that the crucial knowledge is implicitly shared among views, and enhancing the encoder to capture such knowledge can improve the discriminability of the learned representations.
Lastly, on the local stage, we propose a complementarity factor, which joins cross-view discriminative knowledge and guides the encoders to learn not only view-wise discriminability but also cross-view complementary information.
arXiv Detail & Related papers (2022-09-16T09:24:00Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
A key challenge for robotic systems is to figure out the behavior of another agent.
Processing correct inferences is especially challenging when (confounding) factors are not controlled experimentally.
We propose equipping robots with the necessary tools to conduct observational studies on people.
arXiv Detail & Related papers (2022-01-27T22:15:56Z)
- Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty [66.17147341354577]
We argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions.
We describe how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems.
This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness.
arXiv Detail & Related papers (2020-11-15T17:26:14Z)