Related papers: Beyond RMSE and MAE: Introducing EAUC to unmask hidden bias and unfairness in dyadic regression models

Beyond RMSE and MAE: Introducing EAUC to unmask hidden bias and unfairness in dyadic regression models

URL: http://arxiv.org/abs/2401.10690v4
Date: Fri, 07 Mar 2025 08:40:19 GMT
Title: Beyond RMSE and MAE: Introducing EAUC to unmask hidden bias and unfairness in dyadic regression models
Authors: Jorge Paz-Ruza, Amparo Alonso-Betanzos, Bertha Guijarro-Berdiñas, Brais Cancela, Carlos Eiras-Franco,
Abstract summary: We show that non-uniform observed value distributions of individual entities lead to severe biases in state-of-the-art models.<n>We introduce Eccentricity-Area Under the Curve (EAUC) as a novel metric that can quantify it in all studied domains and models.
Score: 5.336076422485076
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Dyadic regression models, which output real-valued predictions for pairs of entities, are fundamental in many domains (e.g. obtaining user-product ratings in Recommender Systems) and promising and under exploration in others (e.g. tuning patient-drug dosages in precision pharmacology). In this work, we prove that non-uniform observed value distributions of individual entities lead to severe biases in state-of-the-art models, skewing predictions towards the average of observed past values for the entity and providing worse-than-random predictive power in eccentric yet crucial cases; we name this phenomenon eccentricity bias. We show that global error metrics like Root Mean Squared Error (RMSE) are insufficient to capture this bias, and we introduce Eccentricity-Area Under the Curve (EAUC) as a novel metric that can quantify it in all studied domains and models. We prove the intuitive interpretation of EAUC by experimenting with naive post-training bias corrections, and theorize other options to use EAUC to guide the construction of fair models. This work contributes a bias-aware evaluation of dyadic regression to prevent unfairness in critical real-world applications of such systems.

Related papers

Causal Inference as Distribution Adaptation: Optimizing ATE Risk under Propensity Uncertainty [0.0]
We reframing ATE estimation as a textitdomain adaptation problem under distribution shift.<n>We propose the textbfJoint Robust Estimator (JRE) to train outcome models jointly.
arXiv Detail & Related papers (2025-12-19T21:40:46Z)
Detecting Prefix Bias in LLM-based Reward Models [4.596249232904721]
We introduce novel methods to detect and evaluate prefix bias in reward models trained on preference datasets.<n>We leverage these metrics to reveal significant biases in preference models across racial and gender dimensions.<n>Our findings highlight the critical need for bias-aware dataset design and evaluation in developing fair and reliable reward models.
arXiv Detail & Related papers (2025-05-13T21:50:03Z)
Identifying and Mitigating Social Bias Knowledge in Language Models [52.52955281662332]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases. FAST surpasses state-of-the-art baselines with superior debiasing performance. This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
Fine-Grained Dynamic Framework for Bias-Variance Joint Optimization on Data Missing Not at Random [2.8165314121189247]
In most practical applications such as recommendation systems, display advertising, and so forth, the collected data often contains missing values. We develop a systematic fine-grained dynamic learning framework to jointly optimize bias and variance.
arXiv Detail & Related papers (2024-05-24T10:07:09Z)
MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion. It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space. It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by certain protected (sensitive) attributes such as race, gender, and age. A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data. In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
Balancing Unobserved Confounding with a Few Unbiased Ratings in Debiased Recommendations [4.960902915238239]
We propose a theoretically guaranteed model-agnostic balancing approach that can be applied to any existing debiasing method. The proposed approach makes full use of unbiased data by alternatively correcting model parameters learned with biased data, and adaptively learning balance coefficients of biased samples for further debiasing.
arXiv Detail & Related papers (2023-04-17T08:56:55Z)
Guide the Learner: Controlling Product of Experts Debiasing Method Based on Token Attribution Similarities [17.082695183953486]
A popular workaround is to train a robust model by re-weighting training examples based on a secondary biased model. Here, the underlying assumption is that the biased model resorts to shortcut features. We introduce a fine-tuning strategy that incorporates the similarity between the main and biased model attribution scores in a Product of Experts loss function.
arXiv Detail & Related papers (2023-02-06T15:21:41Z)
Biases in Inverse Ising Estimates of Near-Critical Behaviour [0.0]
Inverse inference allows pairwise interactions to be reconstructed from empirical correlations. We show that estimators used for this inference, such as Pseudo-likelihood (PLM), are biased. Data-driven methods are explored and applied to a functional magnetic resonance imaging (fMRI) dataset from neuroscience.
arXiv Detail & Related papers (2023-01-13T14:01:43Z)
Bias-inducing geometries: an exactly solvable data model with fairness implications [13.690313475721094]
We introduce an exactly solvable high-dimensional model of data imbalance. We analytically unpack the typical properties of learning models trained in this synthetic framework. We obtain exact predictions for the observables that are commonly employed for fairness assessment.
arXiv Detail & Related papers (2022-05-31T16:27:57Z)
General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space. GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
Identifying and mitigating bias in algorithms used to manage patients in a pandemic [4.756860520861679]
Logistic regression models were created to predict COVID-19 mortality, ventilator status and inpatient status using a real-world dataset. Models showed a 57% decrease in the number of biased trials. After calibration, the average sensitivity of the predictive models increased from 0.527 to 0.955.
arXiv Detail & Related papers (2021-10-30T21:10:56Z)
OMASGAN: Out-of-Distribution Minimum Anomaly Score GAN for Sample Generation on the Boundary [0.0]
Generative models set high likelihood and low reconstruction loss to Out-of-Distribution (OoD) samples. OMASGAN generates, in a negative data augmentation manner, anomalous samples on the estimated distribution boundary. OMASGAN performs retraining by including the abnormal minimum-anomaly-score OoD samples generated on the distribution boundary.
arXiv Detail & Related papers (2021-10-28T16:35:30Z)
Learning to Estimate Without Bias [57.82628598276623]
Gauss theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimation (MVUE) in linear models. In this paper, we take a first step towards extending this result to non linear settings via deep learning with bias constraints. A second motivation to BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points. Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters. We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
Loss Estimators Improve Model Generalization [36.520569284970456]
We propose to train a loss estimator alongside the predictive model, using a contrastive training objective, to directly estimate the prediction uncertainties. We show the impact of loss estimators on model generalization, in terms of both its fidelity on in-distribution data and its ability to detect out of distribution samples or new classes unseen during training.
arXiv Detail & Related papers (2021-03-05T16:35:10Z)
UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present UNcertaInTy-based hEalth risk prediction (UNITE) model. UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data. We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD) UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to $19%$ over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models. Standard metrics calculated from retrospective data are only related to model utility under certain assumptions. When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)
Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [70.31427277842239]
We introduce a novel debiasing method called confidence regularization. It discourages models from exploiting biases while enabling them to receive enough incentive to learn from all the training examples. We evaluate our method on three NLU tasks and show that, in contrast to its predecessors, it improves the performance on out-of-distribution datasets.
arXiv Detail & Related papers (2020-05-01T11:22:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.