Mixed-feature Logistic Regression Robust to Distribution Shifts
- URL: http://arxiv.org/abs/2503.12012v1
- Date: Sat, 15 Mar 2025 06:31:16 GMT
- Title: Mixed-feature Logistic Regression Robust to Distribution Shifts
- Authors: Qingshi Sun, Nathan Justin, Andres Gomez, Phebe Vayanos
- Abstract summary: We study a distributionally robust logistic regression problem that seeks the model that will perform best against adversarial realizations of the data distribution. We propose a graph-based solution approach that can be integrated into off-the-shelf optimization solvers.
- Score: 1.957963207352318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Logistic regression models are widely used in the social and behavioral sciences and in high-stakes domains, due to their simplicity and interpretability properties. At the same time, such domains are permeated by distribution shifts, where the distribution generating the data changes between training and deployment. In this paper, we study a distributionally robust logistic regression problem that seeks the model that will perform best against adversarial realizations of the data distribution drawn from a suitably constructed Wasserstein ambiguity set. Our model and solution approach differ from prior work in that we can capture settings where the likelihood of distribution shifts can vary across features, significantly broadening the applicability of our model relative to the state-of-the-art. We propose a graph-based solution approach that can be integrated into off-the-shelf optimization solvers. We evaluate the performance of our model and algorithms on numerous publicly available datasets. Our solution achieves a 408x speed-up relative to the state-of-the-art. Additionally, compared to the state-of-the-art, our model reduces average calibration error by up to 36.19% and worst-case calibration error by up to 41.70%, while increasing the average area under the ROC curve (AUC) by up to 18.02% and worst-case AUC by up to 48.37%.
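As a rough illustration of the distributionally robust idea in the abstract, the sketch below fits a logistic regression with a norm penalty. It relies on a known special case from the earlier Wasserstein DRO literature, not on the paper's mixed-feature formulation or its graph-based solver: when all features are numeric, distances are measured in the Euclidean norm, and labels are assumed unperturbed, the worst-case expected log-loss over a type-1 Wasserstein ball of radius eps equals the empirical log-loss plus eps * ||beta||_2. The function names, step size, and iteration count are illustrative assumptions, and there is no intercept term.

```python
import numpy as np


def robust_logistic_loss(beta, X, y, eps):
    """Empirical log-loss plus eps * ||beta||_2, for labels y in {-1, +1}."""
    margins = y * (X @ beta)
    # logaddexp(0, -m) evaluates log(1 + exp(-m)) without overflow.
    return np.mean(np.logaddexp(0.0, -margins)) + eps * np.linalg.norm(beta)


def fit_robust_logistic(X, y, eps=0.1, lr=0.1, steps=2000):
    """Minimize robust_logistic_loss by plain (sub)gradient descent.

    eps plays the role of the Wasserstein radius in the special case
    described above; eps = 0 recovers ordinary logistic regression.
    """
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ beta)
        # Derivative of log(1 + exp(-m)) w.r.t. m is -1 / (1 + exp(m)),
        # written with tanh so large |m| does not overflow.
        dloss = -0.5 * (1.0 - np.tanh(0.5 * margins))
        grad = X.T @ (dloss * y) / n
        norm = np.linalg.norm(beta)
        if norm > 0.0:
            # Gradient of eps * ||beta||_2 away from the origin; at the
            # origin, zero is a valid subgradient, so nothing is added.
            grad = grad + eps * beta / norm
        beta = beta - lr * grad
    return beta
```

Calling fit_robust_logistic(X, y, eps=0.0) gives the nominal fit, and increasing eps shrinks the coefficients, which is how a larger ambiguity radius shows up in this simplified setting. Handling categorical features, which is the paper's focus, is outside what this sketch covers.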
Related papers
- An Analysis of Model Robustness across Concurrent Distribution Shifts [6.043526197249358]
Machine learning models, meticulously optimized for source data, often fail to predict target data when faced with distribution shifts (DSs). We evaluate 26 algorithms that range from simple augmentations to zero-shot inference using foundation models, across 168 source-target pairs from eight datasets. Our analysis of over 100K models reveals that concurrent DSs typically worsen performance compared to a single shift, with certain exceptions.
arXiv Detail & Related papers (2025-01-08T05:27:16Z)
- Improving Out-of-Distribution Data Handling and Corruption Resistance via Modern Hopfield Networks [0.0]
This study explores the potential of Modern Hopfield Networks (MHN) in improving the ability of computer vision models to handle out-of-distribution data.
We suggest integrating MHN into the baseline models to enhance their robustness.
Our research shows that the proposed integration consistently improves model performance on the MNIST-C dataset.
arXiv Detail & Related papers (2024-08-21T03:26:16Z)
- Reducing Spurious Correlation for Federated Domain Generalization [15.864230656989854]
In open-world scenarios, global models may struggle to predict well on entirely new domain data captured by certain media.
Existing methods still rely on strong statistical correlations between samples and labels to address this issue.
We introduce FedCD, an overall optimization framework at both the local and global levels.
arXiv Detail & Related papers (2024-07-27T05:06:31Z)
- Distributionally and Adversarially Robust Logistic Regression via Intersecting Wasserstein Balls [8.720733751119994]
Adversarially robust optimization (ARO) has become the de facto standard for training models to defend against adversarial attacks during testing.
Despite their robustness, these models often suffer from severe overfitting.
We propose two approaches to replace the empirical distribution in training with: (i) a worst-case distribution within an ambiguity set; or (ii) a mixture of the empirical distribution with one derived from an auxiliary dataset.
arXiv Detail & Related papers (2024-07-18T15:59:37Z)
- Distributionally Robust Learning for Multi-source Unsupervised Domain Adaptation [9.359714425373616]
Empirical risk often performs poorly when the distribution of the target domain differs from those of source domains.
We develop an unsupervised domain adaptation approach that leverages labeled data from multiple source domains and unlabeled data from the target domain.
arXiv Detail & Related papers (2023-09-05T13:19:40Z)
- Variational Model Perturbation for Source-Free Domain Adaptation [64.98560348412518]
We introduce perturbations into the model parameters by variational Bayesian inference in a probabilistic framework.
We demonstrate the theoretical connection to learning Bayesian neural networks, which proves the generalizability of the perturbed model to target domains.
arXiv Detail & Related papers (2022-10-19T08:41:19Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Model-based Policy Optimization with Unsupervised Model Adaptation [37.09948645461043]
We investigate how to bridge the gap between real and simulated data due to inaccurate model estimation for better policy optimization.
We propose a novel model-based reinforcement learning framework AMPO, which introduces unsupervised model adaptation.
Our approach achieves state-of-the-art performance in terms of sample efficiency on a range of continuous control benchmark tasks.
arXiv Detail & Related papers (2020-10-19T14:19:42Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)