Quantifying Correlations of Machine Learning Models
- URL: http://arxiv.org/abs/2502.03937v1
- Date: Thu, 06 Feb 2025 10:19:51 GMT
- Title: Quantifying Correlations of Machine Learning Models
- Authors: Yuanyuan Li, Neeraj Sarna, Yang Lin
- Abstract summary: This paper explores three scenarios where error correlations between multiple models arise, resulting in aggregated risks.
Our findings indicate that aggregated risks are substantial, particularly when models share similar algorithms, training datasets, or foundational models.
Overall, we observe that correlations across models are pervasive and likely to intensify with increased reliance on foundational models and widely used public datasets.
- Score: 8.834929420051534
- Abstract: Machine learning models are used extensively in safety-critical applications, where their errors could harm the user. Such risks are amplified when multiple machine learning models, which are deployed concurrently, interact and make errors simultaneously. This paper explores three scenarios where error correlations between multiple models arise, resulting in such aggregated risks. Using real-world data, we simulate these scenarios and quantify the correlations in errors of different models. Our findings indicate that aggregated risks are substantial, particularly when models share similar algorithms, training datasets, or foundational models. Overall, we observe that correlations across models are pervasive and likely to intensify with increased reliance on foundational models and widely used public datasets, highlighting the need for effective mitigation strategies to address these challenges.
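The abstract speaks of quantifying correlations in the errors of different models but this listing does not say which measure the paper uses. As a minimal sketch, assuming a classification setting and assuming Pearson correlation of per-sample 0/1 error indicators as the measure (one common choice, not necessarily the authors'), the idea could look like this in Python. The function names and the toy labels are hypothetical.

```python
from math import sqrt


def error_indicators(y_true, y_pred):
    """Return 1 where a prediction is wrong and 0 where it is right."""
    return [int(t != p) for t, p in zip(y_true, y_pred)]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A model that is always right (or always wrong) has no variance.
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def joint_error_rate(errors_a, errors_b):
    """Fraction of samples on which both models are wrong at once."""
    return sum(a and b for a, b in zip(errors_a, errors_b)) / len(errors_a)


# Toy data (hypothetical): two models evaluated on the same eight samples.
y_true = [0, 1, 1, 0, 1, 0, 1, 0]
pred_a = [0, 1, 0, 0, 1, 1, 1, 0]  # wrong on samples 2 and 5
pred_b = [0, 1, 0, 0, 1, 0, 0, 0]  # wrong on samples 2 and 6

errors_a = error_indicators(y_true, pred_a)
errors_b = error_indicators(y_true, pred_b)

correlation = pearson(errors_a, errors_b)
joint = joint_error_rate(errors_a, errors_b)
# Joint error rate expected if the two models erred independently.
independent = (sum(errors_a) / len(errors_a)) * (sum(errors_b) / len(errors_b))

print(f"error correlation: {correlation:.3f}")
print(f"joint error rate:  {joint:.4f} (independent baseline: {independent:.4f})")
```

A positive correlation means the joint error rate exceeds the product of the individual error rates, which is the sense in which simultaneous errors aggregate risk.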
Related papers
- Assessing Robustness of Machine Learning Models using Covariate Perturbations [0.6749750044497732]
This paper proposes a comprehensive framework for assessing the robustness of machine learning models.
We explore various perturbation strategies to assess robustness and examine their impact on model predictions.
We demonstrate the effectiveness of our approach in comparing robustness across models, identifying the instabilities in the model, and enhancing model robustness.
arXiv Detail & Related papers (2024-08-02T14:41:36Z)
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Identifying and Mitigating Model Failures through Few-shot CLIP-aided Diffusion Generation [65.268245109828]
We propose an end-to-end framework to generate text descriptions of failure modes associated with spurious correlations.
These descriptions can be used to generate synthetic data using generative models, such as diffusion models.
Our experiments have shown remarkable improvements in accuracy (~21%) on hard sub-populations.
arXiv Detail & Related papers (2023-12-09T04:43:49Z)
- Generative Machine Learning for Multivariate Equity Returns [0.0]
We study the efficacy of conditional importance weighted autoencoders and conditional normalizing flows for the task of modeling the returns of equities.
The main problem we work to address is modeling the joint distribution of all the members of the S&P 500, or, in other words, learning a 500-dimensional joint distribution.
We show that this generative model has a broad range of applications in finance, including generating realistic synthetic data, volatility and correlation estimation, risk analysis, and portfolio optimization.
arXiv Detail & Related papers (2023-11-21T18:41:48Z)
- Correlation inference attacks against machine learning models [6.805105137455252]
We explore correlation inference attacks, asking whether and when a model leaks information about the correlations between its input variables.
Our results raise fundamental questions on what a model does and should remember from its training set.
arXiv Detail & Related papers (2021-12-16T11:42:45Z)
- A Framework for Machine Learning of Model Error in Dynamical Systems [7.384376731453594]
We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from data.
We cast the problem in both continuous- and discrete-time, for problems in which the model error is memoryless and in which it has significant memory.
We find that hybrid methods substantially outperform solely data-driven approaches in terms of data hunger, demands for model complexity, and overall predictive performance.
arXiv Detail & Related papers (2021-07-14T12:47:48Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models [86.9292779620645]
We develop a contrastive framework for generative model learning, allowing us to train the model not just by the commonality between modalities, but by the distinction between "related" and "unrelated" multimodal data.
Under our proposed framework, the generative model can accurately identify related samples from unrelated ones, making it possible to make use of the plentiful unlabeled, unpaired multimodal data.
arXiv Detail & Related papers (2020-07-02T15:08:11Z)
- Debiasing Skin Lesion Datasets and Models? Not So Fast [17.668005682385175]
Models learned from data risk learning biases from that same data.
When models learn spurious correlations not found in real-world situations, their deployment for critical tasks, such as medical decisions, can be catastrophic.
We find that, despite interesting results that point to promising future research, current debiasing methods are not ready to solve the bias issue for skin-lesion models.
arXiv Detail & Related papers (2020-04-23T21:07:49Z)