Towards out-of-distribution generalization in large-scale astronomical
surveys: robust networks learn similar representations
- URL: http://arxiv.org/abs/2311.18007v1
- Date: Wed, 29 Nov 2023 19:00:05 GMT
- Title: Towards out-of-distribution generalization in large-scale astronomical
surveys: robust networks learn similar representations
- Authors: Yash Gondhalekar, Sultan Hassan, Naomi Saphra, Sambatra Andrianomena
- Abstract summary: We use Centered Kernel Alignment (CKA), a similarity measure of neural network representations, to examine the relationship between representation similarity and performance.
We find that when models are robust to a distribution shift, they produce substantially different representations across their layers on OOD data.
We discuss the potential application of representation similarity in guiding model design, training strategy, and mitigating the OOD problem by incorporating CKA as an inductive bias during training.
- Score: 3.653721769378018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generalization of machine learning (ML) models to out-of-distribution
(OOD) examples remains a key challenge in extracting information from upcoming
astronomical surveys. Interpretability approaches are a natural way to gain
insights into the OOD generalization problem. We use Centered Kernel Alignment
(CKA), a similarity measure of neural network representations, to
examine the relationship between representation similarity and performance of
pre-trained Convolutional Neural Networks (CNNs) on the CAMELS Multifield
Dataset. We find that when models are robust to a distribution shift, they
produce substantially different representations across their layers on OOD
data. However, when they fail to generalize, these representations change less
from layer to layer on OOD data. We discuss the potential application of
representation similarity in guiding model design, training strategy, and
mitigating the OOD problem by incorporating CKA as an inductive bias during
training.
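For reference, a minimal NumPy sketch of the linear CKA measure used in this analysis is given below. This is not the authors' implementation, and the step of extracting per-layer CNN activations on CAMELS maps is assumed to happen elsewhere.

```python
import numpy as np

def linear_cka(x, y):
    """Linear Centered Kernel Alignment between two activation matrices.

    x: (n_examples, n_features_a) activations from one layer.
    y: (n_examples, n_features_b) activations from another layer, or from
       the same layer evaluated on a different data split.
    Returns a value in [0, 1]; 1 means the two representations agree up to
    an orthogonal transform and isotropic scaling.
    """
    # Centre each feature so the implicit Gram matrices are centred.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Linear-kernel HSIC terms (Kornblith et al., 2019).
    numerator = np.linalg.norm(y.T @ x, ord="fro") ** 2
    denominator = (np.linalg.norm(x.T @ x, ord="fro")
                   * np.linalg.norm(y.T @ y, ord="fro"))
    return float(numerator / denominator)
```

Computing `linear_cka` between every pair of layers, once on in-distribution maps and once on OOD maps, yields the layer-pair similarity matrices discussed above: robust models show noticeably lower cross-layer similarity on OOD data, while models that fail to generalize keep near-identical representations from layer to layer. Using CKA as an inductive bias, as the abstract suggests, would amount to adding a CKA-based term to the training loss; the exact form of that term is not specified here.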
Related papers
- DeCaf: A Causal Decoupling Framework for OOD Generalization on Node Classification [14.96980804513399]
Graph Neural Networks (GNNs) are susceptible to distribution shifts, creating vulnerability and security issues in critical domains.
Existing methods that target learning an invariant (feature, structure)-label mapping often depend on oversimplified assumptions about the data generation process.
We introduce a more realistic graph data generation model using Structural Causal Models (SCMs).
We propose a causal decoupling framework, DeCaf, that independently learns unbiased feature-label and structure-label mappings.
arXiv Detail & Related papers (2024-10-27T00:22:18Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to 17.0% AUROC improvement over the state of the art and could serve as a simple yet strong baseline in this under-developed area (a sketch of the underlying energy score appears after this list).
arXiv Detail & Related papers (2023-02-06T16:38:43Z)
- How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder (AE) based models.
We develop a novel evaluation scheme with the linear head trained on out-of-distribution (OOD) data, to isolate the performance of the pre-trained models from a potential bias of the linear head used for evaluation.
arXiv Detail & Related papers (2022-06-17T16:18:28Z)
- On Generalisability of Machine Learning-based Network Intrusion Detection Systems [0.0]
In this paper, we evaluate seven supervised and unsupervised learning models on four benchmark NIDS datasets.
Our investigation indicates that none of the considered models is able to generalise over all studied datasets.
Our investigation also indicates that overall, unsupervised learning methods generalise better than supervised learning models in our considered scenarios.
arXiv Detail & Related papers (2022-05-09T08:26:48Z)
- Rethinking Machine Learning Robustness via its Link with the Out-of-Distribution Problem [16.154434566725012]
We investigate the causes behind machine learning models' susceptibility to adversarial examples.
We propose an OOD generalization method that stands against both adversary-induced and natural distribution shifts.
Our approach consistently improves robustness to OOD adversarial inputs and outperforms state-of-the-art defenses.
arXiv Detail & Related papers (2022-02-18T00:17:23Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- OODformer: Out-Of-Distribution Detection Transformer [15.17006322500865]
In real-world safety-critical applications, it is important to be aware if a new data point is OOD.
This paper proposes a first-of-its-kind OOD detection architecture named OODformer.
arXiv Detail & Related papers (2021-07-19T15:46:38Z)
- Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization [93.8373619657239]
Neural networks trained with SGD were recently shown to rely preferentially on linearly-predictive features.
This simplicity bias can explain their lack of robustness out of distribution (OOD).
We demonstrate that the simplicity bias can be mitigated and OOD generalization improved.
arXiv Detail & Related papers (2021-05-12T12:12:24Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
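As referenced in the GNNSafe entry above, the sketch below shows the standard energy score that energy-based OOD detectors compute from classifier logits. It is a generic illustration, not GNNSafe's method: GNNSafe's propagation of energy scores over the graph structure is not reproduced, and the thresholding convention is an assumption.

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Energy-based OOD score computed from classifier logits.

    Higher energy means the example looks less like the training
    distribution; a threshold chosen on validation data flags OOD inputs.
    """
    z = np.asarray(logits, dtype=float) / temperature
    # Numerically stable logsumexp over the class dimension.
    z_max = z.max(axis=-1, keepdims=True)
    logsumexp = z_max[..., 0] + np.log(np.exp(z - z_max).sum(axis=-1))
    # Energy E(x) = -T * logsumexp(f(x) / T).
    return -temperature * logsumexp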
This list is automatically generated from the titles and abstracts of the papers on this site.