Alignment with human representations supports robust few-shot learning
- URL: http://arxiv.org/abs/2301.11990v3
- Date: Sun, 29 Oct 2023 19:45:09 GMT
- Title: Alignment with human representations supports robust few-shot learning
- Authors: Ilia Sucholutsky, Thomas L. Griffiths
- Abstract summary: We show there should be a U-shaped relationship between the degree of representational alignment with humans and performance on few-shot learning tasks.
We also show that highly-aligned models are more robust to both natural adversarial attacks and domain shifts.
Our results suggest that human-alignment is often a sufficient, but not necessary, condition for models to make effective use of limited data, be robust, and generalize well.
- Score: 14.918671859247429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Should we care whether AI systems have representations of the world that are
similar to those of humans? We provide an information-theoretic analysis that
suggests that there should be a U-shaped relationship between the degree of
representational alignment with humans and performance on few-shot learning
tasks. We confirm this prediction empirically, finding such a relationship in
an analysis of the performance of 491 computer vision models. We also show that
highly-aligned models are more robust to both natural adversarial attacks and
domain shifts. Our results suggest that human-alignment is often a sufficient,
but not necessary, condition for models to make effective use of limited data,
be robust, and generalize well.
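As a rough illustration of how the two quantities in the abstract can be computed, the sketch below scores representational alignment as the Spearman correlation between model and human pairwise similarities and measures few-shot performance with a nearest-centroid classifier. Both choices, and the input arrays `model_embeddings` / `human_similarity`, are illustrative assumptions rather than the paper's exact procedure.

```python
# A minimal sketch, assuming alignment is the Spearman correlation between model and
# human pairwise similarities, and few-shot performance is nearest-centroid accuracy
# in the model's representation space. These choices are assumptions, not the paper's
# exact measures.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr


def alignment_score(model_embeddings: np.ndarray, human_similarity: np.ndarray) -> float:
    """Correlation between a model's pairwise similarities and human similarity judgments."""
    model_sim = -squareform(pdist(model_embeddings))   # negative distance as similarity
    iu = np.triu_indices_from(human_similarity, k=1)   # off-diagonal pairs only
    rho, _ = spearmanr(model_sim[iu], human_similarity[iu])
    return float(rho)


def few_shot_accuracy(support_x, support_y, query_x, query_y) -> float:
    """Nearest-centroid classification of query items from a handful of labeled examples."""
    classes = np.unique(support_y)
    centroids = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(query_x[:, None, :] - centroids[None, :, :], axis=-1)
    preds = classes[dists.argmin(axis=1)]
    return float((preds == np.asarray(query_y)).mean())
```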
Related papers
- Evaluating Multiview Object Consistency in Humans and Image Models [68.36073530804296]
We leverage an experimental design from the cognitive sciences which requires zero-shot visual inferences about object shape.
We collect 35K trials of behavioral data from over 500 participants.
We then evaluate the performance of common vision models.
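A minimal sketch of scoring a vision model against such behavioral trials, assuming a two-alternative forced-choice format (which candidate shows the same object as the reference?) and a cosine-similarity decision rule; both are assumptions, not the benchmark's actual protocol.

```python
# Illustrative only: the trial format and the cosine-similarity decision rule are
# assumptions, not the paper's exact experimental design.
import numpy as np


def cosine(u, v):
    return (u * v).sum(-1) / (np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1))


def model_choices(ref_emb, cand_a_emb, cand_b_emb):
    """Return 0 (candidate A) or 1 (candidate B) per trial, by embedding similarity."""
    return (cosine(ref_emb, cand_b_emb) > cosine(ref_emb, cand_a_emb)).astype(int)


def human_model_agreement(model_picks, human_picks):
    """Fraction of trials on which the model matches the human (e.g., majority) choice."""
    return float((np.asarray(model_picks) == np.asarray(human_picks)).mean())
```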
arXiv Detail & Related papers (2024-09-09T17:59:13Z)
- VFA: Vision Frequency Analysis of Foundation Models and Human [10.112417527529868]
Machine learning models often struggle with distribution shifts in real-world scenarios, whereas humans exhibit robust adaptation.
We investigate how various characteristics of large-scale computer vision models influence their alignment with human capabilities and robustness.
arXiv Detail & Related papers (2024-09-09T17:23:39Z)
- Position: Stop Making Unscientific AGI Performance Claims [6.343515088115924]
Developments in the field of Artificial Intelligence (AI) have created a 'perfect storm' for observing 'sparks' of Artificial General Intelligence (AGI).
We argue and empirically demonstrate that the finding of meaningful patterns in latent spaces of models cannot be seen as evidence in favor of AGI.
We conclude that both the methodological setup and the common public image of AI make it easy to misinterpret correlations between model representations and variables of interest as being 'caused' by the model's understanding of underlying 'ground truth' relationships.
arXiv Detail & Related papers (2024-02-06T12:42:21Z)
- Specify Robust Causal Representation from Mixed Observations [35.387451486213344]
Learning representations purely from observations concerns learning a low-dimensional, compact representation that benefits downstream prediction models.
We develop a learning method to learn such representation from observational data by regularizing the learning procedure with mutual information measures.
We theoretically and empirically show that the models trained with the learned causal representations are more robust under adversarial attacks and distribution shifts.
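A minimal sketch of adding a mutual-information term to a representation-learning loss, using an InfoNCE surrogate as the MI estimate; the encoder, the InfoNCE choice, and the sign and weight of the regularizer are assumptions, since the paper's exact mutual-information measures are not given here.

```python
# Illustrative only: an InfoNCE surrogate stands in for the paper's MI measures.
import torch
import torch.nn.functional as F


def infonce_mi_estimate(z: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    """InfoNCE estimate of I(z; c) for paired batches z, c of shape (batch, dim)."""
    logits = z @ c.t()
    labels = torch.arange(z.size(0), device=z.device)
    return torch.log(torch.tensor(float(z.size(0)))) - F.cross_entropy(logits, labels)


def regularized_loss(encoder, predictor, x, y, context, mi_weight=0.1):
    """Prediction loss plus an MI term encouraging the representation to track `context`."""
    z = encoder(x)
    pred_loss = F.mse_loss(predictor(z), y)
    return pred_loss - mi_weight * infonce_mi_estimate(z, context)
```

Flipping the sign of the regularizer would instead penalize information shared with a nuisance variable, the other common use of such MI terms.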
arXiv Detail & Related papers (2023-10-21T02:18:35Z)
- Towards Understanding Sycophancy in Language Models [49.99654432561934]
We investigate the prevalence of sycophancy in models whose finetuning procedure made use of human feedback.
We show that five state-of-the-art AI assistants consistently exhibit sycophancy across four varied free-form text-generation tasks.
Our results indicate that sycophancy is a general behavior of state-of-the-art AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.
arXiv Detail & Related papers (2023-10-20T14:46:48Z)
- Interpretable Computer Vision Models through Adversarial Training: Unveiling the Robustness-Interpretability Connection [0.0]
Interpretability is as essential as robustness when we deploy models in the real world.
Standard models, compared to robust ones, are more susceptible to adversarial attacks, and their learned representations are less meaningful to humans.
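For context, adversarial training here typically means training on worst-case perturbed inputs. A generic PGD-based sketch follows; the hyperparameters are illustrative assumptions rather than the paper's settings.

```python
# A generic sketch of PGD adversarial training; eps, step size, and step count are
# illustrative assumptions, not the paper's configuration.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft L-infinity bounded adversarial examples with projected gradient descent."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()


def adversarial_training_step(model, optimizer, x, y):
    """One optimization step on PGD adversarial examples instead of the clean batch."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```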
arXiv Detail & Related papers (2023-07-04T13:51:55Z)
- Exploring Alignment of Representations with Human Perception [47.53970721813083]
We show that inputs that are mapped to similar representations by the model should be perceived similarly by humans.
Our approach yields a measure of the extent to which a model is aligned with human perception.
We find that various properties of a model like its architecture, training paradigm, training loss, and data augmentation play a significant role in learning representations that are aligned with human perception.
arXiv Detail & Related papers (2021-11-29T17:26:50Z)
- Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
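For reference, the classical doubly robust (AIPW) estimator of the average treatment effect, which this line of work builds on, looks as follows; this is not the paper's representation-learning method, and the fitted outcome models and propensity scores are assumed to come from elsewhere.

```python
# Classical AIPW estimator sketch; mu0/mu1 are outcome-model predictions and e are
# propensity scores, all assumed to be fitted elsewhere.
import numpy as np


def aipw_ate(y, t, mu0, mu1, e):
    """Doubly robust ATE: consistent if either the outcome model or the propensities are correct."""
    y, t, mu0, mu1, e = map(np.asarray, (y, t, mu0, mu1, e))
    psi = (mu1 - mu0
           + t * (y - mu1) / e
           - (1 - t) * (y - mu0) / (1 - e))
    return float(psi.mean())
```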
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
- DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
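A generic sketch of attention-weighted aggregation over human and object nodes, meant only to illustrate the relation-graph idea; it is not the paper's DRG architecture, and the module and its layers are hypothetical.

```python
# Illustrative relation-graph aggregation; not the DRG architecture from the paper.
import torch
import torch.nn as nn


class RelationAggregator(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)    # scores each (node, neighbour) pair
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        """nodes: (n, dim) features for all human/object candidates in a scene."""
        n, d = nodes.shape
        pairs = torch.cat([nodes.unsqueeze(1).expand(n, n, d),
                           nodes.unsqueeze(0).expand(n, n, d)], dim=-1)
        attn = torch.softmax(self.score(pairs).squeeze(-1), dim=-1)  # (n, n) weights
        context = attn @ nodes                                       # aggregate neighbours
        return torch.relu(self.update(torch.cat([nodes, context], dim=-1)))
```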
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has raised unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
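A sketch of the general idea of searching a GAN's latent space for an example that flips the audited classifier while staying on the generator's manifold; the single weighted objective and the gradient-based search are simplifying assumptions, whereas the paper itself uses a multi-objective formulation.

```python
# Illustrative latent-space search for a plausible counterfactual; the weighted
# single objective is an assumption standing in for the paper's multi-objective setup.
import torch
import torch.nn.functional as F


def latent_counterfactual(generator, classifier, z_init, target_class,
                          steps=200, lr=0.05, proximity_weight=0.1):
    """Optimize a latent code so the generated image is classified as `target_class`."""
    z = z_init.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x = generator(z)
        logits = classifier(x)
        target = torch.full((x.size(0),), target_class, dtype=torch.long, device=x.device)
        loss = (F.cross_entropy(logits, target)
                + proximity_weight * (z - z_init).pow(2).mean())  # stay near the start point
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator(z).detach()
```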
arXiv Detail & Related papers (2020-03-25T11:08:56Z)