The Clever Hans Effect in Unsupervised Learning
- URL: http://arxiv.org/abs/2408.08041v1
- Date: Thu, 15 Aug 2024 09:19:42 GMT
- Title: The Clever Hans Effect in Unsupervised Learning
- Authors: Jacob Kauffmann, Jonas Dippel, Lukas Ruff, Wojciech Samek, Klaus-Robert Müller, Grégoire Montavon
- Abstract summary: We show for the first time that Clever Hans effects are widespread in unsupervised learning.
Our work sheds light on unexplored risks associated with practical applications of unsupervised learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised learning has become an essential building block of AI systems. The representations it produces, e.g. in foundation models, are critical to a wide variety of downstream applications. It is therefore important to carefully examine unsupervised models to ensure not only that they produce accurate predictions, but also that these predictions are not "right for the wrong reasons", the so-called Clever Hans (CH) effect. Using specially developed Explainable AI techniques, we show for the first time that CH effects are widespread in unsupervised learning. Our empirical findings are enriched by theoretical insights, which interestingly point to inductive biases in the unsupervised learning machine as a primary source of CH effects. Overall, our work sheds light on unexplored risks associated with practical applications of unsupervised learning and suggests ways to make unsupervised learning more robust.
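As a rough illustration of how an explanation method can surface such effects, here is a minimal sketch using a generic gradient x input probe on the similarity of two embeddings. This is not the specially developed XAI technique from the paper, and the ResNet-18 encoder and random inputs are placeholders:

```python
import torch
import torchvision.models as models

# Placeholder unsupervised representation: a pretrained backbone with its
# classification head removed, so encoder(x) returns a feature vector.
encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()
encoder.eval()

def similarity_attribution(x1, x2):
    """Gradient x input attribution for the cosine similarity of two
    embeddings: which pixels of x1 make the model consider x1 and x2 similar?"""
    x1 = x1.clone().requires_grad_(True)
    sim = torch.nn.functional.cosine_similarity(encoder(x1), encoder(x2)).sum()
    sim.backward()
    return (x1.grad * x1).sum(dim=1)  # aggregate attributions over color channels

heatmap = similarity_attribution(torch.randn(1, 3, 224, 224),
                                 torch.randn(1, 3, 224, 224))
```

If the attribution concentrates on incidental features such as watermarks, borders, or scanner artifacts rather than the object of interest, that is the signature of a Clever Hans effect.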
Related papers
- Beyond Statistical Learning: Exact Learning Is Essential for General Intelligence [59.07578850674114]
Sound deductive reasoning is an indisputably desirable aspect of general intelligence. It is well-documented that even the most advanced frontier systems regularly and consistently falter on easily-solvable reasoning tasks. We argue that their unsound behavior is a consequence of the statistical learning approach powering their development.
arXiv Detail & Related papers (2025-06-30T14:37:50Z)
- An Augmentation-Aware Theory for Self-Supervised Contrastive Learning [25.01234368914713]
We propose an augmentation-aware error bound for self-supervised contrastive learning. We show that the supervised risk is bounded not only by the unsupervised risk, but also explicitly by a trade-off induced by data augmentation.
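Schematically, such a bound has the shape below; the notation is ours and only indicates the structure of the result:

```latex
% Schematic only: the supervised risk is controlled by the unsupervised
% (contrastive) risk plus a term Delta(A) capturing the trade-off induced
% by the augmentation family A; c is a constant.
\[
  R_{\mathrm{sup}}(f) \;\le\; c \cdot R_{\mathrm{unsup}}(f) + \Delta(\mathcal{A})
\]
```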
arXiv Detail & Related papers (2025-05-28T10:18:20Z)
- Uncertainties of Latent Representations in Computer Vision [2.33877878310217]
This thesis makes uncertainty estimates easily accessible by adding them to the latent representation vectors of pretrained computer vision models.
We show that these uncertainties about unobservable latent representations are indeed provably correct.
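One generic way to attach such estimates, sketched here with Monte Carlo dropout (not necessarily the construction used in the thesis):

```python
import torch

def latent_with_uncertainty(encoder, x, n_samples=32):
    """Per-dimension mean and standard deviation of the latent vector under
    Monte Carlo dropout: dropout stays active at inference, and the spread
    over repeated forward passes serves as the uncertainty estimate."""
    encoder.train()  # enables dropout; assumes the encoder has dropout layers
    with torch.no_grad():
        zs = torch.stack([encoder(x) for _ in range(n_samples)])
    return zs.mean(dim=0), zs.std(dim=0)
```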
arXiv Detail & Related papers (2024-08-26T14:02:30Z)
- Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis [127.85293480405082]
The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges.
Existing alignment methods usually direct LLMs toward favorable outcomes using human-annotated, flawless instruction-response pairs.
This study proposes a novel alignment technique based on mistake analysis, which deliberately exposes LLMs to erroneous content to learn the reasons for mistakes and how to avoid them.
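A hypothetical sketch of what one such training example could look like; the field names and prompt template are illustrative, not the paper's format:

```python
def make_mistake_analysis_example(instruction, bad_response, analysis, good_response):
    """Pair an erroneous response with an explanation of why it is wrong, so
    the model is trained to recognize the mistake and produce a correction."""
    prompt = (
        f"Instruction: {instruction}\n"
        f"Flawed response: {bad_response}\n"
        "Explain what is wrong with this response, then answer correctly."
    )
    target = f"Analysis: {analysis}\nCorrected response: {good_response}"
    return {"prompt": prompt, "target": target}
```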
arXiv Detail & Related papers (2023-10-16T14:59:10Z)
- Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against subgroups described by protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data, without a given causal model, by proposing a novel framework, CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
- Adversarial Robustness in Unsupervised Machine Learning: A Systematic Review [0.0]
This paper conducts a systematic literature review on the robustness of unsupervised learning.
Based on the results, we formulate a model of the properties of attacks on unsupervised learning.
arXiv Detail & Related papers (2023-06-01T13:59:32Z)
- Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning [66.00600682711995]
Human object interaction (HOI) detection plays a crucial role in human-centric scene understanding and serves as a fundamental building-block for many vision tasks.
One generalizable and scalable strategy for HOI detection is to use weak supervision, learning from image-level annotations only.
This is inherently challenging due to ambiguous human-object associations, the large search space of possible HOIs, and a highly noisy training signal.
We develop a CLIP-guided HOI representation capable of incorporating the prior knowledge at both image level and HOI instance level, and adopt a self-taught mechanism to prune incorrect human-object associations.
arXiv Detail & Related papers (2023-03-02T14:41:31Z)
- Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration [83.29856873525674]
We introduce a lightweight latent variable model UOTA, targeting the view sampling issue for self-supervised learning.
Our method directly generalizes to many mainstream self-supervised learning approaches.
arXiv Detail & Related papers (2021-12-15T14:05:23Z)
- Pulling Up by the Causal Bootstraps: Causal Data Augmentation for Pre-training Debiasing [14.4304416146106]
We study and extend a causal pre-training debiasing technique called causal bootstrapping.
We demonstrate that such a causal pre-training technique can significantly outperform existing baseline practices.
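A much-simplified sketch of the idea (illustrative only; the paper derives its weights from the full causal graph): resample the training data with weights p(z)/p(z|y), so that in the resampled data the confounder z no longer predicts the label y.

```python
import numpy as np

def causal_bootstrap(X, y, z, seed=None):
    """Resample (X, y) with importance weights p(z) / p(z | y) for a discrete
    confounder z (numpy arrays). Multiplying the joint p(y) p(z|y) p(x|y,z)
    by these weights yields p(y) p(z) p(x|y,z), i.e. z independent of y."""
    rng = np.random.default_rng(seed)
    p_z = np.array([np.mean(z == zi) for zi in z])
    p_z_given_y = np.array([np.mean(z[y == yi] == zi) for zi, yi in zip(z, y)])
    w = p_z / p_z_given_y
    idx = rng.choice(len(y), size=len(y), replace=True, p=w / w.sum())
    return X[idx], y[idx]
```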
arXiv Detail & Related papers (2021-08-27T21:42:04Z)
- Contrastive Learning Inverts the Data Generating Process [36.30995987986073]
We prove that feedforward models trained with objectives belonging to the commonly used InfoNCE family learn to implicitly invert the underlying generative model of the observed data.
Our theory highlights a fundamental connection between contrastive learning, generative modeling, and nonlinear independent component analysis.
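For reference, the InfoNCE family is the standard contrastive objective, with z = f(x) the learned representation, z_i^+ the positive example for anchor z_i, and τ a temperature parameter:

```latex
\[
  \mathcal{L}_{\mathrm{InfoNCE}}
  = -\,\mathbb{E}\!\left[
      \log \frac{\exp\!\big(\mathrm{sim}(z_i, z_i^{+})/\tau\big)}
                {\sum_{j=1}^{N} \exp\!\big(\mathrm{sim}(z_i, z_j)/\tau\big)}
    \right]
\]
```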
arXiv Detail & Related papers (2021-02-17T16:21:54Z)
- A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation [63.042651834453544]
We show that the unsupervised learning of disentangled representations is impossible without inductive biases on both the models and the data.
We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision.
Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision.
arXiv Detail & Related papers (2020-10-27T10:17:15Z)
- The Clever Hans Effect in Anomaly Detection [3.278983768346415]
The 'Clever Hans' effect occurs when the learned model produces correct predictions based on the 'wrong' features.
This paper contributes an explainable AI (XAI) procedure that can highlight the relevant features used by popular anomaly detection models.
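In the same spirit as the similarity probe sketched above (again plain gradient x input, not the paper's procedure), an anomaly score, here an autoencoder's reconstruction error, can be attributed to input features:

```python
import torch

def anomaly_attribution(autoencoder, x):
    """Which input features drive the anomaly score? Large attributions on
    nuisance features (e.g. image borders) would indicate a Clever Hans
    detector. The autoencoder stands in for any differentiable scorer."""
    x = x.clone().requires_grad_(True)
    score = ((autoencoder(x) - x) ** 2).sum()  # reconstruction error as score
    score.backward()
    return x.grad * x
```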
arXiv Detail & Related papers (2020-06-18T15:27:05Z)
- Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience [76.9910678786031]
We propose a novel paradigm of explainable active learning (XAL), by introducing techniques from the recently surging field of explainable AI (XAI) into an Active Learning setting.
Our study shows the benefits of AI explanations as interfaces for machine teaching, supporting trust calibration and enabling rich forms of teaching feedback, as well as potential drawbacks: an anchoring effect on the model's judgment and increased cognitive workload.
arXiv Detail & Related papers (2020-01-24T22:52:18Z)