Learning Causal Mechanisms through Orthogonal Neural Networks
- URL: http://arxiv.org/abs/2306.03938v1
- Date: Mon, 5 Jun 2023 13:11:33 GMT
- Title: Learning Causal Mechanisms through Orthogonal Neural Networks
- Authors: Peyman Sheikholharam Mashhadi, Slawomir Nowaczyk
- Abstract summary: We investigate the problem of learning, in a fully unsupervised manner, the inverse of a set of independent mechanisms from distorted data points.
We propose an unsupervised method that discovers and disentangles a set of independent mechanisms from unlabeled data, and learns how to invert them.
- Score: 2.77390041716769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A fundamental feature of human intelligence is the ability to infer
high-level abstractions from low-level sensory data. An essential component of
such inference is the ability to discover modularized generative mechanisms.
Despite many efforts to use statistical learning and pattern recognition for
finding disentangled factors, arguably human intelligence remains unmatched in
this area.
In this paper, we investigate the problem of learning, in a fully unsupervised
manner, the inverse of a set of independent mechanisms from distorted data
points. We postulate, and justify this claim with experimental results, that an
important weakness of existing machine learning solutions lies in the
insufficiency of cross-module diversification. Addressing this crucial
discrepancy between human and machine intelligence is an important challenge
for pattern recognition systems.
To this end, our work proposes an unsupervised method that discovers and
disentangles a set of independent mechanisms from unlabeled data, and learns
how to invert them. A number of experts compete against each other for
individual data points in an adversarial setting: the expert that best inverts
the (unknown) generative mechanism is the winner. We demonstrate that introducing
an orthogonalization layer into the expert architectures enforces additional
diversity in the outputs, leading to significantly better separability.
Moreover, we propose a procedure for relocating data points between experts to
further prevent any single expert from claiming multiple mechanisms. We experimentally
illustrate that these techniques allow discovery and modularization of much
less pronounced transformations, in addition to considerably faster
convergence.
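The abstract's two central ideas, winner-take-all competition among experts and an orthogonalization step that pushes expert outputs apart, can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the QR-based orthogonalization, and the scoring interface are assumptions made for clarity.

```python
import numpy as np

def orthogonalize(outputs):
    """Orthogonalize expert outputs to enforce cross-expert diversity.

    outputs: (n_experts, d) array, one flattened output per expert.
    QR decomposition of the transpose yields orthonormal columns; the
    returned rows are therefore mutually orthogonal directions, so no
    two experts can produce the same output. (Assumes d >= n_experts.)
    """
    q, _ = np.linalg.qr(outputs.T)
    return q.T[: outputs.shape[0]]

def assign_winner(scores):
    """Winner-take-all assignment of a data point to one expert.

    scores: (n_experts,) array, e.g. an adversarial discriminator's
    score for each expert's proposed inversion of the same point.
    Only the highest-scoring expert claims the point (and, in
    training, would receive the gradient update for it).
    """
    return int(np.argmax(scores))
```

In the paper's setting the scores would come from an adversarial critic rather than a supervised target, and the relocation procedure would additionally reassign points away from an expert that claims too many mechanisms; both are omitted here for brevity.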
Related papers
- Adversarial Mixup Unlearning [16.89710766008491]
We introduce a novel approach that regularizes the unlearning process by utilizing synthesized mixup samples.
At the core of our approach is a generator-unlearner framework, MixUnlearn.
We show that our method significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2025-02-14T16:50:33Z)
- Identifiable Causal Representation Learning: Unsupervised, Multi-View, and Multi-Environment [10.814585613336778]
Causal representation learning aims to combine the core strengths of machine learning and causality.
This thesis investigates what is possible for CRL without direct supervision, and thus contributes to its theoretical foundations.
arXiv Detail & Related papers (2024-06-19T09:14:40Z)
- Self-Distilled Disentangled Learning for Counterfactual Prediction [49.84163147971955]
We propose the Self-Distilled Disentanglement framework, known as $SD2$.
Grounded in information theory, it ensures theoretically sound independent disentangled representations without intricate mutual information estimator designs.
Our experiments, conducted on both synthetic and real-world datasets, confirm the effectiveness of our approach.
arXiv Detail & Related papers (2024-06-09T16:58:19Z)
- Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
- A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF) that consistently demonstrates robust performance using simple and cheap synthesis strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
- Understanding Data Augmentation from a Robustness Perspective [10.063624819905508]
Data augmentation stands out as a pivotal technique to amplify model robustness.
This manuscript takes both a theoretical and empirical approach to understanding the phenomenon.
Our empirical evaluations dissect the intricate mechanisms of emblematic data augmentation strategies.
These insights provide a novel lens through which we can re-evaluate model safety and robustness in visual recognition tasks.
arXiv Detail & Related papers (2023-09-07T10:54:56Z)
- Rotating Features for Object Discovery [74.1465486264609]
We present Rotating Features, a generalization of complex-valued features to higher dimensions, and a new evaluation procedure for extracting objects from distributed representations.
Together, these advancements enable us to scale distributed object-centric representations from simple toy to real-world data.
arXiv Detail & Related papers (2023-06-01T12:16:26Z)
- Few-shot Weakly-supervised Cybersecurity Anomaly Detection [1.179179628317559]
We propose an enhancement to an existing few-shot weakly-supervised deep learning anomaly detection framework.
This framework incorporates data augmentation, representation learning and ordinal regression.
We evaluate the framework and report its performance on three benchmark datasets.
arXiv Detail & Related papers (2023-04-15T04:37:54Z)
- Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning [79.4957965474334]
A key goal of unsupervised representation learning is to "invert" a data-generating process and recover its latent properties.
This paper asks, "Can we instead identify latent properties by leveraging knowledge of the mechanisms that govern their evolution?"
We provide a complete characterization of the sources of non-identifiability as we vary knowledge about a set of possible mechanisms.
arXiv Detail & Related papers (2021-10-29T14:04:08Z)
- Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why the deep neural networks have poor performance under adversarial perturbation.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
arXiv Detail & Related papers (2020-08-01T00:58:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.