Learning Causal Mechanisms through Orthogonal Neural Networks
- URL: http://arxiv.org/abs/2306.03938v1
- Date: Mon, 5 Jun 2023 13:11:33 GMT
- Title: Learning Causal Mechanisms through Orthogonal Neural Networks
- Authors: Peyman Sheikholharam Mashhadi, Slawomir Nowaczyk
- Abstract summary: We investigate the problem of learning, in a fully unsupervised manner, the inverse of a set of independent mechanisms from distorted data points.
We propose an unsupervised method that discovers and disentangles a set of independent mechanisms from unlabeled data, and learns how to invert them.
- Score: 2.77390041716769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A fundamental feature of human intelligence is the ability to infer
high-level abstractions from low-level sensory data. An essential component of
such inference is the ability to discover modularized generative mechanisms.
Despite many efforts to use statistical learning and pattern recognition for
finding disentangled factors, arguably human intelligence remains unmatched in
this area.
In this paper, we investigate the problem of learning, in a fully unsupervised
manner, the inverse of a set of independent mechanisms from distorted data
points. We postulate, and justify this claim with experimental results, that an
important weakness of existing machine learning solutions lies in the
insufficiency of cross-module diversification. Addressing this crucial
discrepancy between human and machine intelligence is an important challenge
for pattern recognition systems.
To this end, our work proposes an unsupervised method that discovers and
disentangles a set of independent mechanisms from unlabeled data, and learns
how to invert them. A number of experts compete against each other for
individual data points in an adversarial setting: the expert that best inverts
the (unknown) generative mechanism is the winner. We demonstrate that introducing
an orthogonalization layer into the expert architectures enforces additional
diversity in the outputs, leading to significantly better separability.
Moreover, we propose a procedure for relocating data points between experts to
further prevent any single expert from claiming multiple mechanisms. We experimentally
illustrate that these techniques allow discovery and modularization of much
less pronounced transformations, in addition to considerably faster
convergence.
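The abstract's two central ideas, winner-take-all competition among experts and an orthogonalization step that pushes expert outputs apart, can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the QR-based orthogonalization, and the scoring interface are assumptions made for clarity.

```python
import numpy as np

def orthogonalize(outputs):
    """Orthogonalize expert outputs to enforce cross-expert diversity.

    outputs: (n_experts, d) array, one flattened output per expert.
    QR decomposition of the transpose yields orthonormal columns; the
    returned rows are therefore mutually orthogonal directions, so no
    two experts can produce the same output. (Assumes d >= n_experts.)
    """
    q, _ = np.linalg.qr(outputs.T)
    return q.T[: outputs.shape[0]]

def assign_winner(scores):
    """Winner-take-all assignment of a data point to one expert.

    scores: (n_experts,) array, e.g. an adversarial discriminator's
    score for each expert's proposed inversion of the same point.
    Only the highest-scoring expert claims the point (and, in
    training, would receive the gradient update for it).
    """
    return int(np.argmax(scores))
```

In the paper's setting the scores would come from an adversarial critic rather than a supervised target, and the relocation procedure would additionally reassign points away from an expert that claims too many mechanisms; both are omitted here for brevity.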
Related papers
- Adversarial Mixup Unlearning [16.89710766008491]
We introduce a novel approach that regularizes the unlearning process by utilizing synthesized mixup samples.
At the core of our approach is a generator-unlearner framework, MixUnlearn.
We show that our method significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2025-02-14T16:50:33Z)
- Identifiable Causal Representation Learning: Unsupervised, Multi-View, and Multi-Environment [10.814585613336778]
Causal representation learning aims to combine the core strengths of machine learning and causality.
This thesis investigates what is possible for CRL without direct supervision, and thus contributes to its theoretical foundations.
arXiv Detail & Related papers (2024-06-19T09:14:40Z)
- Self-Distilled Disentangled Learning for Counterfactual Prediction [49.84163147971955]
We propose the Self-Distilled Disentanglement framework, known as $SD2$.
Grounded in information theory, it ensures theoretically sound independent disentangled representations without intricate mutual information estimator designs.
Our experiments, conducted on both synthetic and real-world datasets, confirm the effectiveness of our approach.
arXiv Detail & Related papers (2024-06-09T16:58:19Z)
- Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
- A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF) that consistently demonstrates robust performance using simple and cheap synthesis strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
- Understanding Data Augmentation from a Robustness Perspective [10.063624819905508]
Data augmentation stands out as a pivotal technique to amplify model robustness.
This manuscript takes both a theoretical and empirical approach to understanding the phenomenon.
Our empirical evaluations dissect the intricate mechanisms of emblematic data augmentation strategies.
These insights provide a novel lens through which we can re-evaluate model safety and robustness in visual recognition tasks.
arXiv Detail & Related papers (2023-09-07T10:54:56Z)
- Rotating Features for Object Discovery [74.1465486264609]
We present Rotating Features, a generalization of complex-valued features to higher dimensions, and a new evaluation procedure for extracting objects from distributed representations.
Together, these advancements enable us to scale distributed object-centric representations from simple toy to real-world data.
arXiv Detail & Related papers (2023-06-01T12:16:26Z)
- Few-shot Weakly-supervised Cybersecurity Anomaly Detection [1.179179628317559]
We propose an enhancement to an existing few-shot weakly-supervised deep learning anomaly detection framework.
This framework incorporates data augmentation, representation learning and ordinal regression.
We evaluate the framework and report its performance on three benchmark datasets.
arXiv Detail & Related papers (2023-04-15T04:37:54Z)
- Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning [79.4957965474334]
A key goal of unsupervised representation learning is to "invert" a data-generating process and recover its latent properties.
This paper asks, "Can we instead identify latent properties by leveraging knowledge of the mechanisms that govern their evolution?"
We provide a complete characterization of the sources of non-identifiability as we vary knowledge about a set of possible mechanisms.
arXiv Detail & Related papers (2021-10-29T14:04:08Z)
- Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why the deep neural networks have poor performance under adversarial perturbation.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
arXiv Detail & Related papers (2020-08-01T00:58:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.