On a Built-in Conflict between Deep Learning and Systematic
Generalization
- URL: http://arxiv.org/abs/2208.11633v1
- Date: Wed, 24 Aug 2022 16:06:36 GMT
- Title: On a Built-in Conflict between Deep Learning and Systematic
Generalization
- Authors: Yuanpeng Li
- Abstract summary: Internal function sharing is one of the reasons that weaken o.o.d. or systematic generalization in deep learning.
We show such phenomena in standard deep learning models, such as fully connected, convolutional, residual networks, LSTMs, and (Vision) Transformers.
- Score: 2.588973722689844
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we hypothesize that internal function sharing is one of the
reasons that weaken o.o.d. or systematic generalization in deep learning for
classification tasks. Under equivalent prediction, a model partitions an input
space into multiple parts separated by boundaries. Function sharing prefers to
reuse existing boundaries, leading to fewer parts for new outputs, which conflicts
with systematic generalization. We show such phenomena in standard deep
learning models, such as fully connected, convolutional, residual networks,
LSTMs, and (Vision) Transformers. We hope this study provides novel insights
into systematic generalization and forms a basis for new research directions.
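A minimal sketch of the boundary-sharing intuition (NumPy only; the two-layer ReLU network, random weights, and 2-D grid below are illustrative assumptions, not the paper's experimental setup): every output unit reads the same hidden ReLU features, so the hyperplane boundaries that partition the input space are fixed by the shared hidden layer, and adding further outputs reuses those boundaries rather than creating new partitions.

import numpy as np

# Randomly initialized fully connected network: 2-D input -> shared ReLU layer -> 3 outputs.
rng = np.random.default_rng(0)
n_hidden, n_outputs = 8, 3
W1 = rng.normal(size=(2, n_hidden))          # each hidden unit defines one hyperplane boundary in 2-D
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_hidden, n_outputs))  # every output unit reuses the same hidden features

# Sample a grid over the input square and record each point's ReLU on/off pattern.
xs = np.linspace(-3.0, 3.0, 200)
grid = np.array([[x, y] for x in xs for y in xs])
pre_act = grid @ W1 + b1
patterns = pre_act > 0                       # which side of each shared boundary the input falls on

# Each distinct pattern is one region carved out by the shared boundaries; the count
# depends only on the hidden layer, no matter how many output units sit on top.
n_regions = len({tuple(p) for p in patterns})
pred = np.maximum(pre_act, 0.0) @ W2         # every output is linear within each shared region
print(f"{n_regions} input regions from {n_hidden} shared boundaries, "
      f"reused by all {n_outputs} outputs; predictions shape {pred.argmax(axis=1).shape}")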
Related papers
- Generalization Through the Lens of Learning Dynamics [11.009483845261958]
A machine learning (ML) system must learn to generalize to novel situations in order to yield accurate predictions at deployment.
The impressive generalization performance of deep neural networks has stymied theoreticians.
This thesis will study the learning dynamics of deep neural networks in both supervised and reinforcement learning tasks.
arXiv Detail & Related papers (2022-12-11T00:07:24Z)
- On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity distribution scores, which we name the Compositional Relational Network (CoRelNet).
We find that simple architectural choices can outperform existing models in out-of-distribution generalization.
arXiv Detail & Related papers (2022-06-09T16:24:01Z)
- Learning Dynamics and Structure of Complex Systems Using Graph Neural Networks [13.509027957413409]
We trained graph neural networks to fit time series from an example nonlinear dynamical system.
We found simple interpretations of the learned representation and model components.
We successfully identified a 'graph translator' between the statistical interactions in belief propagation and parameters of the corresponding trained network.
arXiv Detail & Related papers (2022-02-22T15:58:16Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and are transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Distinguishing rule- and exemplar-based generalization in learning systems [10.396761067379195]
We investigate two distinct inductive biases: feature-level bias and exemplar-vs-rule bias.
We find that most standard neural network models have a propensity towards exemplar-based extrapolation.
We discuss the implications of these findings for research on data augmentation, fairness, and systematic generalization.
arXiv Detail & Related papers (2021-10-08T18:37:59Z)
- On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications [13.823089111538128]
We present new and tighter information-theoretic upper bounds for the generalization error of machine learning models, such as neural networks, trained with SGD.
An experimental study based on these bounds provides some insights into the SGD training of neural networks.
arXiv Detail & Related papers (2021-10-07T00:53:33Z)
- f-Domain-Adversarial Learning: Theory and Algorithms [82.97698406515667]
Unsupervised domain adaptation is used in many machine learning applications where, during training, a model has access to unlabeled data in the target domain.
We derive a novel generalization bound for domain adaptation that exploits a new measure of discrepancy between distributions based on a variational characterization of f-divergences.
arXiv Detail & Related papers (2021-06-21T18:21:09Z)
- Deep Archimedean Copulas [98.96141706464425]
ACNet is a novel differentiable neural network architecture that enforces the structural properties of Archimedean copulas.
We show that ACNet is able to both approximate common Archimedean Copulas and generate new copulas which may provide better fits to data.
arXiv Detail & Related papers (2020-12-05T22:58:37Z)
- Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why the deep neural networks have poor performance under adversarial perturbation.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
arXiv Detail & Related papers (2020-08-01T00:58:54Z)
- Self-organizing Democratized Learning: Towards Large-scale Distributed Learning Systems [71.14339738190202]
Democratized learning (Dem-AI) lays out a holistic philosophy with underlying principles for building large-scale distributed and democratized machine learning systems.
Inspired by Dem-AI philosophy, a novel distributed learning approach is proposed in this paper.
The proposed algorithms achieve better generalization performance for the agents' learning models than conventional federated learning (FL) algorithms.
arXiv Detail & Related papers (2020-07-07T08:34:48Z)
- Unsupervised Domain Adaptation in Semantic Segmentation: a Review [22.366638308792734]
The aim of this paper is to give an overview of the recent advancements in the Unsupervised Domain Adaptation (UDA) of deep networks for semantic segmentation.
arXiv Detail & Related papers (2020-05-21T20:10:38Z)