Related papers: Understanding Feature Transfer Through Representation Alignment

URL: http://arxiv.org/abs/2112.07806v1
Date: Wed, 15 Dec 2021 00:20:29 GMT
Title: Understanding Feature Transfer Through Representation Alignment
Authors: Ehsan Imani, Wei Hu, Martha White
Abstract summary: We find that training neural networks with different architectures and optimizers on random or true labels enforces the same relationship between the hidden representations and the training labels. We show in a classic synthetic transfer problem that alignment is the determining factor for positive and negative transfer to similar and dissimilar tasks.
Score: 45.35473578109525
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Training with the true labels of a dataset as opposed to randomized labels leads to faster optimization and better generalization. This difference is attributed to a notion of alignment between inputs and labels in natural datasets. We find that training neural networks with different architectures and optimizers on random or true labels enforces the same relationship between the hidden representations and the training labels, elucidating why neural network representations have been so successful for transfer. We first highlight why aligned features promote transfer and show in a classic synthetic transfer problem that alignment is the determining factor for positive and negative transfer to similar and dissimilar tasks. We then investigate a variety of neural network architectures and find that (a) alignment emerges across a variety of different architectures and optimizers, with more alignment arising from depth, (b) alignment increases for layers closer to the output, and (c) existing high-performance deep CNNs exhibit high levels of alignment.
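Illustrative sketch: the abstract does not define alignment, so the code below assumes one common formalization, namely the fraction of the label vector's squared norm captured by the top-k left singular vectors of the representation matrix. The function name `alignment_curve` and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def alignment_curve(features, labels):
    """Cumulative fraction of the label vector's squared norm captured by the
    top-k left singular vectors of the feature matrix, for k = 1..rank.

    Assumed definition of alignment (not stated in the abstract).
    features: array of shape (n_samples, n_features)
    labels:   array of shape (n_samples,)
    """
    # Left singular vectors, ordered by decreasing singular value.
    u, _, _ = np.linalg.svd(features, full_matrices=False)
    # Coordinates of the labels in the singular-vector basis.
    proj = u.T @ labels
    return np.cumsum(proj ** 2) / np.sum(labels ** 2)

rng = np.random.default_rng(0)
n, d = 200, 20
# Synthetic representation whose columns have decaying scales.
features = rng.normal(size=(n, d)) * np.linspace(5.0, 0.1, d)

# Labels lying exactly along the top singular direction: fully aligned.
u, _, _ = np.linalg.svd(features, full_matrices=False)
aligned_labels = u[:, 0]
# Labels drawn independently of the representation: poorly aligned.
random_labels = rng.normal(size=n)

aligned = alignment_curve(features, aligned_labels)
unaligned = alignment_curve(features, random_labels)
print("aligned labels, top-1 fraction:", aligned[0])
print("random labels, top-1 fraction:", unaligned[0])
```

Under this assumed definition, a curve that rises quickly means the labels are concentrated in the dominant directions of the representation, which is the situation the abstract associates with positive transfer.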

Related papers

Bridging Critical Gaps in Convergent Learning: How Representational Alignment Evolves Across Layers, Training, and Distribution Shifts [1.9458156037869137]
Much existing work relies on a limited set of metrics, overlooking transformation invariances required for proper alignment. A second critical gap lies in understanding when alignment emerges during training. Contrary to expectations that convergence builds gradually with task-specific learning, our findings reveal that nearly all convergence occurs within the first epoch. These findings fill critical gaps in our understanding of representational convergence, with implications for neuroscience and AI.
arXiv Detail & Related papers (2025-02-26T00:04:24Z)
Matcha: Mitigating Graph Structure Shifts with Test-Time Adaptation [66.40525136929398]
Test-time adaptation (TTA) has attracted attention due to its ability to adapt a pre-trained model to a target domain, without re-accessing the source domain. We propose Matcha, an innovative framework designed for effective and efficient adaptation to structure shifts in graphs. We validate the effectiveness of Matcha on both synthetic and real-world datasets, demonstrating its robustness across various combinations of structure and attribute shifts.
arXiv Detail & Related papers (2024-10-09T15:15:40Z)
On Characterizing the Evolution of Embedding Space of Neural Networks using Algebraic Topology [9.537910170141467]
We use Betti numbers to study how the topology of the feature embedding space changes as it passes through the layers of a well-trained deep neural network (DNN). We demonstrate that as depth increases, a topologically complicated dataset is transformed into a simple one, resulting in Betti numbers attaining their lowest possible value.
arXiv Detail & Related papers (2023-11-08T10:45:12Z)
Domain-adaptive Message Passing Graph Neural Network [67.35534058138387]
Cross-network node classification (CNNC) aims to classify nodes in a label-deficient target network by transferring the knowledge from a source network with abundant labels. We propose a domain-adaptive message passing graph neural network (DM-GNN), which integrates graph neural network (GNN) with conditional adversarial domain adaptation.
arXiv Detail & Related papers (2023-08-31T05:26:08Z)
ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction [32.489371527159236]
This work attempts to provide a plausible theoretical framework that aims to interpret modern deep (convolutional) networks from the principles of data compression and discriminative representation. We show that for high-dimensional multi-class data, the optimal linear discriminative representation maximizes the coding rate difference between the whole dataset and the average of all the subsets. We show that the basic iterative gradient ascent scheme for optimizing the rate reduction objective naturally leads to a multi-layer deep network, named ReduNet, that shares common characteristics of modern deep networks.
arXiv Detail & Related papers (2021-05-21T16:29:57Z)
Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths. Instead of using the same path of the network, DG-Net aggregates features dynamically in each node, which allows the network to have more representation ability.
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net. To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network. Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
Self-Challenging Improves Cross-Domain Generalization [81.99554996975372]
Convolutional Neural Networks (CNN) conduct image classification by activating dominant features that correlate with labels. We introduce a simple training heuristic, Representation Self-Challenging (RSC), that significantly improves the generalization of CNN to out-of-domain data. RSC iteratively challenges the dominant features activated on the training data, and forces the network to activate the remaining features that correlate with labels.
arXiv Detail & Related papers (2020-07-05T21:42:26Z)
What Do Neural Networks Learn When Trained With Random Labels? [20.54410239839646]
We study deep neural networks (DNNs) trained on natural image data with entirely random labels. We show analytically for convolutional and fully connected networks that an alignment between the principal components of network parameters and data takes place when training with random labels. We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch.
arXiv Detail & Related papers (2020-06-18T12:07:22Z)
Multi-Subspace Neural Network for Image Recognition [33.61205842747625]
In image classification tasks, feature extraction is always a major challenge. Intra-class variability increases the difficulty of designing the extractors. Recently, deep learning has drawn a lot of attention for automatically learning features from data. In this study, we propose a multi-subspace neural network (MSNN), which integrates a key component of the convolutional neural network (CNN), the receptive field, with the subspace concept.
arXiv Detail & Related papers (2020-06-17T02:55:34Z)
Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability. Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network. Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.