Related papers: Inheritance Between Feedforward and Convolutional Networks via Model Projection

URL: http://arxiv.org/abs/2602.06245v1
Date: Thu, 05 Feb 2026 22:50:33 GMT
Title: Inheritance Between Feedforward and Convolutional Networks via Model Projection
Authors: Nicolas Ewen, Jairo Diaz-Rodriguez, Kelly Ramsay
Abstract summary: Techniques for feedforward networks (FFNs) and convolutional networks (CNNs) are frequently reused across families, but the relationship between the underlying model classes is rarely made explicit. We introduce a unified node-level formalization with tensor-valued activations and show that generalized feedforward networks form a strict subset of generalized convolutional networks. Motivated by the mismatch in per-input parameterization between the two families, we propose model projection.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Techniques for feedforward networks (FFNs) and convolutional networks (CNNs) are frequently reused across families, but the relationship between the underlying model classes is rarely made explicit. We introduce a unified node-level formalization with tensor-valued activations and show that generalized feedforward networks form a strict subset of generalized convolutional networks. Motivated by the mismatch in per-input parameterization between the two families, we propose model projection, a parameter-efficient transfer learning method for CNNs that freezes pretrained per-input-channel filters and learns a single scalar gate for each (output channel, input channel) contribution. Projection keeps all convolutional layers adaptable to downstream tasks while substantially reducing the number of trained parameters in convolutional layers. We prove that projected nodes take the generalized FFN form, enabling projected CNNs to inherit feedforward techniques that do not rely on homogeneous layer inputs. Experiments across multiple ImageNet-pretrained backbones and several downstream image classification datasets show that model projection is a strong transfer learning baseline under simple training recipes.
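The gating scheme described in the abstract can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation: the names `correlate2d_valid`, `projected_conv`, and `trainable_counts` are hypothetical, biases are omitted, and the convolution is a plain stride-1 "valid" cross-correlation written in pure Python for clarity. Each output channel is a sum over input channels of a frozen filter response scaled by one trainable scalar gate.

```python
# Hypothetical sketch of a "projected" convolutional node.
# All names here are illustrative assumptions, not taken from the paper.


def correlate2d_valid(image, kernel):
    """Stride-1 'valid' 2D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(ow)
        ]
        for r in range(oh)
    ]


def projected_conv(x, frozen_filters, gates):
    """Apply frozen per-input-channel filters, scaled by scalar gates.

    x:              list of C_in 2D feature maps
    frozen_filters: frozen_filters[o][i] is the pretrained 2D kernel
                    linking input channel i to output channel o (not trained)
    gates:          gates[o][i] is the single trainable scalar for that pair
    """
    outputs = []
    for o, filters_o in enumerate(frozen_filters):
        # Response of each frozen filter on its own input channel.
        responses = [correlate2d_valid(x[i], k) for i, k in enumerate(filters_o)]
        oh, ow = len(responses[0]), len(responses[0][0])
        # Gated sum over input channels: one scalar per (o, i) contribution.
        outputs.append([
            [sum(gates[o][i] * responses[i][r][c] for i in range(len(responses)))
             for c in range(ow)]
            for r in range(oh)
        ])
    return outputs


def trainable_counts(c_out, c_in, k):
    """Trained weights in one k x k conv layer (biases ignored)."""
    return {"full_finetune": c_out * c_in * k * k, "projection": c_out * c_in}


x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
frozen = [[[[1, 0], [0, 1]], [[1, 1], [1, 1]]]]  # C_out = 1, C_in = 2

# With all gates at 1.0 the node reproduces the pretrained convolution.
baseline = projected_conv(x, frozen, [[1.0, 1.0]])
# Changing the gates re-weights channel contributions without touching filters.
regated = projected_conv(x, frozen, [[2.0, 0.0]])
counts = trainable_counts(64, 32, 3)
```

Under these assumptions, a 3x3 layer with 32 input and 64 output channels would train 2,048 gates instead of 18,432 filter weights, which is the kind of reduction the abstract refers to. Initializing every gate to 1.0 is a choice made for this sketch so that the starting point matches the pretrained layer; the abstract does not state how gates are initialized.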

Related papers

Information-Theoretic Greedy Layer-wise Training for Traffic Sign Recognition [0.5024983453990065]
Layer-wise training eliminates the need for cross-entropy loss and backpropagation. Most existing layer-wise training approaches have been evaluated only on relatively small datasets. We propose a novel layer-wise training approach based on the recently developed deterministic information bottleneck (DIB) and the matrix-based Rényi's $\alpha$-order entropy functional.
arXiv Detail & Related papers (2025-10-31T17:24:58Z)
ChannelDropBack: Forward-Consistent Stochastic Regularization for Deep Networks [5.00301731167245]
Existing techniques often require modifying the architecture of the network by adding specialized layers. We present ChannelDropBack, a simple regularization approach that introduces randomness only into the backward information flow. It allows for seamless integration into the training process of any model and layers without the need to change its architecture.
arXiv Detail & Related papers (2024-11-16T21:24:44Z)
Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics. They then exploit higher-order statistics only later during training. We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
ResMLP: Feedforward networks for image classification with data-efficient training [73.26364887378597]
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. We will share our code based on the Timm library and pre-trained models.
arXiv Detail & Related papers (2021-05-07T17:31:44Z)
Channel Scaling: A Scale-and-Select Approach for Transfer Learning [2.6304695993930594]
Transfer learning with pre-trained neural networks is a common strategy for training classifiers in medical image analysis. We propose a novel approach to efficiently build small and well performing networks by introducing the channel-scaling layers. By imposing L1 regularization and thresholding on the scaling weights, this framework iteratively removes unnecessary feature channels from a pre-trained model.
arXiv Detail & Related papers (2021-03-22T23:26:57Z)
ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN. We show that CNNs now maintain performance with dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network. PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs. Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks. We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
