Structured Output Regularization: a framework for few-shot transfer learning
- URL: http://arxiv.org/abs/2510.08728v1
- Date: Thu, 09 Oct 2025 18:34:22 GMT
- Title: Structured Output Regularization: a framework for few-shot transfer learning
- Authors: Nicolas Ewen, Jairo Diaz-Rodriguez, Kelly Ramsay
- Abstract summary: Traditional transfer learning typically reuses large pre-trained networks by freezing some of their weights and adding task-specific layers. We propose Structured Output Regularization (SOR), a simple yet effective framework that freezes the internal network structures. This framework tailors the model to specific data with minimal additional parameters and is easily applicable to various network components.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional transfer learning typically reuses large pre-trained networks by freezing some of their weights and adding task-specific layers. While this approach is computationally efficient, it limits the model's ability to adapt to domain-specific features and can still lead to overfitting with very limited data. To address these limitations, we propose Structured Output Regularization (SOR), a simple yet effective framework that freezes the internal network structures (e.g., convolutional filters) while using a combination of group lasso and $L_1$ penalties. This framework tailors the model to specific data with minimal additional parameters and is easily applicable to various network components, such as convolutional filters or various blocks in neural networks, enabling broad applicability to transfer learning tasks. We evaluate SOR on three few-shot medical imaging classification tasks and achieve competitive results with DenseNet121 and EfficientNetB4 bases compared to established benchmarks.
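The abstract describes the mechanism (frozen internal structures plus a group lasso and $L_1$ penalty on a small set of added parameters) but not an implementation. Below is a minimal, hypothetical PyTorch sketch of that idea, assuming the extra parameters are per-filter output scales on a frozen convolution and that the group lasso is taken over each layer's scale vector; the class name `SORConv2d`, the grouping choice, and the hyperparameters are illustrative, not the authors' code.

```python
# Hypothetical sketch of the SOR idea from the abstract (not the authors' implementation):
# freeze pre-trained convolutional filters, learn one output scale per filter,
# and regularize those scales with a group lasso + L1 penalty.
import torch
import torch.nn as nn

class SORConv2d(nn.Module):
    """Wraps a frozen, pre-trained Conv2d and learns one output scale per filter."""
    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        for p in self.conv.parameters():   # freeze the internal structure (the filters)
            p.requires_grad = False
        # minimal additional parameters: one scaling weight per output channel
        self.scale = nn.Parameter(torch.ones(conv.out_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(x) * self.scale.view(1, -1, 1, 1)

def sor_penalty(scale: torch.Tensor, lam_group: float, lam_l1: float) -> torch.Tensor:
    # group lasso over the layer's whole scale vector (can switch off the block)
    # plus an elementwise L1 term (can switch off individual filters)
    return lam_group * torch.linalg.vector_norm(scale, ord=2) + lam_l1 * scale.abs().sum()

# Usage sketch for one training step, assuming `model` contains SORConv2d layers:
# loss = criterion(model(x), y) + sum(
#     sor_penalty(m.scale, lam_group=1e-3, lam_l1=1e-4)
#     for m in model.modules() if isinstance(m, SORConv2d))
# loss.backward(); optimizer.step()
```

Only the `scale` vectors receive gradients here, which matches the abstract's claim of adapting to new data with minimal additional parameters; the same wrapping pattern could in principle be applied to other components (e.g., whole blocks) by changing what is scaled and grouped.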
Related papers
- TuckA: Hierarchical Compact Tensor Experts for Efficient Fine-Tuning [83.93651411533533]
We introduce Tucker Adaptation (TuckA), a method with four key properties. We develop an efficient batch-level routing mechanism, which reduces the router's parameter size by a factor of $L$. Experiments on benchmarks in natural language understanding, image classification, and mathematical reasoning speak to the efficacy of TuckA.
arXiv Detail & Related papers (2025-11-10T09:03:16Z) - Deep Hierarchical Learning with Nested Subspace Networks [53.71337604556311]
We propose Nested Subspace Networks (NSNs) for large neural networks. NSNs enable a single model to be dynamically and granularly adjusted across a continuous spectrum of compute budgets. We show that NSNs can be surgically applied to pre-trained LLMs and unlock a smooth and predictable compute-performance frontier.
arXiv Detail & Related papers (2025-09-22T15:13:14Z) - Residual Kolmogorov-Arnold Network for Enhanced Deep Learning [0.9399249626168465]
Deep convolutional neural networks (CNNs) can be difficult to optimize and costly to train due to the hundreds of layers within the network depth. We introduce a "plug-in" module, called Residual Kolmogorov-Arnold Network (RKAN). RKAN offers consistent improvements over baseline models in different vision tasks.
arXiv Detail & Related papers (2024-10-07T21:12:32Z) - RTF-Q: Efficient Unsupervised Domain Adaptation with Retraining-free Quantization [14.447148108341688]
We propose efficient unsupervised domain adaptation with ReTraining-Free Quantization (RTF-Q).
Our approach uses low-precision quantization architectures with varying computational costs, adapting to devices with dynamic budgets.
We demonstrate that our network achieves competitive accuracy with state-of-the-art methods across three benchmarks.
arXiv Detail & Related papers (2024-08-11T11:53:29Z) - Conditional Information Gain Trellis [1.290382979353427]
Conditional computing processes an input using only part of the neural network's computational units.
We use a Trellis-based approach for generating specific execution paths in a deep convolutional neural network.
We show that our conditional execution mechanism achieves comparable or better model performance compared to unconditional baselines.
arXiv Detail & Related papers (2024-02-13T10:23:45Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to the magnitude scale.
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Deep Dependency Networks for Multi-Label Classification [24.24496964886951]
We show that the performance of previous approaches that combine Markov Random Fields with neural networks can be modestly improved.
We propose a new modeling framework called deep dependency networks, which augments a dependency network.
Despite its simplicity, jointly learning this new architecture yields significant improvements in performance.
arXiv Detail & Related papers (2023-02-01T17:52:40Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise, and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Unsupervised Domain-adaptive Hash for Networks [81.49184987430333]
Domain-adaptive hash learning has enjoyed considerable success in the computer vision community.
We develop an unsupervised domain-adaptive hash learning method for networks, dubbed UDAH.
arXiv Detail & Related papers (2021-08-20T12:09:38Z) - Smoother Network Tuning and Interpolation for Continuous-level Image Processing [7.730087303035803]
Filter Transition Network (FTN) is a structurally smoother module for continuous-level learning.
FTN generalizes well across various tasks and networks and causes fewer undesirable side effects.
For stable learning of FTN, we additionally propose a method to initialize non-linear neural network layers with identity mappings.
arXiv Detail & Related papers (2020-10-05T18:29:52Z) - Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)