Related papers: Routing the Lottery: Adaptive Subnetworks for Heterogeneous Data

Routing the Lottery: Adaptive Subnetworks for Heterogeneous Data

URL: http://arxiv.org/abs/2601.22141v1
Date: Thu, 29 Jan 2026 18:56:41 GMT
Title: Routing the Lottery: Adaptive Subnetworks for Heterogeneous Data
Authors: Grzegorz Stefanski, Alberto Presta, Michal Byra,
Abstract summary: Lottery Ticket Hypothesis posits that large networks contain sparseworks, or winning tickets, that can be trained in isolation to match the performance of their dense counterparts.<n>We propose the Routing Lottery (RTL), an adaptive pruning framework that discovers multiple specialized pruningworks, called adaptive tickets, each tailored to a class, cluster semantic, or environmental condition.<n>Our results recast pruning as a mechanism for aligning model structure with data heterogeneity, paving the way toward more modular and context-aware deep learning.
Score: 2.5157688901171995
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In pruning, the Lottery Ticket Hypothesis posits that large networks contain sparse subnetworks, or winning tickets, that can be trained in isolation to match the performance of their dense counterparts. However, most existing approaches assume a single universal winning ticket shared across all inputs, ignoring the inherent heterogeneity of real-world data. In this work, we propose Routing the Lottery (RTL), an adaptive pruning framework that discovers multiple specialized subnetworks, called adaptive tickets, each tailored to a class, semantic cluster, or environmental condition. Across diverse datasets and tasks, RTL consistently outperforms single- and multi-model baselines in balanced accuracy and recall, while using up to 10 times fewer parameters than independent models and exhibiting semantically aligned. Furthermore, we identify subnetwork collapse, a performance drop under aggressive pruning, and introduce a subnetwork similarity score that enables label-free diagnosis of oversparsification. Overall, our results recast pruning as a mechanism for aligning model structure with data heterogeneity, paving the way toward more modular and context-aware deep learning.

Related papers

RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging [33.22889542330089]
Internal representations in large language models (LLMs) serve as reliable proxies of learned knowledge.<n>We propose RECALL, a representation-aware model merging framework for continual learning without access to historical data.
arXiv Detail & Related papers (2025-10-23T12:17:37Z)
Disentangled Lottery Tickets: Identifying and Assembling Core and Specialist Subnetworks [0.2730969268472861]
Lottery Ticket Hypothesis suggests that within large neural networks, there exist sparse, trainable "winning tickets"<n>This paper proposes the Disentangled Lottery Ticket (DiLT) Hypothesis, which posits that the intersection mask represents a universal, task-agnostic "core" subnetwork.<n>Experiments on ImageNet and fine-grained datasets such as Stanford Cars, using ResNet and Vision Transformer architectures, show that the "core" ticket provides superior transfer learning performance, the "specialist" tickets retain domain-specific features enabling modular assembly, and the full re-assembled "union" ticket outperforms COLT.
arXiv Detail & Related papers (2025-08-23T06:24:15Z)
Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection [89.42023974249122]
Adapt-$infty$ is a new multi-way and adaptive data selection approach for lifelong instruction tuning.<n>We construct pseudo-skill clusters by grouping gradient-based sample vectors.<n>We select the best-performing data selector for each skill cluster from a pool of selector experts.<n>This data selector samples a subset of the most important samples from each skill cluster for training.
arXiv Detail & Related papers (2024-10-14T15:48:09Z)
Flexible inference in heterogeneous and attributed multilayer networks [21.349513661012498]
We develop a probabilistic generative model to perform inference in multilayer networks with arbitrary types of information.<n>We demonstrate its ability to unveil a variety of patterns in a social support network among villagers in rural India.
arXiv Detail & Related papers (2024-05-31T15:21:59Z)
Dual Lottery Ticket Hypothesis [71.95937879869334]
Lottery Ticket Hypothesis (LTH) provides a novel view to investigate sparse network training and maintain its capacity. In this work, we regard the winning ticket from LTH as the subnetwork which is in trainable condition and its performance as our benchmark. We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our DLTH.
arXiv Detail & Related papers (2022-03-08T18:06:26Z)
Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation. In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which together with a discriminator, forms a GAN. Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
Learning Prototype-oriented Set Representations for Meta-Learning [85.19407183975802]
Learning from set-structured data is a fundamental problem that has recently attracted increasing attention. This paper provides a novel optimal transport based way to improve existing summary networks. We further instantiate it to the cases of few-shot classification and implicit meta generative modeling.
arXiv Detail & Related papers (2021-10-18T09:49:05Z)
Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks. The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning [52.83948119677194]
We propose a multi-semantic metapath (MSM) model for large scale heterogeneous representation learning. Specifically, we generate multi-semantic metapath-based random walks to construct the heterogeneous neighborhood to handle the unbalanced distributions. We conduct systematical evaluations for the proposed framework on two challenging datasets: Amazon and Alibaba.
arXiv Detail & Related papers (2020-07-19T22:50:20Z)
Ensembled sparse-input hierarchical networks for high-dimensional datasets [8.629912408966145]
We show that dense neural networks can be a practical data analysis tool in settings with small sample sizes. A proposed method appropriately prunes the network structure by tuning only two L1-penalty parameters. On a collection of real-world datasets with different sizes, EASIER-net selected network architectures in a data-adaptive manner and achieved higher prediction accuracy than off-the-shelf methods on average.
arXiv Detail & Related papers (2020-05-11T02:08:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.