Stochastic Deep Networks with Linear Competing Units for Model-Agnostic
Meta-Learning
- URL: http://arxiv.org/abs/2208.01573v1
- Date: Tue, 2 Aug 2022 16:19:54 GMT
- Title: Stochastic Deep Networks with Linear Competing Units for Model-Agnostic
Meta-Learning
- Authors: Konstantinos Kalais, Sotirios Chatzis
- Abstract summary: This work addresses meta-learning (ML) by considering deep networks with local winner-takes-all (LWTA) activations.
This type of network unit results in sparse representations from each model layer, as the units are organized into blocks where only one unit generates a non-zero output.
Our approach produces state-of-the-art predictive accuracy on few-shot image classification and regression experiments, as well as reduced predictive error in an active learning setting.
- Score: 4.97235247328373
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work addresses meta-learning (ML) by considering deep networks with
stochastic local winner-takes-all (LWTA) activations. This type of network
unit results in sparse representations from each model layer, as the units are
organized into blocks where only one unit generates a non-zero output. The main
operating principle of the introduced units relies on stochastic arguments, as
the network performs posterior sampling over competing units to select the
winner. Therefore, the proposed networks are explicitly designed to extract
input data representations of sparse stochastic nature, as opposed to the
currently standard deterministic representation paradigm. Our approach produces
state-of-the-art predictive accuracy on few-shot image classification and
regression experiments, as well as reduced predictive error in an active
learning setting; these improvements come with an immensely reduced
computational cost.
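To make the blockwise competition concrete, below is a minimal sketch of a stochastic LWTA activation. The grouping of a layer's linear units into blocks and the sampling of a single winner per block follow the abstract; the Gumbel-Softmax relaxation used for the sampling step, together with all names and hyperparameters, are illustrative assumptions rather than the authors' exact posterior-sampling scheme.

```python
import torch
import torch.nn.functional as F

def stochastic_lwta(pre_activations: torch.Tensor,
                    num_blocks: int, block_size: int,
                    tau: float = 0.67) -> torch.Tensor:
    """Sketch of a stochastic local winner-takes-all (LWTA) activation.

    The layer's linear units are organized into blocks; within each block a
    single winner is sampled and the losing units output zero, yielding a
    sparse, stochastic representation of the input.
    """
    batch = pre_activations.shape[0]
    # Group the units into competing blocks: (batch, num_blocks, block_size)
    h = pre_activations.view(batch, num_blocks, block_size)
    # Sample a one-hot winner per block (Gumbel-Softmax stands in here for
    # posterior sampling over the competing units)
    winner = F.gumbel_softmax(h, tau=tau, hard=True, dim=-1)
    # Only the sampled winner passes its linear activation; all losers emit zero
    return (winner * h).reshape(batch, num_blocks * block_size)
```

A full layer would then be an ordinary affine transform followed by this activation, applied at each depth of the meta-learned network.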
Related papers
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
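The weight sharing described in this entry can be sketched as follows: one pre-trained projection is shared by every ensemble member, and each member owns only a small low-rank update to it. The class and attribute names below are illustrative, not the paper's API.

```python
import torch
import torch.nn as nn

class MemberAdaptedProjection(nn.Module):
    """Sketch of a single attention projection shared across ensemble members.

    The pre-trained weight stays frozen and shared; member m contributes only
    the low-rank update B[m] @ A[m], so the per-member cost scales with the
    rank rather than with the full layer size.
    """
    def __init__(self, d_model: int, num_members: int, rank: int = 4):
        super().__init__()
        self.shared = nn.Linear(d_model, d_model)   # pre-trained, shared backbone weight
        for p in self.shared.parameters():
            p.requires_grad_(False)                 # ensemble members never modify it
        self.A = nn.Parameter(0.01 * torch.randn(num_members, rank, d_model))
        self.B = nn.Parameter(torch.zeros(num_members, d_model, rank))

    def forward(self, x: torch.Tensor, member: int) -> torch.Tensor:
        delta = self.B[member] @ self.A[member]     # (d_model, d_model) low-rank update
        return self.shared(x) + x @ delta.T
```

Running the same input through every member index and averaging the resulting predictions gives the ensemble estimate, with the members' disagreement serving as an uncertainty signal.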
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - Cross-Inferential Networks for Source-free Unsupervised Domain
Adaptation [17.718392065388503]
We propose a new method called cross-inferential networks (CIN).
Our main idea is that, when we adapt the network model to predict the sample labels from encoded features, we use these prediction results to construct new training samples with derived labels.
Our experimental results on benchmark datasets demonstrate that our proposed CIN approach can significantly improve the performance of source-free UDA.
arXiv Detail & Related papers (2023-06-29T14:04:24Z) - Fitting Low-rank Models on Egocentrically Sampled Partial Networks [4.111899441919165]
We propose an approach to fit general low-rank models for egocentrically sampled networks.
This method offers the first theoretical guarantee for egocentric partial network estimation.
We evaluate the technique on several synthetic and real-world networks and show that it delivers competitive performance in link prediction tasks.
arXiv Detail & Related papers (2023-03-09T03:20:44Z) - VCNet: A self-explaining model for realistic counterfactual generation [52.77024349608834]
Counterfactual explanation is a class of methods to make local explanations of machine learning decisions.
We present VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator.
We show that VCNet can both generate predictions and generate counterfactual explanations without having to solve another minimisation problem.
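Taken literally, combining "a predictor and a counterfactual generator" means one network that classifies an input and, in the same forward pass, decodes a counterfactual toward a chosen target class, with no per-instance optimisation. The sketch below illustrates that idea with a plain encoder/decoder; the generator VCNet actually uses (the name suggests a variational one) may differ, and all names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PredictorWithCounterfactualGenerator(nn.Module):
    """Sketch: a single model that predicts a label and decodes a counterfactual."""
    def __init__(self, d_in: int, d_latent: int, num_classes: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_latent), nn.ReLU())
        self.classifier = nn.Linear(d_latent, num_classes)   # the predictor head
        self.decoder = nn.Sequential(                         # the counterfactual generator
            nn.Linear(d_latent + num_classes, d_latent), nn.ReLU(),
            nn.Linear(d_latent, d_in),
        )

    def forward(self, x: torch.Tensor, target_class: torch.Tensor):
        z = self.encoder(x)
        logits = self.classifier(z)
        # Condition the decoder on the desired class and emit the counterfactual
        # in one pass -- no separate minimisation problem per input.
        y_cf = F.one_hot(target_class, logits.shape[-1]).float()
        x_cf = self.decoder(torch.cat([z, y_cf], dim=-1))
        return logits, x_cf
```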
arXiv Detail & Related papers (2022-12-21T08:45:32Z) - Competing Mutual Information Constraints with Stochastic
Competition-based Activations for Learning Diversified Representations [5.981521556433909]
This work aims to address the long-established problem of learning diversified representations.
We combine information-theoretic arguments with competition-based activations.
As we experimentally show, the resulting networks yield significant discriminative representation learning abilities.
arXiv Detail & Related papers (2022-01-10T20:12:13Z) - Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
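The "self-ensembling" pairing of teacher and student is commonly realised by making the teacher an exponential moving average (EMA) of the student's weights; that specific update rule is an assumption here, used only to illustrate how the teacher can be maintained without gradient updates of its own.

```python
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               momentum: float = 0.999) -> None:
    """Sketch of a self-ensembling teacher update.

    The student is trained with the segmentation and adversarial losses; the
    teacher is never trained directly, it only tracks a moving average of the
    student's weights (and is typically initialised as a copy of the student).
    """
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```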
arXiv Detail & Related papers (2021-12-15T09:50:25Z) - Local Competition and Stochasticity for Adversarial Robustness in Deep
Learning [8.023314613846418]
This work addresses adversarial robustness in deep learning by considering deep networks with local winner-takes-all activations.
This type of network unit results in sparse representations from each model layer, as the units are organized in blocks where only one unit generates a non-zero output.
arXiv Detail & Related papers (2021-01-04T17:40:52Z) - Regression Prior Networks [14.198991969107524]
Prior Networks are a newly developed class of models which yield interpretable measures of uncertainty.
They can also be used to distill an ensemble of models via Ensemble Distribution Distillation (EnD$^2$).
This work extends Prior Networks and EnD$^2$ to regression tasks by considering the Normal-Wishart distribution.
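For reference, the Normal-Wishart distribution is the conjugate prior of a multivariate Normal with unknown mean and precision, so a Prior Network that emits its parameters expresses a full distribution over plausible regression likelihoods. The standard density is shown below; the exact parameterisation adopted in the paper may differ.

```latex
% Normal-Wishart prior over the mean \mu and precision \Lambda of a Normal likelihood
p(\mu, \Lambda \mid m, \kappa, W, \nu)
  = \mathcal{N}\!\left(\mu \,\middle|\, m, (\kappa\Lambda)^{-1}\right)
    \mathcal{W}\!\left(\Lambda \,\middle|\, W, \nu\right),
  \qquad \kappa > 0,\ \nu > D - 1,
```

where D is the output dimensionality, m is the prior mean, W is the Wishart scale matrix, and κ and ν act as pseudo-counts controlling how concentrated the prior is around the mean and precision respectively.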
arXiv Detail & Related papers (2020-06-20T14:50:14Z) - Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
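A prototype-based contrastive objective can be pictured roughly as: cluster the current embeddings, then pull every embedding toward its assigned cluster prototype and away from the other prototypes. The loss below is a simplified stand-in for PCL's actual ProtoNCE objective; the clustering choice and names are assumptions.

```python
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(embeddings: torch.Tensor,
                               prototypes: torch.Tensor,
                               assignments: torch.Tensor,
                               temperature: float = 0.1) -> torch.Tensor:
    """Simplified prototype-contrastive loss.

    embeddings:  (N, d) L2-normalised sample embeddings
    prototypes:  (K, d) L2-normalised cluster centroids (e.g. from k-means)
    assignments: (N,)   cluster index assigned to each sample
    """
    # Similarity of every sample to every prototype, scaled by a temperature
    logits = embeddings @ prototypes.T / temperature
    # Cross-entropy toward the assigned prototype: attracts samples to their
    # cluster centre and repels them from all other prototypes.
    return F.cross_entropy(logits, assignments)
```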
arXiv Detail & Related papers (2020-05-11T09:53:36Z) - Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
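This strategy amounts to a learned performance predictor: encode each sampled sub-network as a graph over its operations, regress its measured accuracy with a graph convolutional network, and rank unseen candidates by the predicted score. The two-layer GCN below is a generic sketch of such a predictor, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GCNPerformancePredictor(nn.Module):
    """Sketch: predict a sub-network's accuracy from its architecture graph."""
    def __init__(self, d_node: int, d_hidden: int = 64):
        super().__init__()
        self.w1 = nn.Linear(d_node, d_hidden)
        self.w2 = nn.Linear(d_hidden, d_hidden)
        self.out = nn.Linear(d_hidden, 1)

    def forward(self, adj: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # adj:   (n, n) adjacency matrix of the architecture graph (ops as nodes)
        # feats: (n, d_node) one-hot or embedded operation features
        a_hat = adj + torch.eye(adj.shape[0])              # add self-loops
        a_norm = a_hat / a_hat.sum(dim=1, keepdim=True).clamp(min=1.0)
        h = torch.relu(self.w1(a_norm @ feats))            # first graph convolution
        h = torch.relu(self.w2(a_norm @ h))                # second graph convolution
        return self.out(h.mean(dim=0))                     # pool nodes -> predicted accuracy
```

Trained on (architecture, measured accuracy) pairs, such a predictor is then used to rank candidate sub-networks, which is where the rank correlation mentioned above is evaluated.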
arXiv Detail & Related papers (2020-04-17T19:12:39Z) - Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs.
In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations.
High-resolution paths in the network maintain the capability to recognize the "hard" samples.
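The routing reads as an early-exit scheme: a lightweight low-resolution branch answers first, and only inputs whose prediction is not confident enough continue to the expensive high-resolution path. The confidence threshold below is an assumed stand-in for RANet's actual adaptive-inference criterion, and the function names are illustrative.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def resolution_adaptive_predict(x: torch.Tensor,
                                low_res_net: torch.nn.Module,
                                high_res_net: torch.nn.Module,
                                threshold: float = 0.9) -> torch.Tensor:
    """Sketch of resolution-adaptive inference for a batch of one image.

    'Easy' inputs are resolved by the lightweight low-resolution branch;
    'hard' inputs fall through to the high-resolution branch.
    """
    x_low = F.interpolate(x, scale_factor=0.25, mode="bilinear", align_corners=False)
    probs = low_res_net(x_low).softmax(dim=-1)
    if probs.max() >= threshold:                 # confident enough: exit early
        return probs
    return high_res_net(x).softmax(dim=-1)       # hard sample: use full resolution
```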
arXiv Detail & Related papers (2020-03-16T16:54:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.