Contextual Classification Using Self-Supervised Auxiliary Models for
Deep Neural Networks
- URL: http://arxiv.org/abs/2101.03057v1
- Date: Thu, 7 Jan 2021 18:41:16 GMT
- Title: Contextual Classification Using Self-Supervised Auxiliary Models for
Deep Neural Networks
- Authors: Sebastian Palacio, Philipp Engler, J\"orn Hees, Andreas Dengel
- Abstract summary: We introduce the notion of Self-Supervised Autogenous Learning (SSAL) models.
A SSAL objective is realized through one or more additional targets that are derived from the original supervised classification task.
We show that SSAL models consistently outperform the state-of-the-art while also providing structured predictions that are more interpretable.
- Score: 6.585049648605185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classification problems solved with deep neural networks (DNNs) typically
rely on a closed world paradigm, and optimize over a single objective (e.g.,
minimization of the cross-entropy loss). This setup dismisses all kinds of
supporting signals that can be used to reinforce the existence or absence of a
particular pattern. The increasing need for models that are interpretable by
design makes the inclusion of said contextual signals a crucial necessity. To
this end, we introduce the notion of Self-Supervised Autogenous Learning (SSAL)
models. A SSAL objective is realized through one or more additional targets
that are derived from the original supervised classification task, following
architectural principles found in multi-task learning. SSAL branches impose
low-level priors into the optimization process (e.g., grouping). The ability of
using SSAL branches during inference, allow models to converge faster, focusing
on a richer set of class-relevant features. We show that SSAL models
consistently outperform the state-of-the-art while also providing structured
predictions that are more interpretable.
Related papers
- LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging [80.17238673443127]
LiNeS is a post-training editing technique designed to preserve pre-trained generalization while enhancing fine-tuned task performance.
LiNeS demonstrates significant improvements in both single-task and multi-task settings across various benchmarks in vision and natural language processing.
arXiv Detail & Related papers (2024-10-22T16:26:05Z) - Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z) - LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - A Bayesian Unification of Self-Supervised Clustering and Energy-Based
Models [11.007541337967027]
We perform a Bayesian analysis of state-of-the-art self-supervised learning objectives.
We show that our objective function allows to outperform existing self-supervised learning strategies.
We also demonstrate that GEDI can be integrated into a neuro-symbolic framework.
arXiv Detail & Related papers (2023-12-30T04:46:16Z) - Mitigate Domain Shift by Primary-Auxiliary Objectives Association for
Generalizing Person ReID [39.98444065846305]
ReID models struggle in learning domain-invariant representation solely through training on an instance classification objective.
We introduce a method that guides model learning of the primary ReID instance classification objective by a concurrent auxiliary learning objective on weakly labeled pedestrian saliency detection.
Our model can be extended with the recent test-time diagram to form the PAOA+, which performs on-the-fly optimization against the auxiliary objective.
arXiv Detail & Related papers (2023-10-24T15:15:57Z) - Towards Disentangling Information Paths with Coded ResNeXt [11.884259630414515]
We take a novel approach to enhance the transparency of the function of the whole network.
We propose a neural network architecture for classification, in which the information that is relevant to each class flows through specific paths.
arXiv Detail & Related papers (2022-02-10T21:45:49Z) - Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z) - The Self-Simplifying Machine: Exploiting the Structure of Piecewise
Linear Neural Networks to Create Interpretable Models [0.0]
We introduce novel methodology toward simplification and increased interpretability of Piecewise Linear Neural Networks for classification tasks.
Our methods include the use of a trained, deep network to produce a well-performing, single-hidden-layer network without further training.
On these methods, we conduct preliminary studies of model performance, as well as a case study on Wells Fargo's Home Lending dataset.
arXiv Detail & Related papers (2020-12-02T16:02:14Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs)
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.