CODA: Constructivism Learning for Instance-Dependent Dropout Architecture Construction
- URL: http://arxiv.org/abs/2106.08444v1
- Date: Tue, 15 Jun 2021 21:32:28 GMT
- Title: CODA: Constructivism Learning for Instance-Dependent Dropout Architecture Construction
- Authors: Xiaoli Li
- Abstract summary: We propose Constructivism learning for instance-dependent Dropout Architecture (CODA).
Based on the theory of constructivism learning, we design an improved dropout technique, Uniform Process Mixture Models.
We evaluate the proposed method on 5 real-world datasets and compare its performance with other state-of-the-art dropout techniques.
- Score: 3.2238887070637805
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Dropout is attracting intensive research interest in deep learning as an
efficient approach to prevent overfitting. Recently, incorporating structural
information when deciding which units to drop out has produced promising results
compared to methods that ignore structural information. However, a major issue
with existing work is that it fails to differentiate among instances when
constructing the dropout architecture. This can be a significant deficiency for
many applications. To address this issue, we propose Constructivism learning for
instance-dependent Dropout Architecture (CODA), which is inspired by the
philosophical theory of constructivism learning. Specifically, based on this
theory we design an improved dropout technique, Uniform Process Mixture Models,
using a Bayesian nonparametric method, the uniform process. We evaluate the
proposed method on 5 real-world datasets and compare its performance with other
state-of-the-art dropout techniques. The experimental results demonstrate the
effectiveness of CODA.
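For readers unfamiliar with the idea, here is a minimal sketch of instance-dependent dropout: a small gate network predicts per-unit keep probabilities from each input, so the realized dropout architecture differs across instances. This only illustrates the general concept; it does not implement CODA's Uniform Process Mixture Models, and the module name, gate design, and layer sizes are assumptions made for the example.
```python
import torch
import torch.nn as nn

class InstanceDependentDropout(nn.Module):
    """Dropout whose keep probabilities are predicted per input instance.

    Illustration of the general idea only; this is NOT CODA's Uniform Process
    Mixture Model (a Bayesian nonparametric construction).
    """
    def __init__(self, num_features: int):
        super().__init__()
        # Small gate network: maps each instance to per-unit keep probabilities.
        self.gate = nn.Linear(num_features, num_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return x  # no stochastic masking at evaluation time
        keep_prob = torch.sigmoid(self.gate(x))      # (batch, num_features)
        mask = torch.bernoulli(keep_prob)            # instance-specific binary mask
        return x * mask / keep_prob.clamp(min=1e-6)  # inverted-dropout style rescaling

# Usage: units of the hidden layer are dropped differently for each instance.
layer = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
dropout = InstanceDependentDropout(32)
out = dropout(layer(torch.randn(4, 16)))
```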
Related papers
- Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches [4.577842191730992]
We study ways toward robust OoD generalization for deep learning.
We first propose a novel and effective approach to disentangle the spurious correlation between features that are not essential for recognition.
We then study the problem of strengthening neural architecture search in OoD scenarios.
arXiv Detail & Related papers (2024-10-25T20:50:32Z)
- Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? [60.50127555651554]
Large Language Models (LLMs) show impressive results in numerous practical applications, but they lack essential safety features.
This makes them vulnerable to manipulations such as indirect prompt injections and generally unsuitable for safety-critical tasks.
We introduce a formal measure for instruction-data separation and an empirical variant that is calculable from a model's outputs.
arXiv Detail & Related papers (2024-03-11T15:48:56Z)
- Depth-agnostic Single Image Dehazing [12.51359372069387]
We propose a simple yet novel synthetic method to decouple the relationship between haze density and scene depth, by which a depth-agnostic dataset (DA-HAZE) is generated.
Experiments indicate that models trained on DA-HAZE achieve significant improvements on real-world benchmarks, with less discrepancy between SOTS and DA-SOTS.
We revisit U-Net-based architectures for dehazing, into which specially designed blocks are incorporated.
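As a rough illustration of what decoupling haze density from scene depth can mean in synthesis (not the paper's DA-HAZE pipeline), the sketch below applies the standard atmospheric scattering model but samples the transmission independently of any depth map; the function name and parameter ranges are assumptions.
```python
import numpy as np

def synthesize_haze(clear_img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Add synthetic haze with the atmospheric scattering model I = J*t + A*(1 - t).

    Illustration only: the transmission t is sampled independently of any depth
    map, rather than via t = exp(-beta * depth), which is one way to break the
    coupling between haze density and scene depth. Parameter ranges are guesses.
    """
    A = rng.uniform(0.7, 1.0)   # global atmospheric light
    t = rng.uniform(0.3, 0.9)   # depth-agnostic transmission, uniform over the image
    hazy = clear_img * t + A * (1.0 - t)
    return np.clip(hazy, 0.0, 1.0)

# Usage on a float RGB image with values in [0, 1]:
rng = np.random.default_rng(0)
hazy = synthesize_haze(np.random.rand(64, 64, 3), rng)
```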
arXiv Detail & Related papers (2024-01-14T06:33:11Z)
- Boosting the Cross-Architecture Generalization of Dataset Distillation through an Empirical Study [52.83643622795387]
The limited cross-architecture generalization of dataset distillation weakens its practical significance.
We propose a novel method, EvaLuation with distillation Feature (ELF).
Extensive experiments show that ELF effectively enhances the cross-architecture generalization of current dataset distillation (DD) methods.
arXiv Detail & Related papers (2023-12-09T15:41:42Z)
- Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search [55.41583104734349]
We propose to automatically remove structural redundancy in diffusion models with Diffusion Distillation-based Block-wise Neural Architecture Search (DiffNAS).
Given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture that achieves performance on par with or even better than the teacher.
Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss.
arXiv Detail & Related papers (2023-11-08T12:56:59Z)
- One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation [69.65734716679925]
Knowledge distillation has proven to be a highly effective approach for enhancing model performance through a teacher-student training scheme.
Most existing distillation methods are designed under the assumption that the teacher and student models belong to the same model family.
We propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures.
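For context on the teacher-student scheme the distillation entries build on, here is a minimal sketch of standard logit-based knowledge distillation with a temperature-scaled KL term; it does not reflect OFA-KD's specific mechanism for heterogeneous architectures, and the temperature and loss weight are assumed values.
```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T: float = 4.0, alpha: float = 0.5):
    """Standard logit-based knowledge distillation loss (Hinton-style).

    Illustration only: OFA-KD additionally bridges heterogeneous teacher/student
    architectures, which is not shown here. T and alpha are assumed values.
    """
    # Soft targets: temperature-scaled KL divergence between teacher and student.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage with dummy logits for a 10-class problem:
s = torch.randn(8, 10, requires_grad=True)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
kd_loss(s, t, y).backward()
```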
arXiv Detail & Related papers (2023-10-30T11:13:02Z)
- A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which consistently demonstrates robust performance with simple and cheap strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
- The Staged Knowledge Distillation in Video Classification: Harmonizing Student Progress by a Complementary Weakly Supervised Framework [21.494759678807686]
We propose a new weakly supervised learning framework for knowledge distillation in video classification.
Our approach leverages the concept of substage-based learning to distill knowledge based on the combination of student substages and the correlation of corresponding substages.
Our proposed substage-based distillation approach has the potential to inform future research on label-efficient learning for video data.
arXiv Detail & Related papers (2023-07-11T12:10:42Z)
- A Survey on Dropout Methods and Experimental Verification in Recommendation [34.557554809126415]
Overfitting is a common problem in machine learning, in which a model fits the training data too closely while performing poorly on the test data.
Among various methods of coping with overfitting, dropout is one of the representative approaches.
From randomly dropping neurons to dropping neural structures, dropout has achieved great success in improving model performance.
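As a concrete reference point for the "randomly dropping neurons" baseline the survey starts from, below is a minimal sketch of standard inverted dropout (rescaling surviving units so the network needs no change at test time); this is textbook dropout, not any specific method from the survey.
```python
import torch

def inverted_dropout(x: torch.Tensor, p: float = 0.5, training: bool = True) -> torch.Tensor:
    """Standard inverted dropout: zero each unit with probability p during training
    and rescale the survivors by 1/(1-p), so nothing changes at test time."""
    if not training or p == 0.0:
        return x
    keep = 1.0 - p
    mask = torch.bernoulli(torch.full_like(x, keep))  # 1 = keep, 0 = drop
    return x * mask / keep

# Usage: apply to hidden activations during training only.
h = torch.randn(4, 32)
h_train = inverted_dropout(h, p=0.3, training=True)
h_eval = inverted_dropout(h, p=0.3, training=False)  # returned unchanged
```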
arXiv Detail & Related papers (2022-04-05T07:08:21Z)
- Efficient Sub-structured Knowledge Distillation [52.5931565465661]
We propose an approach that is much simpler in its formulation and far more efficient for training than existing approaches.
We transfer the knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, instead of the whole output space.
arXiv Detail & Related papers (2022-03-09T15:56:49Z)
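A rough sketch of the local-matching idea described in the last entry, under the assumption that sub-structures are per-position label distributions in a sequence-labeling task; the actual paper targets structured prediction models more generally, and the tensor shapes here are illustrative.
```python
import torch
import torch.nn.functional as F

def substructure_kd_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> torch.Tensor:
    """Match teacher and student locally, per position, rather than over the
    exponentially large space of whole output sequences.

    Assumption for this sketch: logits have shape (batch, seq_len, num_labels),
    and each position's label distribution is treated as one sub-structure.
    """
    s_logp = F.log_softmax(student_logits, dim=-1)
    t_prob = F.softmax(teacher_logits, dim=-1)
    # KL divergence per position, averaged over positions and batch.
    kl = (t_prob * (t_prob.clamp_min(1e-12).log() - s_logp)).sum(dim=-1)
    return kl.mean()

# Usage with dummy sequence-labeling logits (batch=2, length=5, labels=7):
s = torch.randn(2, 5, 7, requires_grad=True)
t = torch.randn(2, 5, 7)
substructure_kd_loss(s, t).backward()
```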
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.