Complementary Ensemble Learning
- URL: http://arxiv.org/abs/2111.08449v1
- Date: Tue, 9 Nov 2021 03:23:05 GMT
- Title: Complementary Ensemble Learning
- Authors: Hung Nguyen and Morris Chang
- Abstract summary: We derive a technique to improve the performance of state-of-the-art deep learning models.
Specifically, we train auxiliary models that complement the uncertainty of a state-of-the-art model.
- Score: 1.90365714903665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To achieve high performance on a machine learning (ML) task, a deep
learning-based model must implicitly capture the entire data distribution.
It therefore requires a huge number of training samples, and the data are
expected to fully represent the real distribution, especially for
high-dimensional data such as images and videos. In practice, however, data are
usually collected with a diversity of styles, and several styles have an
insufficient number of representatives. This can lead to uncertainty in a
model's predictions and significantly reduce ML task performance.
In this paper, we provide a comprehensive study on this problem by looking at
model uncertainty. From this, we derive a simple but efficient technique to
improve performance of state-of-the-art deep learning models. Specifically, we
train auxiliary models which are able to complement state-of-the-art model
uncertainty. As a result, by ensembling these models, we can significantly
improve ML task performance for the types of data mentioned earlier. While
slightly improving ML classification accuracy on benchmark datasets (e.g., 0.2%
on MNIST), our proposed method significantly improves on limited data (i.e.,
1.3% on Eardrum and 3.5% on ChestXray).
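The idea the abstract describes, training an auxiliary model that concentrates on the samples where the primary model is uncertain and then ensembling the two, can be sketched roughly as follows. This is a minimal illustration under assumed details (predictive entropy as the uncertainty measure, a simple sample-reweighting scheme, probability averaging), not the authors' implementation:

```python
# Hedged sketch of a complementary ensemble: train a primary model, find
# training samples where its predictions are uncertain (high predictive
# entropy), train an auxiliary model with extra weight on those samples,
# and average the two models' predicted probabilities at test time.
# The weighting scheme and threshold below are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

# 1. Train the primary model.
primary = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 2. Measure predictive uncertainty on the training set (entropy of p).
p = primary.predict_proba(X_train)
entropy = -(p * np.log(p + 1e-12)).sum(axis=1)

# 3. Train an auxiliary model emphasizing the uncertain samples.
weights = 1.0 + 4.0 * (entropy > np.median(entropy))
auxiliary = LogisticRegression(max_iter=1000).fit(
    X_train, y_train, sample_weight=weights)

# 4. Ensemble: average the predicted probabilities of the two models.
p_ens = 0.5 * (primary.predict_proba(X_test) + auxiliary.predict_proba(X_test))
ensemble_acc = (p_ens.argmax(axis=1) == y_test).mean()
print(f"primary={primary.score(X_test, y_test):.3f} ensemble={ensemble_acc:.3f}")
```

Averaging probabilities is one of several possible combination rules; the paper's reported gains depend on how well the auxiliary model actually covers the primary model's uncertain regions.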
Related papers
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
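The adaptive label-smoothing mechanism this summary describes can be sketched as below. The mapping from a sample's uncertainty score to its smoothing value is an illustrative assumption, not the paper's exact formula:

```python
# Per-sample label smoothing: each training sample gets its own smoothing
# value, set from that sample's uncertainty, so uncertain samples receive
# softer targets. Standard smoothing uses one global value instead.
import numpy as np

def smoothed_targets(labels, n_classes, uncertainty, max_smooth=0.2):
    """Build per-sample smoothed one-hot target distributions.

    uncertainty: per-sample scores in [0, 1]; a score of 1 gets the
    maximum smoothing, a score of 0 gets a plain one-hot target.
    """
    eps = max_smooth * np.asarray(uncertainty, dtype=float)[:, None]
    # Spread eps uniformly over all classes, then restore the true class.
    targets = np.zeros((len(labels), n_classes)) + eps / n_classes
    targets[np.arange(len(labels)), labels] += 1.0 - eps[:, 0]
    return targets

# A certain sample keeps a hard target; an uncertain one is softened.
t = smoothed_targets(np.array([0, 1]), n_classes=2, uncertainty=[0.0, 1.0])
print(t)  # rows sum to 1; second row is softer than the first
```

Training then minimizes cross-entropy against these per-sample targets instead of hard one-hot labels.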
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
- Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding [9.655434542591815]
Power-law scaling indicates that large-scale training with uniform sampling is prohibitively slow.
Active learning methods aim to increase data efficiency by prioritizing learning on the most relevant examples.
arXiv Detail & Related papers (2023-12-08T19:26:13Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Efficiently Robustify Pre-trained Models [18.392732966487582]
The robustness of large-scale models in real-world settings is still a less-explored topic.
We first benchmark the performance of these models under different perturbations and datasets.
We then discuss how existing robustification schemes based on full model fine-tuning might not be a scalable option for very large networks.
arXiv Detail & Related papers (2023-09-14T08:07:49Z)
- Are Sample-Efficient NLP Models More Robust? [90.54786862811183]
We investigate the relationship between sample efficiency (the amount of data needed to reach a given ID accuracy) and robustness (how models fare on OOD evaluation).
We find that higher sample efficiency is only correlated with better average OOD robustness on some modeling interventions and tasks, but not others.
These results suggest that general-purpose methods for improving sample efficiency are unlikely to yield universal OOD robustness improvements, since such improvements are highly dataset- and task-dependent.
arXiv Detail & Related papers (2022-10-12T17:54:59Z)
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
- Sparse MoEs meet Efficient Ensembles [49.313497379189315]
We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixtures of experts (sparse MoEs).
We present Efficient Ensemble of Experts (E$3$), a scalable and simple ensemble of sparse MoEs that takes the best of both classes of models, while using up to 45% fewer FLOPs than a deep ensemble.
arXiv Detail & Related papers (2021-10-07T11:58:35Z)
- Machine learning models for prediction of droplet collision outcomes [8.223798883838331]
Predicting the outcome of liquid droplet collisions is an extensively studied phenomenon.
Current physics-based models for predicting these outcomes perform poorly.
In an ML setting, this problem translates directly to a classification problem with four classes.
arXiv Detail & Related papers (2021-10-01T01:53:09Z)
- SSSE: Efficiently Erasing Samples from Trained Machine Learning Models [103.43466657962242]
We propose SSSE, an efficient and effective algorithm for sample erasure.
In certain cases SSSE can erase samples almost as well as the optimal, yet impractical, gold standard of training a new model from scratch with only the permitted data.
arXiv Detail & Related papers (2021-07-08T14:17:24Z)
- Certifiable Machine Unlearning for Linear Models [1.484852576248587]
Machine unlearning is the task of updating machine learning (ML) models after a subset of the training data they were trained on is deleted.
We present an experimental study of the three state-of-the-art approximate unlearning methods for linear models.
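The reason linear models are a tractable setting for unlearning can be shown with a small sketch. This illustrates exact deletion via sufficient statistics, not any of the three approximate methods the paper studies: a ridge-regression fit depends on the data only through X^T X and X^T y, so deleting samples means subtracting their contribution and re-solving.

```python
# Exact unlearning for ridge regression: subtract the deleted samples'
# contribution from the sufficient statistics X^T X and X^T y, then
# re-solve. The result matches retraining from scratch on the kept data.
import numpy as np

def fit_ridge(XtX, Xty, lam=1e-2):
    d = XtX.shape[0]
    return np.linalg.solve(XtX + lam * np.eye(d), Xty)

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = X @ rng.randn(5) + 0.1 * rng.randn(100)

XtX, Xty = X.T @ X, X.T @ y
w_full = fit_ridge(XtX, Xty)

# "Unlearn" the first 10 samples by removing their statistics.
Xd, yd = X[:10], y[:10]
w_unlearned = fit_ridge(XtX - Xd.T @ Xd, Xty - Xd.T @ yd)

# Retraining from scratch on only the permitted data gives the same weights.
w_retrained = fit_ridge(X[10:].T @ X[10:], X[10:].T @ y[10:])
assert np.allclose(w_unlearned, w_retrained)
```

Approximate methods trade this exactness for lower cost or broader model classes, which is what the certifiability analysis in the paper is about.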
arXiv Detail & Related papers (2021-06-29T05:05:58Z)
- Injective Domain Knowledge in Neural Networks for Transprecision Computing [17.300144121921882]
This paper studies the improvements that can be obtained by integrating prior knowledge when dealing with a non-trivial learning task.
The results clearly show that ML models exploiting problem-specific information outperform purely data-driven ones, with an average accuracy improvement of around 38%.
arXiv Detail & Related papers (2020-02-24T12:58:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.