Complementary Ensemble Learning
- URL: http://arxiv.org/abs/2111.08449v1
- Date: Tue, 9 Nov 2021 03:23:05 GMT
- Title: Complementary Ensemble Learning
- Authors: Hung Nguyen and Morris Chang
- Abstract summary: We derive a technique to improve performance of state-of-the-art deep learning models.
Specifically, we train auxiliary models which are able to complement state-of-the-art model uncertainty.
- Score: 1.90365714903665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To achieve high performance of a machine learning (ML) task, a deep
learning-based model must implicitly capture the entire distribution from data.
Thus, it requires a huge amount of training samples, and data are expected to
fully present the real distribution, especially for high dimensional data,
e.g., images, videos. In practice, however, data are usually collected with a
diversity of styles, and several of them have insufficient number of
representatives. This might lead to uncertainty in models' prediction, and
significantly reduce ML task performance.
In this paper, we provide a comprehensive study on this problem by looking at
model uncertainty. From this, we derive a simple but efficient technique to
improve performance of state-of-the-art deep learning models. Specifically, we
train auxiliary models which are able to complement state-of-the-art model
uncertainty. As a result, by assembling these models, we can significantly
improve the ML task performance for types of data mentioned earlier. While
slightly improving ML classification accuracy on benchmark datasets (e.g., 0.2%
on MNIST), our proposed method significantly improves on limited data (i.e.,
1.3% on Eardrum and 3.5% on ChestXray).
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing a small "forget set" training data on a pre-divertrained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding [9.112203072394648]
Power-law scaling indicates that large-scale training with uniform sampling is prohibitively slow.
Active learning methods aim to increase data efficiency by prioritizing learning on the most relevant examples.
arXiv Detail & Related papers (2023-12-08T19:26:13Z) - Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z) - Efficiently Robustify Pre-trained Models [18.392732966487582]
robustness of large scale models towards real-world settings is still a less-explored topic.
We first benchmark the performance of these models under different perturbations and datasets.
We then discuss on how complete model fine-tuning based existing robustification schemes might not be a scalable option given very large scale networks.
arXiv Detail & Related papers (2023-09-14T08:07:49Z) - Are Sample-Efficient NLP Models More Robust? [90.54786862811183]
We investigate the relationship between sample efficiency (amount of data needed to reach a given ID accuracy) and robustness (how models fare on OOD evaluation)
We find that higher sample efficiency is only correlated with better average OOD robustness on some modeling interventions and tasks, but not others.
These results suggest that general-purpose methods for improving sample efficiency are unlikely to yield universal OOD robustness improvements, since such improvements are highly dataset- and task-dependent.
arXiv Detail & Related papers (2022-10-12T17:54:59Z) - Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language
Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z) - Machine learning models for prediction of droplet collision outcomes [8.223798883838331]
Predicting the outcome of liquid droplet collisions is an extensively studied phenomenon.
The current physics based models for predicting the outcomes are poor.
In an ML setting this problem directly translates to a classification problem with 4 classes.
arXiv Detail & Related papers (2021-10-01T01:53:09Z) - Certifiable Machine Unlearning for Linear Models [1.484852576248587]
Machine unlearning is the task of updating machine learning (ML) models after a subset of the training data they were trained on is deleted.
We present an experimental study of the three state-of-the-art approximate unlearning methods for linear models.
arXiv Detail & Related papers (2021-06-29T05:05:58Z) - Injective Domain Knowledge in Neural Networks for Transprecision
Computing [17.300144121921882]
This paper studies the improvements that can be obtained by integrating prior knowledge when dealing with a non-trivial learning task.
The results clearly show that ML models exploiting problem-specific information outperform the purely data-driven ones, with an average accuracy improvement around 38%.
arXiv Detail & Related papers (2020-02-24T12:58:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.