Complementary Ensemble Learning
- URL: http://arxiv.org/abs/2111.08449v1
- Date: Tue, 9 Nov 2021 03:23:05 GMT
- Title: Complementary Ensemble Learning
- Authors: Hung Nguyen and Morris Chang
- Abstract summary: We derive a technique to improve performance of state-of-the-art deep learning models.
Specifically, we train auxiliary models which are able to complement state-of-the-art model uncertainty.
- Score: 1.90365714903665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To achieve high performance on a machine learning (ML) task, a deep
learning-based model must implicitly capture the entire distribution of the data.
Thus, it requires a huge number of training samples, and the data are expected to
fully represent the real distribution, especially for high-dimensional data,
e.g., images and videos. In practice, however, data are usually collected in a
diversity of styles, and several of these styles have an insufficient number of
representatives. This can lead to uncertainty in a model's predictions and
significantly reduce ML task performance.
In this paper, we provide a comprehensive study on this problem by looking at
model uncertainty. From this, we derive a simple but efficient technique to
improve performance of state-of-the-art deep learning models. Specifically, we
train auxiliary models which are able to complement state-of-the-art model
uncertainty. As a result, by assembling these models, we can significantly
improve ML task performance for the types of data mentioned earlier. While it
only slightly improves classification accuracy on benchmark datasets (e.g., 0.2%
on MNIST), our proposed method yields considerably larger gains on limited data
(i.e., 1.3% on Eardrum and 3.5% on ChestXray).
Related papers
- Analysis of Zero Day Attack Detection Using MLP and XAI [0.0]
This paper analyzes Machine Learning (ML)- and Deep Learning (DL)-based approaches to creating Intrusion Detection Systems (IDS).
The focus is on the KDD99 dataset, which is the most extensively researched dataset for detecting zero-day attacks.
We evaluate the performance of four multilayer perceptron (MLP) models trained on the KDD99 dataset: baseline ML models, weighted ML models, truncated ML models, and weighted truncated ML models.
arXiv Detail & Related papers (2025-01-28T02:20:34Z)
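For intuition, a rough sketch of the kind of baseline and class-weighted MLP intrusion detectors such a study compares. The data here are a synthetic stand-in for KDD99, and the oversampling-based "weighted" variant and layer sizes are assumptions, not the paper's configuration.

```python
# Illustrative sketch of a baseline MLP intrusion detector on synthetic,
# KDD99-style tabular data (41 features, imbalanced attack vs. normal traffic).
# The oversampling-based "weighted" variant below is a rough stand-in, since
# scikit-learn's MLPClassifier has no class_weight option.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=5000, n_features=41, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y,
                                           random_state=0)

# Baseline MLP model.
baseline = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300, random_state=0)
baseline.fit(X_tr, y_tr)

# "Weighted" variant: oversample the rare attack class before training.
attack_idx = np.where(y_tr == 1)[0]
over_idx = np.concatenate([np.arange(len(y_tr)), np.repeat(attack_idx, 4)])
weighted = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300, random_state=0)
weighted.fit(X_tr[over_idx], y_tr[over_idx])

print(classification_report(y_te, baseline.predict(X_te), digits=3))
print(classification_report(y_te, weighted.predict(X_te), digits=3))
```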
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that existing machine unlearning techniques do not hold up in challenging evaluation settings.
arXiv Detail & Related papers (2024-10-30T17:20:10Z)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
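A toy sketch of the failure-inducing idea: propose candidate inputs, keep the ones the current model gets wrong, and add them to the training set. This is not ReverseGen's actual LLM-based pipeline; the proposer, data, and models below are hypothetical stand-ins.

```python
# Toy sketch of failure-inducing data synthesis (ReverseGen itself uses an LLM
# proposer; here the proposer, data, and models are simple stand-ins).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
target_model = LogisticRegression().fit(X, y)

def propose_candidates(X, y, n=2000):
    """Hypothetical proposer: perturb existing inputs to explore nearby space."""
    idx = rng.integers(0, len(X), size=n)
    noisy = X[idx] + rng.normal(scale=0.3, size=(n, X.shape[1]))
    return noisy, y[idx]

# Keep only the candidates the current model gets wrong ("failures").
cand_X, cand_y = propose_candidates(X, y)
fails = target_model.predict(cand_X) != cand_y
print("failure-inducing samples found:", int(fails.sum()))

# Retrain on the original data plus the failure-inducing samples.
X_aug = np.vstack([X, cand_X[fails]])
y_aug = np.concatenate([y, cand_y[fails]])
improved = LogisticRegression().fit(X_aug, y_aug)
print("accuracy before:", round(target_model.score(X, y), 3))
print("accuracy after :", round(improved.score(X, y), 3))
```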
- Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Learn-Focus-Review (LFR) is a dynamic training approach that adapts to the model's learning progress.
LFR tracks the model's learning performance across data blocks (sequences of tokens) and prioritizes revisiting challenging regions of the dataset.
Compared to baseline models trained on the full datasets, LFR consistently achieved lower perplexity and higher accuracy.
arXiv Detail & Related papers (2024-09-10T00:59:18Z)
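A minimal sketch of a Learn-Focus-Review-style loop as summarized above: record the loss of each data block during a first pass, then revisit the highest-loss blocks. The block size, revisit fraction, and linear model are illustrative assumptions, not the paper's settings.

```python
# Sketch of a Learn-Focus-Review-style loop: record each data block's loss
# during a first pass, then revisit the highest-loss ("challenging") blocks.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=6000, n_features=30, n_informative=10,
                           random_state=0)
blocks = np.array_split(np.arange(len(X)), 60)        # "data blocks"
classes = np.unique(y)
model = SGDClassifier(loss="log_loss", random_state=0)

# Learn: one pass over all blocks, tracking the loss on each block.
block_loss = np.zeros(len(blocks))
for i, idx in enumerate(blocks):
    model.partial_fit(X[idx], y[idx], classes=classes)
    block_loss[i] = log_loss(y[idx], model.predict_proba(X[idx]), labels=classes)

# Focus & Review: revisit the hardest quarter of the blocks.
for i in np.argsort(block_loss)[-len(blocks) // 4:]:
    idx = blocks[i]
    model.partial_fit(X[idx], y[idx])

print("training accuracy after review:", round(model.score(X, y), 3))
```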
- Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding [9.112203072394648]
Power-law scaling indicates that large-scale training with uniform sampling is prohibitively slow.
Active learning methods aim to increase data efficiency by prioritizing learning on the most relevant examples.
arXiv Detail & Related papers (2023-12-08T19:26:13Z)
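For intuition only, a generic uncertainty-based active-learning loop that prioritizes the most informative examples; the paper's specific selection strategy is not reproduced here, and the data, labeling budget, and learner below are assumptions.

```python
# Generic uncertainty-based active-learning loop (illustrative only; the paper's
# selection strategy is not reproduced here).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=5000, n_features=20, n_informative=8,
                           random_state=0)

labeled = list(rng.choice(len(X), size=50, replace=False))   # small seed set
pool = [i for i in range(len(X)) if i not in set(labeled)]
model = LogisticRegression(max_iter=1000)

for rnd in range(5):
    model.fit(X[labeled], y[labeled])
    # Score the unlabeled pool by predictive entropy and "label" the top 100.
    proba = model.predict_proba(X[pool])
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    picked = [pool[i] for i in np.argsort(entropy)[-100:]]
    labeled += picked
    pool = [i for i in pool if i not in set(picked)]
    print(f"round {rnd}: labeled={len(labeled)}, accuracy={model.score(X, y):.3f}")
```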
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Efficiently Robustify Pre-trained Models [18.392732966487582]
The robustness of large-scale models in real-world settings is still a relatively unexplored topic.
We first benchmark the performance of these models under different perturbations and datasets.
We then discuss how existing robustification schemes based on full model fine-tuning might not be a scalable option for very large-scale networks.
arXiv Detail & Related papers (2023-09-14T08:07:49Z)
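A minimal sketch of the benchmarking step described above: evaluate a trained model under input perturbations of increasing severity. The model, dataset, and Gaussian-noise perturbations are generic placeholders, not the paper's benchmark suite.

```python
# Sketch of benchmarking a trained model under input perturbations of
# increasing severity (generic setup, not the paper's large-scale benchmark).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=400, random_state=0)
model.fit(X_tr, y_tr)

rng = np.random.default_rng(0)
for sigma in [0.0, 1.0, 2.0, 4.0]:            # perturbation severities
    X_noisy = X_te + rng.normal(scale=sigma, size=X_te.shape)
    print(f"gaussian noise sigma={sigma}: accuracy={model.score(X_noisy, y_te):.3f}")
```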
- Quality In / Quality Out: Data quality more relevant than model choice in anomaly detection with the UGR'16 [0.29998889086656577]
We show that relatively minor modifications to a benchmark dataset have a significantly larger impact on model performance than the specific ML technique considered.
We also show that the measured model performance is uncertain, as a result of labelling inaccuracies.
arXiv Detail & Related papers (2023-05-31T12:03:12Z)
- Are Sample-Efficient NLP Models More Robust? [90.54786862811183]
We investigate the relationship between sample efficiency (the amount of data needed to reach a given in-distribution (ID) accuracy) and robustness (how models fare on out-of-distribution (OOD) evaluation).
We find that higher sample efficiency is only correlated with better average OOD robustness on some modeling interventions and tasks, but not others.
These results suggest that general-purpose methods for improving sample efficiency are unlikely to yield universal OOD robustness improvements, since such improvements are highly dataset- and task-dependent.
arXiv Detail & Related papers (2022-10-12T17:54:59Z)
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
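A toy sketch of the contrast summarized above: instead of MAML-style bi-level optimization, take ordinary first-order gradient steps on a stream of sampled tasks. The toy regression tasks and tiny network below are illustrative stand-ins, not vision-language models.

```python
# Toy sketch of first-order multitask fine-tuning: no bi-level/second-order MAML
# machinery, just ordinary gradient steps on one sampled task at a time.
# The sine-regression tasks and tiny network are illustrative stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

def sample_task():
    """Hypothetical task family: y = a*sin(x) + b with task-specific (a, b)."""
    a, b = torch.rand(2) * 2
    x = torch.rand(64, 1) * 6 - 3
    return x, a * torch.sin(x) + b

for step in range(501):
    x, y = sample_task()                 # draw one task per step
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()                      # first-order gradients only
    opt.step()
    if step % 100 == 0:
        print(f"step {step}: task loss = {loss.item():.3f}")
```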
- Machine learning models for prediction of droplet collision outcomes [8.223798883838331]
Predicting the outcome of liquid droplet collisions is an extensively studied problem.
Current physics-based models for predicting these outcomes perform poorly.
In an ML setting, this directly translates to a four-class classification problem.
arXiv Detail & Related papers (2021-10-01T01:53:09Z)
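As a concrete illustration of the four-class formulation, a small classifier sketch on synthetic placeholder features; in practice the inputs would be physical quantities such as the Weber number and impact parameter, and the four outcome classes typically correspond to bouncing, coalescence, reflexive separation, and stretching separation.

```python
# Sketch of the four-class collision-outcome classifier on synthetic placeholder
# features (real inputs would be physical quantities such as the Weber number,
# impact parameter, and droplet size ratio).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1200, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=4, n_clusters_per_class=1,
                           random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print("5-fold cross-validated accuracy:", round(scores.mean(), 3))
```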
This list is automatically generated from the titles and abstracts of the papers on this site.