Complementary Ensemble Learning
- URL: http://arxiv.org/abs/2111.08449v1
- Date: Tue, 9 Nov 2021 03:23:05 GMT
- Title: Complementary Ensemble Learning
- Authors: Hung Nguyen and Morris Chang
- Abstract summary: We derive a technique to improve performance of state-of-the-art deep learning models.
Specifically, we train auxiliary models which are able to complement state-of-the-art model uncertainty.
- Score: 1.90365714903665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To achieve high performance on a machine learning (ML) task, a deep
learning-based model must implicitly capture the entire distribution of the data.
Thus, it requires a huge number of training samples, and the data are expected to
fully represent the real distribution, especially for high-dimensional data,
e.g., images and videos. In practice, however, data are usually collected in a
diversity of styles, and several of these styles have an insufficient number of
representatives. This can lead to uncertainty in a model's predictions and
significantly reduce ML task performance.
In this paper, we provide a comprehensive study on this problem by looking at
model uncertainty. From this, we derive a simple but efficient technique to
improve performance of state-of-the-art deep learning models. Specifically, we
train auxiliary models which are able to complement state-of-the-art model
uncertainty. As a result, by assembling these models, we can significantly
improve ML task performance for the types of data mentioned earlier. While it
only slightly improves classification accuracy on benchmark datasets (e.g., 0.2%
on MNIST), our proposed method yields considerably larger gains on limited data
(i.e., 1.3% on Eardrum and 3.5% on ChestXray).
Related papers
- Analysis of Zero Day Attack Detection Using MLP and XAI [0.0]
This paper analyzes Machine Learning (ML)- and Deep Learning (DL)-based approaches to creating Intrusion Detection Systems (IDS).
The focus is on the KDD99 dataset, which is the most extensively researched dataset for detecting zero-day attacks.
We evaluate the performance of four multilayer perceptron (MLP) models trained on the KDD99 dataset: baseline ML models, weighted ML models, truncated ML models, and weighted truncated ML models.
arXiv Detail & Related papers (2025-01-28T02:20:34Z)
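For intuition, a rough sketch of the kind of baseline and class-weighted MLP intrusion detectors such a study compares. The data here are a synthetic stand-in for KDD99, and the oversampling-based "weighted" variant and layer sizes are assumptions, not the paper's configuration.

```python
# Illustrative sketch of a baseline MLP intrusion detector on synthetic,
# KDD99-style tabular data (41 features, imbalanced attack vs. normal traffic).
# The oversampling-based "weighted" variant below is a rough stand-in, since
# scikit-learn's MLPClassifier has no class_weight option.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=5000, n_features=41, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y,
                                           random_state=0)

# Baseline MLP model.
baseline = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300, random_state=0)
baseline.fit(X_tr, y_tr)

# "Weighted" variant: oversample the rare attack class before training.
attack_idx = np.where(y_tr == 1)[0]
over_idx = np.concatenate([np.arange(len(y_tr)), np.repeat(attack_idx, 4)])
weighted = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300, random_state=0)
weighted.fit(X_tr[over_idx], y_tr[over_idx])

print(classification_report(y_te, baseline.predict(X_te), digits=3))
print(classification_report(y_te, weighted.predict(X_te), digits=3))
```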
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that existing machine unlearning techniques do not hold up in challenging evaluation settings.
arXiv Detail & Related papers (2024-10-30T17:20:10Z)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
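A toy sketch of the failure-inducing idea: propose candidate inputs, keep the ones the current model gets wrong, and add them to the training set. This is not ReverseGen's actual LLM-based pipeline; the proposer, data, and models below are hypothetical stand-ins.

```python
# Toy sketch of failure-inducing data synthesis (ReverseGen itself uses an LLM
# proposer; here the proposer, data, and models are simple stand-ins).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
target_model = LogisticRegression().fit(X, y)

def propose_candidates(X, y, n=2000):
    """Hypothetical proposer: perturb existing inputs to explore nearby space."""
    idx = rng.integers(0, len(X), size=n)
    noisy = X[idx] + rng.normal(scale=0.3, size=(n, X.shape[1]))
    return noisy, y[idx]

# Keep only the candidates the current model gets wrong ("failures").
cand_X, cand_y = propose_candidates(X, y)
fails = target_model.predict(cand_X) != cand_y
print("failure-inducing samples found:", int(fails.sum()))

# Retrain on the original data plus the failure-inducing samples.
X_aug = np.vstack([X, cand_X[fails]])
y_aug = np.concatenate([y, cand_y[fails]])
improved = LogisticRegression().fit(X_aug, y_aug)
print("accuracy before:", round(target_model.score(X, y), 3))
print("accuracy after :", round(improved.score(X, y), 3))
```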
- Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Learn-Focus-Review (LFR) is a dynamic training approach that adapts to the model's learning progress.
LFR tracks the model's learning performance across data blocks (sequences of tokens) and prioritizes revisiting challenging regions of the dataset.
Compared to baseline models trained on the full datasets, LFR consistently achieved lower perplexity and higher accuracy.
arXiv Detail & Related papers (2024-09-10T00:59:18Z)
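A minimal sketch of a Learn-Focus-Review-style loop as summarized above: record the loss of each data block during a first pass, then revisit the highest-loss blocks. The block size, revisit fraction, and linear model are illustrative assumptions, not the paper's settings.

```python
# Sketch of a Learn-Focus-Review-style loop: record each data block's loss
# during a first pass, then revisit the highest-loss ("challenging") blocks.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=6000, n_features=30, n_informative=10,
                           random_state=0)
blocks = np.array_split(np.arange(len(X)), 60)        # "data blocks"
classes = np.unique(y)
model = SGDClassifier(loss="log_loss", random_state=0)

# Learn: one pass over all blocks, tracking the loss on each block.
block_loss = np.zeros(len(blocks))
for i, idx in enumerate(blocks):
    model.partial_fit(X[idx], y[idx], classes=classes)
    block_loss[i] = log_loss(y[idx], model.predict_proba(X[idx]), labels=classes)

# Focus & Review: revisit the hardest quarter of the blocks.
for i in np.argsort(block_loss)[-len(blocks) // 4:]:
    idx = blocks[i]
    model.partial_fit(X[idx], y[idx])

print("training accuracy after review:", round(model.score(X, y), 3))
```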
- Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding [9.112203072394648]
Power-law scaling indicates that large-scale training with uniform sampling is prohibitively slow.
Active learning methods aim to increase data efficiency by prioritizing learning on the most relevant examples.
arXiv Detail & Related papers (2023-12-08T19:26:13Z)
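For intuition only, a generic uncertainty-based active-learning loop that prioritizes the most informative examples; the paper's specific selection strategy is not reproduced here, and the data, labeling budget, and learner below are assumptions.

```python
# Generic uncertainty-based active-learning loop (illustrative only; the paper's
# selection strategy is not reproduced here).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=5000, n_features=20, n_informative=8,
                           random_state=0)

labeled = list(rng.choice(len(X), size=50, replace=False))   # small seed set
pool = [i for i in range(len(X)) if i not in set(labeled)]
model = LogisticRegression(max_iter=1000)

for rnd in range(5):
    model.fit(X[labeled], y[labeled])
    # Score the unlabeled pool by predictive entropy and "label" the top 100.
    proba = model.predict_proba(X[pool])
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    picked = [pool[i] for i in np.argsort(entropy)[-100:]]
    labeled += picked
    pool = [i for i in pool if i not in set(picked)]
    print(f"round {rnd}: labeled={len(labeled)}, accuracy={model.score(X, y):.3f}")
```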
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Efficiently Robustify Pre-trained Models [18.392732966487582]
The robustness of large-scale models in real-world settings is still a relatively unexplored topic.
We first benchmark the performance of these models under different perturbations and datasets.
We then discuss how existing robustification schemes based on full model fine-tuning might not be a scalable option for very large-scale networks.
arXiv Detail & Related papers (2023-09-14T08:07:49Z)
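A minimal sketch of the benchmarking step described above: evaluate a trained model under input perturbations of increasing severity. The model, dataset, and Gaussian-noise perturbations are generic placeholders, not the paper's benchmark suite.

```python
# Sketch of benchmarking a trained model under input perturbations of
# increasing severity (generic setup, not the paper's large-scale benchmark).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=400, random_state=0)
model.fit(X_tr, y_tr)

rng = np.random.default_rng(0)
for sigma in [0.0, 1.0, 2.0, 4.0]:            # perturbation severities
    X_noisy = X_te + rng.normal(scale=sigma, size=X_te.shape)
    print(f"gaussian noise sigma={sigma}: accuracy={model.score(X_noisy, y_te):.3f}")
```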
- Quality In / Quality Out: Data quality more relevant than model choice in anomaly detection with the UGR'16 [0.29998889086656577]
We show that relatively minor modifications to a benchmark dataset have a significantly larger impact on model performance than the specific ML technique considered.
We also show that the measured model performance is uncertain, as a result of labelling inaccuracies.
arXiv Detail & Related papers (2023-05-31T12:03:12Z)
- Are Sample-Efficient NLP Models More Robust? [90.54786862811183]
We investigate the relationship between sample efficiency (the amount of data needed to reach a given in-distribution (ID) accuracy) and robustness (how models fare on out-of-distribution (OOD) evaluation).
We find that higher sample efficiency is only correlated with better average OOD robustness on some modeling interventions and tasks, but not others.
These results suggest that general-purpose methods for improving sample efficiency are unlikely to yield universal OOD robustness improvements, since such improvements are highly dataset- and task-dependent.
arXiv Detail & Related papers (2022-10-12T17:54:59Z)
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
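A toy sketch of the contrast summarized above: instead of MAML-style bi-level optimization, take ordinary first-order gradient steps on a stream of sampled tasks. The toy regression tasks and tiny network below are illustrative stand-ins, not vision-language models.

```python
# Toy sketch of first-order multitask fine-tuning: no bi-level/second-order MAML
# machinery, just ordinary gradient steps on one sampled task at a time.
# The sine-regression tasks and tiny network are illustrative stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

def sample_task():
    """Hypothetical task family: y = a*sin(x) + b with task-specific (a, b)."""
    a, b = torch.rand(2) * 2
    x = torch.rand(64, 1) * 6 - 3
    return x, a * torch.sin(x) + b

for step in range(501):
    x, y = sample_task()                 # draw one task per step
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()                      # first-order gradients only
    opt.step()
    if step % 100 == 0:
        print(f"step {step}: task loss = {loss.item():.3f}")
```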
- Machine learning models for prediction of droplet collision outcomes [8.223798883838331]
Predicting the outcome of liquid droplet collisions is an extensively studied problem.
Current physics-based models for predicting these outcomes perform poorly.
In an ML setting, this directly translates to a four-class classification problem.
arXiv Detail & Related papers (2021-10-01T01:53:09Z)
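As a concrete illustration of the four-class formulation, a small classifier sketch on synthetic placeholder features; in practice the inputs would be physical quantities such as the Weber number and impact parameter, and the four outcome classes typically correspond to bouncing, coalescence, reflexive separation, and stretching separation.

```python
# Sketch of the four-class collision-outcome classifier on synthetic placeholder
# features (real inputs would be physical quantities such as the Weber number,
# impact parameter, and droplet size ratio).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1200, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=4, n_clusters_per_class=1,
                           random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print("5-fold cross-validated accuracy:", round(scores.mean(), 3))
```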
This list is automatically generated from the titles and abstracts of the papers on this site.