RLBoost: Boosting Supervised Models using Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2305.14115v1
- Date: Tue, 23 May 2023 14:38:33 GMT
- Title: RLBoost: Boosting Supervised Models using Deep Reinforcement Learning
- Authors: Eloy Anguiano Batanero, Ángela Fernández Pascual, Álvaro Barbero Jiménez
- Abstract summary: We present RLBoost, an algorithm that uses deep reinforcement learning strategies to evaluate a particular dataset and obtain a model capable of estimating the quality of any new data.
The results of the article show that this model obtains better and more stable results than other state-of-the-art algorithms such as LOO, DataShapley or DVRL.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data quality or data evaluation is sometimes a task as important as
collecting a large volume of data when it comes to generating accurate
artificial intelligence models. In fact, being able to evaluate the data can
lead to a larger database that is better suited to a particular problem because
we have the ability to filter out automatically obtained data of dubious
quality. In this paper we present RLBoost, an algorithm that uses deep
reinforcement learning strategies to evaluate a particular dataset and obtain a
model capable of estimating the quality of any new data in order to improve the
final predictive quality of a supervised learning model. This solution has the
advantage of being agnostic regarding the supervised model used and,
through multi-attention strategies, takes into account the data in its context
and not only individually. The results of the article show that this model
obtains better and more stable results than other state-of-the-art algorithms
such as LOO, DataShapley or DVRL.
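
To make the idea concrete, the sketch below shows one way an RL-based data valuation loop of this kind can be set up: a small policy network assigns a keep-probability to each training sample, a supervised model is fit on the sampled subset, and validation performance (against a moving baseline) serves as the REINFORCE reward, yielding per-sample quality scores. This is a minimal sketch under stated assumptions, not the authors' implementation; the names (ValueNet, rl_data_valuation) and the specific reward design are illustrative.

```python
# Hedged sketch of RL-based data valuation in the spirit of RLBoost.
# Names and the reward design are illustrative assumptions, not the paper's code.
import numpy as np
import torch
import torch.nn as nn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


class ValueNet(nn.Module):
    """Hypothetical data-value estimator: scores each (x, y) pair with a
    keep-probability, defining a Bernoulli selection policy."""

    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x, y):
        return torch.sigmoid(self.net(torch.cat([x, y.unsqueeze(1)], dim=1))).squeeze(1)


def rl_data_valuation(X_tr, y_tr, X_val, y_val, epochs=200, lr=1e-3):
    policy = ValueNet(X_tr.shape[1])
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    Xt = torch.tensor(X_tr, dtype=torch.float32)
    yt = torch.tensor(y_tr, dtype=torch.float32)
    baseline = 0.0  # moving-average reward baseline to reduce gradient variance

    for _ in range(epochs):
        probs = policy(Xt, yt)
        mask = torch.bernoulli(probs).detach()      # sample a training subset
        idx = mask.bool().numpy()
        if idx.sum() < 2 or len(np.unique(y_tr[idx])) < 2:
            continue                                # skip degenerate subsets
        clf = LogisticRegression(max_iter=200).fit(X_tr[idx], y_tr[idx])
        reward = clf.score(X_val, y_val)            # validation accuracy as reward
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # REINFORCE update: raise the log-probability of subsets that beat the baseline
        log_prob = (mask * torch.log(probs + 1e-8)
                    + (1 - mask) * torch.log(1 - probs + 1e-8)).sum()
        loss = -advantage * log_prob
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        return policy(Xt, yt).numpy()               # per-sample quality scores


if __name__ == "__main__":
    X, y = make_classification(n_samples=600, n_features=10, flip_y=0.2, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
    values = rl_data_valuation(X_tr, y_tr, X_val, y_val)
    print("ten lowest-valued training points:", np.argsort(values)[:10])
```

In this simplified setup each sample is scored independently; the paper's multi-attention strategy, by contrast, scores samples in the context of the rest of the batch.
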
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - CHG Shapley: Efficient Data Valuation and Selection towards Trustworthy Machine Learning [0.0]
We propose CHG Shapley, which approximates the utility of each data subset on model accuracy during a single model training.
We employ CHG Shapley for real-time data selection, demonstrating its effectiveness in identifying high-value and noisy data.
arXiv Detail & Related papers (2024-06-17T16:48:31Z) - Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z) - Improving Language Model Reasoning with Self-motivated Learning [60.779625789039486]
The Self-motivated Learning framework motivates the model itself to automatically generate rationales on existing datasets.
We train a reward model with the rank to evaluate the quality of rationales, and improve the performance of reasoning through reinforcement learning.
arXiv Detail & Related papers (2024-04-10T14:05:44Z) - LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface form cues to identify data that exemplifies the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - A Proposal to Study "Is High Quality Data All We Need?" [8.122270502556374]
We propose an empirical study that examines how to select a subset of and/or create high quality benchmark data.
We seek to answer if big datasets are truly needed to learn a task, and whether a smaller subset of high quality data can replace big datasets.
arXiv Detail & Related papers (2022-03-12T10:50:13Z) - Data Excellence for AI: Why Should You Care [9.421161233914251]
Benchmark datasets define the entire world within which models exist and operate.
If "data is the new oil," we are still missing work on the refineries by which the data itself could be optimized for more effective use.
arXiv Detail & Related papers (2021-11-19T19:06:03Z) - Exploring the Efficacy of Automatically Generated Counterfactuals for Sentiment Analysis [17.811597734603144]
We propose an approach to automatically generating counterfactual data for data augmentation and explanation.
A comprehensive evaluation on several different datasets and using a variety of state-of-the-art benchmarks demonstrates how our approach can achieve significant improvements in model performance.
arXiv Detail & Related papers (2021-06-29T10:27:01Z) - How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z) - DQI: Measuring Data Quality in NLP [22.54066527822898]
We introduce a generic formula for Data Quality Index (DQI) to help dataset creators create datasets free of unwanted biases.
We show that models trained on the renovated SNLI dataset generalize better to out of distribution tasks.
arXiv Detail & Related papers (2020-05-02T12:34:17Z)