SelectIT: Selective Instruction Tuning for Large Language Models via
Uncertainty-Aware Self-Reflection
- URL: http://arxiv.org/abs/2402.16705v1
- Date: Mon, 26 Feb 2024 16:21:53 GMT
- Title: SelectIT: Selective Instruction Tuning for Large Language Models via
Uncertainty-Aware Self-Reflection
- Authors: Liangxin Liu, Xuebo Liu, Derek F. Wong, Dongfang Li, Ziyi Wang,
Baotian Hu, Min Zhang
- Abstract summary: In this work, we propose a novel approach, termed SelectIT, that capitalizes on the foundational capabilities of large language models (LLMs).
Specifically, we exploit the intrinsic uncertainty present in LLMs to more effectively select high-quality IT data, without the need for extra resources.
Empirical results demonstrate that IT using Selective Alpaca leads to substantial model ability enhancement.
- Score: 49.54657248221432
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instruction tuning (IT) is crucial to tailoring large language models (LLMs)
towards human-centric interactions. Recent advancements have shown that the
careful selection of a small, high-quality subset of IT data can significantly
enhance the performance of LLMs. Despite this, common approaches often rely on
additional models or data sets, which increases costs and limits widespread
adoption. In this work, we propose a novel approach, termed SelectIT, that
capitalizes on the foundational capabilities of the LLM itself. Specifically,
we exploit the intrinsic uncertainty present in LLMs to more effectively select
high-quality IT data, without the need for extra resources. Furthermore, we
introduce a novel IT dataset, the Selective Alpaca, created by applying
SelectIT to the Alpaca-GPT4 dataset. Empirical results demonstrate that IT
using Selective Alpaca leads to substantial model ability enhancement. The
robustness of SelectIT has also been corroborated in various foundation models
and domain-specific tasks. Our findings suggest that longer and more
computationally intensive IT data may serve as superior sources of IT, offering
valuable insights for future research in this area. Data, code, and scripts are
freely available at https://github.com/Blue-Raincoat/SelectIT.
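The abstract describes the approach only at a high level. Below is a minimal, illustrative sketch (not the authors' released implementation; see the linked repository for that) of how an LLM's own token probabilities over a quality-rating prompt could be turned into an uncertainty-aware score for an instruction-tuning sample. The model name, prompt wording, rating scale, and scoring rule are assumptions made for illustration only.

```python
# Minimal sketch (assumed, not the SelectIT implementation): ask the model to rate
# an instruction-response pair on a 1-5 scale and read the probability it assigns
# to each rating token. A peaked distribution over a high rating is used here as a
# simple proxy for sample quality; the model, prompt, and scoring rule are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumption: any causal LM could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

RATING_PROMPT = (
    "Rate the quality of the following instruction-response pair "
    "on a scale from 1 to 5. Answer with a single digit.\n\n"
    "Instruction: {instruction}\nResponse: {response}\nRating:"
)

@torch.no_grad()
def uncertainty_aware_score(instruction: str, response: str) -> float:
    prompt = RATING_PROMPT.format(instruction=instruction, response=response)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Next-token logits after "Rating:" -- the model's view of the sample's quality.
    logits = model(**inputs).logits[0, -1]
    rating_ids = [
        tokenizer.encode(str(k), add_special_tokens=False)[-1] for k in range(1, 6)
    ]
    probs = torch.softmax(logits[rating_ids], dim=-1)  # distribution over ratings 1..5
    ratings = torch.arange(1, 6, device=probs.device, dtype=probs.dtype)
    expected_rating = (probs * ratings).sum()  # probability-weighted rating
    confidence = probs.max()                   # how peaked (certain) the distribution is
    return float(expected_rating * confidence)
```

In use, one would score every sample in a source dataset such as Alpaca-GPT4 and keep only the top-scoring fraction as the selected IT subset.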
Related papers
- Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection [2.06242362470764]
We propose a novel framework to guide the rank selection in tensor network models for higher-order data analysis.
By utilising the intrinsic reasoning capabilities and domain knowledge of LLMs, our approach offers enhanced interpretability of the rank choices.
This work is placed at the intersection of large language models and higher-order data analysis.
arXiv Detail & Related papers (2024-10-14T17:09:14Z) - Exploring Large Language Models for Feature Selection: A Data-centric Perspective [17.99621520553622]
Large Language Models (LLMs) have influenced various domains, leveraging their exceptional few-shot and zero-shot learning capabilities.
We aim to explore and understand LLM-based feature selection methods from a data-centric perspective.
Our findings emphasize the effectiveness and robustness of text-based feature selection methods and showcase their potential using a real-world medical application.
arXiv Detail & Related papers (2024-08-21T22:35:19Z) - LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science.
Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z) - LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named
Entity Recognition [67.96794382040547]
LLM-DA is a novel data augmentation technique based on large language models (LLMs) for the few-shot NER task.
Our approach involves employing 14 contextual rewriting strategies, designing entity replacements of the same type, and incorporating noise injection to enhance robustness.
arXiv Detail & Related papers (2024-02-22T14:19:56Z) - LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface form cues to identify data that exemplifies the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z) - Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes [57.62036621319563]
We introduce CLLM, which leverages the prior knowledge of Large Language Models (LLMs) for data augmentation in the low-data regime.
We demonstrate the superior performance of CLLM in the low-data regime compared to conventional generators.
arXiv Detail & Related papers (2023-12-19T12:34:46Z) - Large Language Models as Data Preprocessors [9.99065004972981]
Large Language Models (LLMs) have marked a significant advancement in artificial intelligence.
This study explores their potential in data preprocessing, a critical stage in data mining and analytics applications.
We propose an LLM-based framework for data preprocessing, which integrates cutting-edge prompt engineering techniques.
arXiv Detail & Related papers (2023-08-30T23:28:43Z) - Instruction Tuning for Large Language Models: A Survey [52.86322823501338]
We make a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications.
We also review the potential pitfalls of IT and criticism against it, point out current deficiencies of existing strategies, and suggest some avenues for fruitful research.
arXiv Detail & Related papers (2023-08-21T15:35:16Z)