Harnessing the Power of David against Goliath: Exploring Instruction
Data Generation without Using Closed-Source Models
- URL: http://arxiv.org/abs/2308.12711v1
- Date: Thu, 24 Aug 2023 11:07:47 GMT
- Title: Harnessing the Power of David against Goliath: Exploring Instruction
Data Generation without Using Closed-Source Models
- Authors: Yue Wang, Xinrui Wang, Juntao Li, Jinxiong Chang, Qishen Zhang,
Zhongyi Liu, Guannan Zhang, Min Zhang
- Abstract summary: We explore alternative approaches to generate high-quality instruction data that do not rely on closed-source models.
Evaluation results from two benchmarks and the GPT-4 model demonstrate the effectiveness of our generated instruction data.
- Score: 32.41573520305861
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instruction tuning is instrumental in enabling Large Language Models~(LLMs)
to follow user instructions to complete various open-domain tasks. The success
of instruction tuning depends on the availability of high-quality instruction
data. Owing to the high cost and inconsistent quality of human annotation,
recent works have turned to powerful closed-source models to generate
instruction data automatically. However, these methods carry potential risks:
the usage terms of such closed-source models strictly forbid using their
outputs to develop machine learning models. To address this problem, we
explore alternative approaches to generate
high-quality instruction data that do not rely on closed-source models. Our
exploration includes an investigation of various existing instruction
generation methods, culminating in the integration of the most efficient
variant with two novel strategies to enhance the quality further. Evaluation
results from two benchmarks and the GPT-4 model demonstrate the effectiveness
of our generated instruction data, which can outperform Alpaca, a method
reliant on closed-source models. We hope that more progress can be achieved in
generating high-quality instruction data without using closed-source models.
Related papers
- Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data [21.905041803331113]
Vision-Language Models (VLMs) have recently made significant progress, but the limited scale and quality of open-source instruction data hinder their performance.
We introduce Infinity-MM, a large-scale multimodal instruction dataset with 40 million samples, enhanced through rigorous quality filtering and deduplication.
We also propose a synthetic instruction generation method based on open-source VLMs, using detailed image annotations and diverse question generation.
arXiv Detail & Related papers (2024-10-24T09:03:48Z)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources [117.6496550359768]
This work explores recent advances in instruction-tuning language models on a range of open instruction-following datasets.
We provide a large set of instruction-tuned models from 6.7B to 65B parameters in size, trained on 12 instruction datasets.
We evaluate them on their factual knowledge, reasoning, multilinguality, coding, and open-ended instruction following abilities.
arXiv Detail & Related papers (2023-06-07T19:59:23Z)
- RLBoost: Boosting Supervised Models using Deep Reinforcement Learning [0.0]
We present RLBoost, an algorithm that uses deep reinforcement learning strategies to evaluate a particular dataset and obtain a model capable of estimating the quality of any new data.
The article shows that this model obtains better and more stable results than other state-of-the-art algorithms such as LOO, DataShapley, or DVRL.
arXiv Detail & Related papers (2023-05-23T14:38:33Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z) - MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down
Distillation [153.56211546576978]
In this work, we propose that better soft targets with higher compatibility can be generated by using a label generator.
We can employ the meta-learning technique to optimize this label generator.
The experiments are conducted on two standard classification benchmarks, namely CIFAR-100 and ILSVRC2012.
arXiv Detail & Related papers (2020-08-27T13:04:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences arising from its use.