AIR: Complex Instruction Generation via Automatic Iterative Refinement
- URL: http://arxiv.org/abs/2502.17787v2
- Date: Thu, 27 Feb 2025 16:42:10 GMT
- Title: AIR: Complex Instruction Generation via Automatic Iterative Refinement
- Authors: Wei Liu, Yancheng He, Hui Huang, Chengwei Hu, Jiaheng Liu, Shilong Li, Wenbo Su, Bo Zheng
- Abstract summary: Current approaches to generating complex instructions often produce instructions that are irrelevant to the actual instruction requirements. We propose a novel automatic iterative refinement framework to generate complex instructions with constraints. We construct the AIR-10K dataset of 10K complex instructions and demonstrate that instructions generated with our approach significantly improve the model's ability to follow complex instructions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the development of large language models, their ability to follow simple instructions has significantly improved. However, adhering to complex instructions remains a major challenge. Current approaches to generating complex instructions often produce instructions irrelevant to the actual instruction requirements, or suffer from limited scalability and diversity. Moreover, methods such as back-translation, while effective for simple instruction generation, fail to leverage the rich content and structure of large web corpora. In this paper, we propose a novel automatic iterative refinement (AIR) framework to generate complex instructions with constraints, which not only better reflects the requirements of real scenarios but also significantly enhances LLMs' ability to follow complex instructions. The AIR framework consists of two stages: (1) generate an initial instruction from a document; (2) iteratively refine the instruction with LLM-as-judge guidance, comparing the model's output with the document to incorporate valuable constraints. Finally, we construct the AIR-10K dataset of 10K complex instructions and demonstrate that instructions generated with our approach significantly improve the model's ability to follow complex instructions, outperforming existing methods for instruction generation.
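To make the two-stage loop concrete, here is a minimal Python sketch of the refinement procedure as described in the abstract. The `generate` helper, the prompt wording, and the iteration count are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of the two-stage AIR loop, assuming a generic `generate`
# helper; the prompts and iteration count are illustrative, not the paper's.

def generate(prompt: str) -> str:
    """Stand-in for an LLM call; replace with a real model or API."""
    return f"[model output for: {prompt[:40]}...]"

def air_refine(document: str, num_iterations: int = 3) -> str:
    # Stage 1: draft an initial instruction grounded in the document.
    instruction = generate(
        f"Write an instruction this document could answer:\n{document}"
    )
    for _ in range(num_iterations):
        # The model answers the current instruction.
        response = generate(instruction)
        # Stage 2: an LLM-as-judge compares the response with the document
        # and surfaces a constraint the response does not yet satisfy.
        constraint = generate(
            "Compare the response with the document and state one constraint "
            f"the response misses.\nDocument:\n{document}\nResponse:\n{response}"
        )
        # Fold the judged constraint back into the instruction.
        instruction = f"{instruction}\nConstraint: {constraint}"
    return instruction
```

Each pass appends one judged constraint, so the instruction's complexity grows with the iteration budget.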
Related papers
- MuSC: Improving Complex Instruction Following with Multi-granularity Self-Contrastive Training
We propose a Multi-granularity Self-Contrastive Training (MuSC) framework to improve complex instruction alignment without relying on a stronger model. Our method is evaluated on open-source models, and experimental results show it achieves significant improvement on both complex and general instruction-following benchmarks.
arXiv Detail & Related papers (2025-02-17T08:12:49Z)
- Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Large language models (LLMs) struggle to follow instructions with complex constraints on format, length, etc.
Previous works conduct post-training on complex instruction-response pairs generated by feeding complex instructions to advanced LLMs.
We propose a novel data generation technique, constraint back-translation.
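As a rough illustration of the back-translation idea summarized above, the sketch below derives a constraint from an existing response and folds it into the instruction; the `llm` stub and prompt text are assumptions, not the paper's code.

```python
# A rough sketch of constraint back-translation: extract a constraint an
# existing response already satisfies and attach it to the instruction.

def llm(prompt: str) -> str:
    """Stand-in for an LLM call; replace with a real model or API."""
    return "use exactly three bullet points"

def back_translate(instruction: str, response: str) -> tuple[str, str]:
    constraint = llm(
        "State one format or length constraint that the following response "
        f"already satisfies:\n{response}"
    )
    # The original response is reused unchanged, so no strong model is
    # needed to answer the now-more-complex instruction.
    return f"{instruction}\nAdditional requirement: {constraint}", response
```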
arXiv Detail & Related papers (2024-10-31T17:42:26Z)
- Evolutionary Contrastive Distillation for Language Model Alignment
Evolutionary Contrastive Distillation (ECD) is a novel method for generating high-quality synthetic preference data.
Our method yields a 7B model that exceeds the complex instruction-following performance of current SOTA 7B models.
arXiv Detail & Related papers (2024-10-10T01:04:03Z)
- TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution
TaCIE redefines instruction evolution from merely evolving seed instructions to a more dynamic and comprehensive combination of elements.
When TaCIE is applied across multiple domains, LLMs fine-tuned with the evolved instructions substantially outperform those tuned with conventional methods.
arXiv Detail & Related papers (2024-09-18T10:06:28Z)
- Controllable Navigation Instruction Generation with Chain of Thought Prompting
We propose C-Instructor, which uses chain-of-thought-style prompts for style- and content-controllable instruction generation.
C-Instructor makes generated instructions easier to follow and offers greater controllability over the manipulation of landmark objects.
arXiv Detail & Related papers (2024-07-10T07:37:20Z)
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition
How to evaluate the complex-instruction-following ability of large language models (LLMs) has become a critical research problem.
Existing benchmarks mainly focus on modeling different types of constraints in human instructions while neglecting the composition of different constraints.
We propose ComplexBench, a benchmark for comprehensively evaluating the ability of LLMs to follow complex instructions composed of multiple constraints.
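The emphasis on composition rather than isolated constraints can be shown with a small sketch; the checker functions below are hypothetical examples, not part of the benchmark itself.

```python
# An illustrative composed constraint check; the checkers are hypothetical
# examples, not ComplexBench's actual evaluation code.

def max_words(n: int):
    return lambda text: len(text.split()) <= n

def must_contain(keyword: str):
    return lambda text: keyword.lower() in text.lower()

def compose_and(*checks):
    # Composition: a response passes only if every constraint holds at once,
    # which is stricter than evaluating each constraint in isolation.
    return lambda text: all(check(text) for check in checks)

check = compose_and(max_words(50), must_contain("summary"))
print(check("A short summary of the findings."))  # True
```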
arXiv Detail & Related papers (2024-07-04T14:50:45Z)
- Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
We introduce Ada-Instruct, an adaptive instruction generator developed through fine-tuning.
We empirically validate Ada-Instruct's efficacy across different applications.
arXiv Detail & Related papers (2023-10-06T13:28:04Z)
- Can Large Language Models Understand Real-World Complex Instructions?
Large language models (LLMs) can understand human instructions, but struggle with complex instructions.
Existing benchmarks are insufficient to assess LLMs' ability to understand complex instructions.
We propose CELLO, a benchmark for systematically evaluating LLMs' ability to follow complex instructions.
arXiv Detail & Related papers (2023-09-17T04:18:39Z)
- A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment
We propose Tree-Instruct to systematically enhance the instruction complexity in a controllable manner.
By adding a specified number of nodes to instructions' semantic trees, this approach not only yields new instruction data but also allows us to control the difficulty level of modified instructions.
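A toy sketch of the node-adding idea follows; the tree encoding and rendering are illustrative assumptions, not the authors' representation, but they show how the number of added nodes acts as a complexity dial.

```python
# A toy rendering of the Tree-Instruct idea: difficulty is controlled by how
# many nodes are added to an instruction's semantic tree.

from dataclasses import dataclass, field

@dataclass
class Node:
    text: str
    children: list["Node"] = field(default_factory=list)

def add_nodes(root: Node, subtasks: list[str]) -> Node:
    # Each appended node contributes one more requirement, so len(subtasks)
    # acts as a dial for the complexity of the rewritten instruction.
    for subtask in subtasks:
        root.children.append(Node(subtask))
    return root

def render(root: Node) -> str:
    return " ".join([root.text] + [render(c) for c in root.children])

tree = add_nodes(Node("Summarize the article."),
                 ["Keep it under 100 words.", "Use a formal tone."])
print(render(tree))  # one instruction with two added constraint nodes
```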
arXiv Detail & Related papers (2023-08-10T16:58:51Z)
- Improving Long-Horizon Imitation Through Instruction Prediction
In this work, we explore the use of an often unused source of auxiliary supervision: language.
Inspired by recent advances in transformer-based models, we train agents with an instruction prediction loss that encourages learning temporally extended representations that operate at a high level of abstraction.
In further analysis, we find that instruction modeling is most important for tasks that require complex reasoning, while understandably offering smaller gains in environments that require simple plans.
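A minimal PyTorch sketch of such an auxiliary objective, assuming a behavioral-cloning base loss and a token-level instruction decoder; the tensor shapes and weighting coefficient are illustrative, not the paper's exact setup.

```python
# A minimal sketch of imitation learning with an auxiliary instruction
# prediction loss; shapes and the weighting term are assumptions.

import torch
import torch.nn.functional as F

def training_loss(policy_logits: torch.Tensor,   # (batch, num_actions)
                  expert_actions: torch.Tensor,  # (batch,)
                  instr_logits: torch.Tensor,    # (batch, seq_len, vocab)
                  instr_tokens: torch.Tensor,    # (batch, seq_len)
                  aux_weight: float = 0.1) -> torch.Tensor:
    # Behavioral cloning: imitate the expert's actions.
    bc_loss = F.cross_entropy(policy_logits, expert_actions)
    # Auxiliary supervision: predict the instruction tokens from the agent's
    # representation, encouraging temporally extended, abstract features.
    instr_loss = F.cross_entropy(instr_logits.flatten(0, 1),
                                 instr_tokens.flatten())
    return bc_loss + aux_weight * instr_loss
```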
arXiv Detail & Related papers (2023-06-21T20:47:23Z)