A Preliminary Study of the Intrinsic Relationship between Complexity and
Alignment
- URL: http://arxiv.org/abs/2308.05696v2
- Date: Thu, 29 Feb 2024 03:04:22 GMT
- Title: A Preliminary Study of the Intrinsic Relationship between Complexity and
Alignment
- Authors: Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin
Li, Nevin L. Zhang
- Abstract summary: We propose Tree-Instruct to systematically enhance the instruction complexity in a controllable manner.
By adding a specified number of nodes to instructions' semantic trees, this approach not only yields new instruction data but also allows us to control the difficulty level of modified instructions.
- Score: 90.7443414448245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training large language models (LLMs) with open-domain instruction data has
yielded remarkable success in aligning to end tasks and human preferences.
Extensive research has highlighted the importance of the quality and diversity
of instruction data. However, the impact of data complexity, as a crucial
metric, remains relatively unexplored in three respects: (1) whether the
performance improvements from increasing complexity are sustainable; (2)
whether those improvements merely come from introducing more training tokens;
and (3) what benefits, if any, come from presenting instructions in
easy-to-difficult order.
In this paper, we propose Tree-Instruct to systematically enhance the
instruction complexity in a controllable manner. By adding a specified number
of nodes to instructions' semantic trees, this approach not only yields new
instruction data from the modified tree but also allows us to control the
difficulty level of modified instructions. Our preliminary experiments reveal
the following insights: (1) Increasing complexity consistently leads to
sustained performance improvements of LLMs. (2) Under the same token budget, a
few complex instructions outperform diverse yet simple instructions.
(3) Curriculum instruction tuning might not yield the anticipated results;
focusing on increasing complexity appears to be the key.
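To make the mechanism concrete, the following is a minimal Python sketch of one Tree-Instruct-style complexity step. The `llm` callable, the prompt wording, and the `num_nodes` budget are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a Tree-Instruct-style complexity step (illustrative only).
# Assumes a generic `llm(prompt: str) -> str` callable; the real prompts and
# tree operations in the paper may differ.

def complicate(llm, instruction: str, num_nodes: int = 3) -> str:
    """Ask the model to enlarge the instruction's semantic tree by num_nodes."""
    prompt = (
        "Treat the following instruction as a semantic tree whose nodes are "
        "its entities, actions, and constraints.\n"
        f"Add exactly {num_nodes} new nodes (new constraints or sub-tasks) to "
        "the tree, then rewrite the instruction so it expresses the enlarged "
        "tree. Return only the rewritten instruction.\n\n"
        f"Instruction: {instruction}"
    )
    return llm(prompt)

# Usage: a larger num_nodes yields a harder instruction from the same seed.
# harder = complicate(llm, "Summarize this article.", num_nodes=5)
```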
Related papers
- Constraint Back-translation Improves Complex Instruction Following of Large Language Models [55.60192044049083]
Large language models (LLMs) struggle to follow instructions with complex constraints on format, length, etc.
Previous works conduct post-training on complex instruction-response pairs generated by feeding complex instructions to advanced LLMs.
We propose a novel data generation technique, constraint back-translation.
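A minimal sketch of the idea, assuming a generic `llm` callable; the prompt wording is an assumption, not the paper's actual pipeline. Instead of asking a model to satisfy new constraints, we ask it to name constraints that an existing high-quality response already satisfies, then append them to the instruction:

```python
# Sketch of constraint back-translation (illustrative; prompts are assumptions).
# This yields complex instruction-response pairs without generating any new
# responses, since the existing response already satisfies the constraints.

def back_translate(llm, instruction: str, response: str) -> str:
    prompt = (
        "List, as short imperative clauses, the format/length/style "
        "constraints that the response below already satisfies.\n"
        f"Instruction: {instruction}\nResponse: {response}"
    )
    constraints = llm(prompt)
    # Constrained instruction, paired with the unchanged original response.
    return f"{instruction}\n{constraints}"
```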
arXiv Detail & Related papers (2024-10-31T17:42:26Z)
- TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution [27.949846287419998]
TaCIE redefines instruction evolution from merely evolving seed instructions to a more dynamic and comprehensive combination of elements.
Applying TaCIE across multiple domains, LLMs fine-tuned with these evolved instructions have substantially outperformed those tuned with conventional methods.
arXiv Detail & Related papers (2024-09-18T10:06:28Z)
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition [72.82640456309821]
How to evaluate the complex-instruction-following ability of large language models (LLMs) has become a critical research problem.
Existing benchmarks mainly focus on modeling different types of constraints in human instructions while neglecting the composition of different constraints.
We propose ComplexBench, a benchmark for comprehensively evaluating the ability of LLMs to follow complex instructions composed of multiple constraints.
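To illustrate what constraint composition means in practice, here is a toy checker in which each constraint is a predicate over the model output and a composed instruction passes only if all predicates pass; the specific constraints and the all-must-pass rule are illustrative assumptions, not ComplexBench's actual scoring.

```python
# Toy composed-constraint evaluation (not ComplexBench itself): a composed
# instruction is satisfied only when every constituent constraint is satisfied.

from typing import Callable

Constraint = Callable[[str], bool]

def max_words(n: int) -> Constraint:
    return lambda text: len(text.split()) <= n

def must_contain(keyword: str) -> Constraint:
    return lambda text: keyword.lower() in text.lower()

def bullet_list() -> Constraint:
    return lambda text: all(line.lstrip().startswith("-")
                            for line in text.strip().splitlines())

def follows_all(text: str, constraints: list[Constraint]) -> bool:
    return all(check(text) for check in constraints)

composed = [max_words(50), must_contain("safety"), bullet_list()]
print(follows_all("- safety first\n- keep it short", composed))  # True
```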
arXiv Detail & Related papers (2024-07-04T14:50:45Z)
- Enhancing and Assessing Instruction-Following with Fine-Grained Instruction Variants [28.691691883519542]
We introduce DeMoRecon, a technique that decomposes complex instructions into simpler sub-components, modifies these, and reconstructs them into new variants.
Based on DeMoRecon, we developed the FGIV dataset which contains fine-grained instruction variants of 1,773 seed instructions.
Our findings show that LLMs fine-tuned with FGIV gain a significant performance boost on both our own and commonly used instruction-following benchmarks.
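A compact sketch of the decompose-modify-reconstruct loop, assuming a generic `llm` callable; the three prompts are illustrative assumptions rather than DeMoRecon's actual prompts:

```python
# Sketch of a decompose-modify-reconstruct variant generator (illustrative).
# Each step is a separate LLM call; real pipelines would add validation.

def make_variant(llm, instruction: str) -> str:
    # 1) Decompose into atomic requirements.
    parts = llm("Split this instruction into its atomic requirements, "
                f"one per line:\n{instruction}")
    # 2) Perturb exactly one requirement (e.g. a limit or keyword).
    perturbed = llm("Slightly alter one requirement below and keep the rest "
                    f"unchanged:\n{parts}")
    # 3) Reconstruct a fluent instruction from the edited requirements.
    return llm(f"Rewrite these requirements as one fluent instruction:\n{perturbed}")
```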
arXiv Detail & Related papers (2024-06-17T08:08:11Z)
- From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large Language Models [43.869374263102934]
We study what training data is effective in enhancing the ability to follow complex, multi-constraint instructions.
We find that training LLMs with instructions containing multiple constraints enhances their understanding of complex instructions.
Our methods improve models' ability to follow instructions generally and generalize effectively across out-of-domain, in-domain, and adversarial settings.
arXiv Detail & Related papers (2024-04-24T12:51:14Z)
- Genetic Programming for Explainable Manifold Learning [2.370068482059863]
We introduce Genetic Programming for Explainable Manifold Learning (GP-EMaL), a novel approach that directly penalises tree complexity.
Our new method is able to maintain high manifold quality while significantly enhancing explainability and also allows customisation of complexity measures.
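As a rough illustration of penalising tree complexity, the sketch below weights operators by an assumed readability cost and scales the cost with depth; the weights, the depth scaling, and the linear fitness penalty are assumptions, not GP-EMaL's actual measure.

```python
# Toy tree-complexity penalty in the spirit of GP-EMaL (weights are assumptions).
# Harder-to-read operators cost more, and cost grows with depth, so selection
# pressure favours shallow, simple embedding trees.

from dataclasses import dataclass, field

OP_COST = {"add": 1, "sub": 1, "mul": 2, "div": 3, "sin": 4, "exp": 5}

@dataclass
class Node:
    op: str                          # operator name, or "leaf" for a feature
    children: list["Node"] = field(default_factory=list)

def complexity(node: Node, depth: int = 1) -> float:
    base = OP_COST.get(node.op, 0) * depth      # deeper nodes cost more
    return base + sum(complexity(c, depth + 1) for c in node.children)

def penalised_fitness(manifold_quality: float, tree: Node, lam: float = 0.01) -> float:
    return manifold_quality - lam * complexity(tree)

tree = Node("mul", [Node("leaf"), Node("sin", [Node("leaf")])])
print(complexity(tree))  # mul at depth 1 + sin at depth 2: 2*1 + 4*2 = 10
```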
arXiv Detail & Related papers (2024-03-21T05:17:22Z)
- Data Diversity Matters for Robust Instruction Tuning [93.87078483250782]
Recent works have shown that curating high-quality and diverse instruction tuning datasets can significantly improve instruction-following capabilities.
We propose a new algorithm, Quality-Diversity Instruction Tuning (QDIT) to control dataset diversity and quality.
We validate the performance of QDIT on several large scale instruction tuning datasets, where we find it can substantially improve worst and average case performance.
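A sketch of one plausible quality-diversity selection rule: greedy selection that trades a per-example quality score against a facility-location diversity gain over instruction embeddings. The objective and the `alpha` weighting are assumptions, not necessarily QDIT's exact algorithm.

```python
# Greedy quality-diversity data selection (illustrative objective).
# Diversity is measured as facility-location coverage: how well the chosen
# set's embeddings cover all embeddings under cosine similarity.

import numpy as np

def select(embeddings: np.ndarray, quality: np.ndarray, k: int,
           alpha: float = 0.5) -> list[int]:
    sims = embeddings @ embeddings.T        # cosine sims if rows are unit-norm
    n = len(quality)
    chosen: list[int] = []
    coverage = np.zeros(n)                  # best similarity to the chosen set
    for _ in range(k):
        gains = []
        for i in range(n):
            if i in chosen:
                gains.append(-np.inf)
                continue
            div_gain = np.maximum(coverage, sims[i]).mean() - coverage.mean()
            gains.append(alpha * quality[i] + (1 - alpha) * div_gain)
        best = int(np.argmax(gains))
        chosen.append(best)
        coverage = np.maximum(coverage, sims[best])
    return chosen
```

A larger `alpha` biases selection toward individually high-quality examples; a smaller one toward covering the embedding space.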
arXiv Detail & Related papers (2023-11-21T19:12:18Z)
- Instruction Tuning with Human Curriculum [15.025867460765559]
We (1) introduce Curriculum Instruction Tuning, (2) explore the potential advantages of employing diverse curriculum strategies, and (3) delineate a synthetic instruction-response generation framework.
Our generation pipeline is systematically structured to emulate the sequential and orderly characteristic of human learning.
We describe a methodology for generating instruction-response datasets that extensively span the various stages of human education.
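A toy version of stage-ordered training data, where the stage labels and the simple sort are illustrative assumptions rather than the paper's pipeline:

```python
# Toy curriculum ordering: present instruction-response pairs in the order of
# human educational stages before training (labels are assumptions).

STAGE_ORDER = {"elementary": 0, "secondary": 1, "undergraduate": 2, "graduate": 3}

def curriculum_sort(dataset: list[dict]) -> list[dict]:
    """Each example is expected to carry a 'stage' field."""
    return sorted(dataset, key=lambda ex: STAGE_ORDER[ex["stage"]])

ordered = curriculum_sort([
    {"stage": "graduate", "instruction": "Prove the spectral theorem."},
    {"stage": "elementary", "instruction": "Add 17 and 25."},
])  # elementary example now comes first
```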
arXiv Detail & Related papers (2023-10-14T07:16:08Z)
- When Do Program-of-Thoughts Work for Reasoning? [51.2699797837818]
We propose the complexity-impacted reasoning score (CIRS) to measure the correlation between code data and reasoning abilities.
Specifically, we use the abstract syntax tree to encode structural information and calculate logical complexity.
Code will be integrated into the EasyInstruct framework at https://github.com/zjunlp/EasyInstruct.
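The flavour of an AST-based score can be reproduced with Python's standard `ast` module; the node weights below are illustrative assumptions, not the paper's CIRS formula.

```python
# Sketch of an AST-based complexity score: parse code, then weight control-flow
# and logic nodes more heavily than plain statements (weights are assumptions).

import ast

LOGIC_WEIGHT = {ast.If: 2, ast.For: 3, ast.While: 3, ast.BoolOp: 2,
                ast.Compare: 1, ast.FunctionDef: 1}

def logical_complexity(source: str) -> int:
    tree = ast.parse(source)
    return sum(w for node in ast.walk(tree)
               for typ, w in LOGIC_WEIGHT.items() if isinstance(node, typ))

code = (
    "def f(xs):\n"
    "    total = 0\n"
    "    for x in xs:\n"
    "        if x > 0:\n"
    "            total += x\n"
    "    return total\n"
)
print(logical_complexity(code))  # FunctionDef + For + If + Compare = 1+3+2+1 = 7
```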
arXiv Detail & Related papers (2023-08-29T17:22:39Z)
- Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information [77.19830787312743]
In real-world reinforcement learning applications, the learner's observation space is ubiquitously high-dimensional, with both relevant and irrelevant information about the task at hand.
We introduce a new problem setting for reinforcement learning, the Exogenous Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable component and a large irrelevant component.
We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity polynomial in the size of the endogenous component.
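A toy illustration of the ExoMDP setting (not the ExoRL algorithm): observations concatenate a small endogenous state with high-dimensional exogenous noise. Here the split is assumed known, whereas ExoRL must discover it.

```python
# Toy ExoMDP-style observation (illustrative; the factorization is assumed
# known here, while ExoRL learns it from data).

import numpy as np

rng = np.random.default_rng(0)
ENDO_DIM, EXO_DIM = 4, 256

def observe(endo_state: np.ndarray) -> np.ndarray:
    exo_noise = rng.normal(size=EXO_DIM)       # irrelevant to reward/dynamics
    return np.concatenate([endo_state, exo_noise])

def endogenous(obs: np.ndarray) -> np.ndarray:
    # Learning on this small slice restores sample complexity that scales
    # with ENDO_DIM rather than with the full observation dimension.
    return obs[:ENDO_DIM]
```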
arXiv Detail & Related papers (2022-06-09T05:19:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.