Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis
- URL: http://arxiv.org/abs/2502.03843v1
- Date: Thu, 06 Feb 2025 07:53:40 GMT
- Title: Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis
- Authors: Lin Yuan, Jun Xu, Honghao Gui, Mengshu Sun, Zhiqiang Zhang, Lei Liang, Jun Zhou
- Abstract summary: There is a severe shortage of instruction data in the field of natural language understanding.
Previous works on constructing NLU instructions mainly focus on information extraction.
We propose Hum, a large-scale, high-quality synthetic instruction corpus for NLU tasks.
- Score: 13.70936635839382
- License:
- Abstract: High-quality, large-scale instructions are crucial for aligning large language models (LLMs); however, there is a severe shortage of instruction data in the field of natural language understanding (NLU). Previous works on constructing NLU instructions mainly focus on information extraction (IE), neglecting tasks such as machine reading comprehension, question answering, and text classification. Furthermore, the lack of diversity in the data has led to decreased generalization of trained LLMs on other NLU tasks and a noticeable decline in the base model's general capabilities. To address this issue, we propose Hum, a large-scale, high-quality synthetic instruction corpus for NLU tasks, designed to enhance the NLU capabilities of LLMs. Specifically, Hum includes IE (either closed IE or open IE), machine reading comprehension, text classification, and instruction generalist tasks, thereby enriching task diversity. Additionally, we introduce a human-LLM collaborative mechanism to synthesize instructions, which enriches instruction diversity by incorporating guidelines, preference rules, and format variants. We conduct extensive experiments on 5 NLU tasks and 28 general capability evaluation datasets for LLMs. Experimental results show that Hum enhances the NLU capabilities of six LLMs by an average of 3.1%, with no significant decline observed in other general capabilities.
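The abstract describes how Hum's instructions are assembled: a human-LLM collaborative step combines a task definition with human-written guidelines, preference rules, and format variants to produce instruction instances. The following is only a minimal sketch of that general idea, not the authors' actual pipeline; the template, the example task, and the `generate` hook are hypothetical.

```python
# Illustrative sketch of human-LLM collaborative instruction synthesis.
# Field names (guideline, preference rule, format variant) follow the abstract;
# everything else here is an assumption for demonstration purposes.
from dataclasses import dataclass
from typing import Callable

@dataclass
class InstructionSpec:
    task: str              # e.g. "closed IE", "machine reading comprehension"
    guideline: str         # human-written task guideline
    preference_rule: str   # constraint on the desired answer style
    format_variant: str    # one of several output-format variants

def synthesize_instruction(spec: InstructionSpec,
                           source_text: str,
                           generate: Callable[[str], str]) -> dict:
    """Compose a prompt from the spec and let an LLM draft the instance;
    a human reviewer would then accept or revise the result."""
    prompt = (
        f"Task: {spec.task}\n"
        f"Guideline: {spec.guideline}\n"
        f"Preference: {spec.preference_rule}\n"
        f"Output format: {spec.format_variant}\n"
        f"Text: {source_text}\n"
        "Produce the instruction-following example."
    )
    return {"instruction": prompt, "response": generate(prompt)}

# Usage with a stand-in for a real LLM call (any text-in/text-out model would do).
example = synthesize_instruction(
    InstructionSpec(
        task="closed IE",
        guideline="Extract (subject, relation, object) triples only for listed relations.",
        preference_rule="Return an empty list when no listed relation is present.",
        format_variant="JSON list of triples",
    ),
    source_text="Marie Curie was born in Warsaw.",
    generate=lambda p: '[["Marie Curie", "born_in", "Warsaw"]]',
)
print(example["response"])
```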
Related papers
- BioMistral-NLU: Towards More Generalizable Medical Language Understanding through Instruction Tuning [19.027921909970832]
We develop BioMistral-NLU, a generalizable medical NLU model, through fine-tuning BioMistral on MNLU-Instruct.
Our experiments show that our BioMistral-NLU outperforms the original BioMistral.
Our dataset-agnostic prompting strategy and instruction tuning step over diverse NLU tasks enhance LLMs' generalizability across diverse medical NLU tasks.
arXiv Detail & Related papers (2024-10-24T17:53:53Z)
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvement of Large Language Models.
It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop.
Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
- INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning [59.07490387145391]
Large language models (LLMs) have demonstrated impressive capabilities in various natural language processing tasks.
Their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language.
We introduce a novel instruction tuning dataset, INTERS, encompassing 20 tasks across three fundamental IR categories.
arXiv Detail & Related papers (2024-01-12T12:10:28Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models [91.02730155418699]
Large language models (LLMs) can perform a wide range of tasks by following natural language instructions.
We introduce Auto-Instruct, a novel method to automatically improve the quality of instructions provided to LLMs.
In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions.
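As a rough sketch of the generate-then-rank idea summarized above: sample several candidate instructions for a task, score each candidate, and keep the highest-scoring one. The `generate_candidates` and `score` hooks below are hypothetical stand-ins, not the Auto-Instruct implementation.

```python
# Generate candidate instructions, rank them, and return the best one.
from typing import Callable, List

def select_instruction(task_description: str,
                       generate_candidates: Callable[[str, int], List[str]],
                       score: Callable[[str, str], float],
                       n_candidates: int = 8) -> str:
    """Return the candidate instruction that the ranking model scores highest."""
    candidates = generate_candidates(task_description, n_candidates)
    return max(candidates, key=lambda c: score(task_description, c))

# Usage with toy stand-ins for the LLM generator and the ranking model.
best = select_instruction(
    "Classify a movie review as positive or negative.",
    generate_candidates=lambda task, n: [
        f"Instruction v{i}: {task} Answer with 'positive' or 'negative'." for i in range(n)
    ],
    score=lambda task, cand: float(len(cand)),  # placeholder scorer
)
print(best)
```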
arXiv Detail & Related papers (2023-10-19T19:52:55Z)
- Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations better reveal a language model's comprehensive grasp of language, as reflected in its proficiency in understanding questions.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
- Improving Translation Faithfulness of Large Language Models via Augmenting Instructions [89.76691340615848]
We propose SWIE (Segment-Weighted Instruction Embedding) and an instruction-following dataset OVERMISS.
SWIE improves the model's instruction understanding by adding a global instruction representation to the subsequent input and response representations.
OVERMISS improves model faithfulness by comparing over-translation and miss-translation results with the correct translation.
arXiv Detail & Related papers (2023-08-24T09:32:29Z)
- Metacognitive Prompting Improves Understanding in Large Language Models [12.112914393948415]
We introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes.
We conduct experiments on four prevalent Large Language Models (LLMs) across ten natural language understanding (NLU) datasets.
MP consistently outperforms existing prompting methods in both general and domain-specific NLU tasks.
arXiv Detail & Related papers (2023-08-10T05:10:17Z)
- Prompting Large Language Models for Counterfactual Generation: An Empirical Study [13.506528217009507]
Large language models (LLMs) have made remarkable progress in a wide range of natural language understanding and generation tasks.
We present a comprehensive evaluation framework on various types of NLU tasks, which covers all key factors in determining LLMs' capability of generating counterfactuals.
arXiv Detail & Related papers (2023-05-24T06:44:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.