FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning
- URL: http://arxiv.org/abs/2404.02127v1
- Date: Tue, 2 Apr 2024 17:33:34 GMT
- Title: FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning
- Authors: Joel Niklaus, Lucia Zheng, Arya D. McCarthy, Christopher Hahn, Brian M. Rosen, Peter Henderson, Daniel E. Ho, Garrett Honke, Percy Liang, Christopher Manning
- Abstract summary: LawInstruct is a large legal instruction dataset covering 17 jurisdictions, 24 languages and a total of 12M examples.
We present evidence that domain-specific pretraining and instruction tuning improve performance on LegalBench.
LawInstruct is a resource for accelerating the development of models with stronger information processing and decision making capabilities in the legal domain.
- Score: 47.001169623840354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instruction tuning is an important step in making language models useful for direct user interaction. However, many legal tasks remain out of reach for most open LLMs and there do not yet exist any large-scale instruction datasets for the domain. This critically limits research in this application area. In this work, we curate LawInstruct, a large legal instruction dataset, covering 17 jurisdictions, 24 languages and a total of 12M examples. We present evidence that domain-specific pretraining and instruction tuning improve performance on LegalBench, including improving Flan-T5 XL by 8 points or 16% over the baseline. However, the effect does not generalize across all tasks, training regimes, model sizes, and other factors. LawInstruct is a resource for accelerating the development of models with stronger information processing and decision making capabilities in the legal domain.
Related papers
- TransformLLM: Adapting Large Language Models via LLM-Transformed Reading Comprehension Text [5.523385345486362]
We have developed language models specifically designed for legal applications.
Our innovative approach significantly improves capabilities in legal tasks by using Large Language Models (LLMs) to convert raw training data into reading comprehension text.
Date: 2024-10-28T19:32:18Z
- The Factuality of Large Language Models in the Legal Domain [8.111302195052641]
This paper investigates the factuality of large language models (LLMs) as knowledge bases in the legal domain.
We design a dataset of diverse factual questions about case law and legislation.
We then use the dataset to evaluate several LLMs under different evaluation methods, including exact, alias, and fuzzy matching.
Date: 2024-09-18T08:30:20Z
- Performance Law of Large Language Models [58.32539851241063]
Performance law can be used to guide the choice of LLM architecture and the effective allocation of computational resources without extensive experiments.
Date: 2024-08-19T11:09:12Z
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
Date: 2024-06-21T06:19:03Z
- Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models [75.75038268227554]
Self-Checker is a framework comprising a set of plug-and-play modules that facilitate fact-checking.
This framework provides a fast and efficient way to construct fact-checking systems in low-resource environments.
Date: 2023-05-24T01:46:07Z
- Towards Building the Federated GPT: Federated Instruction Tuning [66.7900343035733]
This paper introduces Federated Instruction Tuning (FedIT) as the learning framework for the instruction tuning of large language models (LLMs).
We demonstrate that by exploiting the heterogeneous and diverse sets of instructions on the client's end with FedIT, we improved the performance of LLMs compared to centralized training with only limited local instructions.
Date: 2023-05-09T17:42:34Z
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
Date: 2023-04-22T10:47:01Z
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English [15.026117429782996]
We introduce the Legal General Language Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks.
We also provide an evaluation and analysis of several generic and legal-oriented models demonstrating that the latter consistently offer performance improvements across multiple tasks.
Date: 2021-10-03T10:50:51Z
- When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset [2.0924876102146714]
We present a new dataset of over 53,000 multiple-choice questions to identify the relevant holding of a cited case.
We show that domain pretraining may be warranted when the task exhibits sufficient similarity to the pretraining corpus.
Our findings inform when researchers should engage resource-intensive pretraining and show that Transformer-based architectures, too, learn embeddings suggestive of distinct legal language.
Date: 2021-04-18T00:57:16Z