LeXFiles and LegalLAMA: Facilitating English Multinational Legal
Language Model Development
- URL: http://arxiv.org/abs/2305.07507v2
- Date: Mon, 22 May 2023 18:20:54 GMT
- Title: LeXFiles and LegalLAMA: Facilitating English Multinational Legal
Language Model Development
- Authors: Ilias Chalkidis, Nicolas Garneau, Catalina Goanta, Daniel Martin Katz,
Anders Søgaard
- Abstract summary: We conduct a detailed analysis of the performance of legal-oriented pre-trained language models (PLMs).
We examine the interplay between their original objective, acquired knowledge, and legal language understanding capacities.
We find that probing performance strongly correlates with upstream performance in related legal topics.
- Score: 8.931169262582442
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this work, we conduct a detailed analysis on the performance of
legal-oriented pre-trained language models (PLMs). We examine the interplay
between their original objective, acquired knowledge, and legal language
understanding capacities which we define as the upstream, probing, and
downstream performance, respectively. We consider not only the models' size but
also the pre-training corpora used as important dimensions in our study. To
this end, we release a multinational English legal corpus (LeXFiles) and a
legal knowledge probing benchmark (LegalLAMA) to facilitate training and
detailed analysis of legal-oriented PLMs. We release two new legal PLMs trained
on LeXFiles and evaluate them alongside others on LegalLAMA and LexGLUE. We
find that probing performance strongly correlates with upstream performance in
related legal topics. On the other hand, downstream performance is mainly
driven by the model's size and prior legal knowledge which can be estimated by
upstream and probing performance. Based on these findings, we can conclude that
both dimensions are important for those seeking the development of
domain-specific PLMs.
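The abstract describes two quantities worth making concrete: a probing score over masked legal terms (LegalLAMA is a cloze-style probe, commonly scored with mean reciprocal rank) and the correlation between probing and upstream performance. A minimal sketch of that scoring, with all numbers purely illustrative:

```python
# Illustrative scoring for a LegalLAMA-style cloze probe: the model ranks
# candidate fillers for a masked legal term, we report mean reciprocal rank
# (MRR) over examples, then correlate probing scores with an upstream
# measure. All numbers below are made up for illustration.

def mean_reciprocal_rank(ranks):
    """Average of 1/rank of the gold term across probe examples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical ranks of the gold legal term for four probe examples:
print(mean_reciprocal_rank([1, 2, 5, 1]))  # (1 + 0.5 + 0.2 + 1) / 4 = 0.675

# Hypothetical per-topic upstream vs. probing scores for one model;
# a value near 1.0 reflects the strong correlation the paper reports.
upstream = [0.62, 0.71, 0.55, 0.80]
probing = [0.48, 0.60, 0.41, 0.69]
print(round(pearson_r(upstream, probing), 3))
```

The metric implementations are standard; only the example ranks and scores are invented, not taken from the paper.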
Related papers
- Legal Evaluations and Challenges of Large Language Models [42.51294752406578]
We use the OPENAI o1 model as a case study to evaluate the performance of large models in applying legal provisions.
We compare current state-of-the-art LLMs, including open-source, closed-source, and legal-specific models trained specifically for the legal domain.
arXiv Detail & Related papers (2024-11-15T12:23:12Z)
- Developing a Pragmatic Benchmark for Assessing Korean Legal Language Understanding in Large Language Models [7.797885529152412]
Large language models (LLMs) have demonstrated remarkable performance in the legal domain.
However, their efficacy remains limited for non-standardized tasks and for tasks in languages other than English.
This underscores the need for careful evaluation of LLMs within each legal system before application.
arXiv Detail & Related papers (2024-10-11T11:41:02Z)
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
- The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights [108.40766216456413]
We propose a question alignment framework to bridge the gap between large language models' English and non-English performance.
Experiment results show it can boost multilingual performance across diverse reasoning scenarios, model families, and sizes.
We analyze representation space, generated response and data scales, and reveal how question translation training strengthens language alignment within LLMs.
arXiv Detail & Related papers (2024-05-02T14:49:50Z)
- Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet? [82.02076369811402]
Supervised fine-tuning (SFT), supervised instruction tuning (SIT) and in-context learning (ICL) are three alternative, de facto standard approaches to few-shot learning.
We present an extensive and systematic comparison of the three approaches, testing them on six high- and low-resource languages, three different NLU tasks, and a myriad of language and domain setups.
Our observations show that supervised instruction tuning has the best trade-off between performance and resource requirements.
arXiv Detail & Related papers (2024-03-04T10:48:13Z)
- Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are previous legal cases with similar facts, which serve as the basis for judgments in subsequent cases in many national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z)
- CMMLU: Measuring massive multitask language understanding in Chinese [133.70911295934746]
This paper introduces a comprehensive Chinese benchmark that covers various subjects, including the natural sciences, social sciences, engineering, and humanities.
CMMLU fills the gap in evaluating the knowledge and reasoning capabilities of large language models within the Chinese context.
arXiv Detail & Related papers (2023-06-15T15:49:51Z)
- Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence [5.07013500385659]
This paper explores Large Language Models' (LLMs) capabilities in applying tax law.
Our experiments demonstrate emerging legal understanding capabilities, with improved performance in each subsequent OpenAI model release.
Findings indicate that LLMs, particularly when combined with prompting enhancements and the correct legal texts, can perform at high levels of accuracy but not yet at expert tax lawyer levels.
arXiv Detail & Related papers (2023-06-12T12:40:48Z)
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English [15.026117429782996]
We introduce the Legal General Language Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks.
We also provide an evaluation and analysis of several generic and legal-oriented models demonstrating that the latter consistently offer performance improvements across multiple tasks.
arXiv Detail & Related papers (2021-10-03T10:50:51Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release Lawformer, a Longformer-based pre-trained language model for Chinese legal long-document understanding.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.