GPTs and Language Barrier: A Cross-Lingual Legal QA Examination
- URL: http://arxiv.org/abs/2403.18098v1
- Date: Tue, 26 Mar 2024 20:47:32 GMT
- Title: GPTs and Language Barrier: A Cross-Lingual Legal QA Examination
- Authors: Ha-Thanh Nguyen, Hiroaki Yamada, Ken Satoh
- Abstract summary: We explore the application of Generative Pre-trained Transformers (GPTs) in cross-lingual legal Question-Answering (QA) systems using the COLIEE Task 4 dataset.
In the COLIEE Task 4, given a statement and a set of related legal articles that serve as context, the objective is to determine whether the statement is legally valid.
By benchmarking four different combinations of English and Japanese prompts and data, we provide valuable insights into GPTs' performance in multilingual legal QA scenarios.
- Score: 5.253214457141011
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we explore the application of Generative Pre-trained Transformers (GPTs) in cross-lingual legal Question-Answering (QA) systems using the COLIEE Task 4 dataset. In the COLIEE Task 4, given a statement and a set of related legal articles that serve as context, the objective is to determine whether the statement is legally valid, i.e., if it can be inferred from the provided contextual articles or not, which is also known as an entailment task. By benchmarking four different combinations of English and Japanese prompts and data, we provide valuable insights into GPTs' performance in multilingual legal QA scenarios, contributing to the development of more efficient and accurate cross-lingual QA solutions in the legal domain.
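The four prompt/data language combinations mentioned in the abstract can be pictured as a small grid. The following is a minimal sketch under stated assumptions, not the authors' code: the prompt templates, the sample record, and the helper names (`settings`, `build_input`, `accuracy`) are hypothetical placeholders, and no model is actually called.

```python
from itertools import product

LANGS = ("en", "ja")

# Hypothetical placeholders: the paper's actual prompt wording is not given here.
PROMPTS = {
    "en": "<instruction written in English>",
    "ja": "<instruction written in Japanese>",
}

# Hypothetical placeholder record, not taken from the COLIEE Task 4 data.
SAMPLE = {
    "en": {"articles": "<English articles>", "statement": "<English statement>"},
    "ja": {"articles": "<Japanese articles>", "statement": "<Japanese statement>"},
}


def settings():
    """Return every (prompt language, data language) pair: 2 x 2 = 4 settings."""
    return list(product(LANGS, repeat=2))


def build_input(prompt_lang, data_lang):
    """Combine an instruction in one language with articles and a statement in another."""
    data = SAMPLE[data_lang]
    return (
        f"{PROMPTS[prompt_lang]}\n"
        f"Articles: {data['articles']}\n"
        f"Statement: {data['statement']}"
    )


def accuracy(predictions, gold):
    """Fraction of yes/no entailment predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


if __name__ == "__main__":
    for prompt_lang, data_lang in settings():
        print(f"prompt={prompt_lang} data={data_lang}")
        print(build_input(prompt_lang, data_lang))
        print()
```

Scoring each setting with the same `accuracy` helper keeps the four results directly comparable, since only the language of the prompt or of the data changes between runs.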
Related papers
- LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval [7.059964549363294]
We develop an efficient bilingual question-answering framework for regulatory documents, specifically the Bangladesh Police Gazettes.
Our approach employs modern Retrieval Augmented Generation (RAG) pipelines to enhance information retrieval and response generation.
This system enables efficient searching for specific government legal notices, making legal information more accessible.
arXiv Detail & Related papers (2025-04-19T06:09:54Z) - Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task [73.35882908048423]
Retrieval-augmented generation (RAG) has become a cornerstone of contemporary NLP.
This paper investigates the effectiveness of RAG across multiple languages by proposing novel approaches for multilingual open-domain question-answering.
arXiv Detail & Related papers (2025-04-04T17:35:43Z) - LegalBench.PT: A Benchmark for Portuguese Law [17.554201334646056]
We present LegalBench.PT, the first comprehensive legal benchmark covering key areas of Portuguese law.
We first collect long-form questions and answers from real law exams, and then use GPT-4o to convert them into multiple-choice, true/false, and matching formats.
arXiv Detail & Related papers (2025-02-22T21:07:12Z) - A Cross-Lingual Statutory Article Retrieval Dataset for Taiwan Legal Studies [4.511440076037968]
This paper introduces a cross-lingual statutory article retrieval (SAR) dataset designed to enhance legal information retrieval in multilingual settings.
Our dataset features spoken-language-style legal inquiries in English, paired with corresponding Chinese versions and relevant statutes, covering all Taiwanese civil, criminal, and administrative laws.
arXiv Detail & Related papers (2024-10-15T09:53:40Z) - Developing a Pragmatic Benchmark for Assessing Korean Legal Language Understanding in Large Language Models [7.797885529152412]
Large language models (LLMs) have demonstrated remarkable performance in the legal domain.
However, their efficacy remains limited for non-standardized tasks and tasks in languages other than English.
This underscores the need for careful evaluation of LLMs within each legal system before application.
arXiv Detail & Related papers (2024-10-11T11:41:02Z) - InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z) - Team UTSA-NLP at SemEval 2024 Task 5: Prompt Ensembling for Argument Reasoning in Civil Procedures with GPT4 [7.613758211231583]
We present our system for the SemEval Task 5, The Legal Argument Reasoning Task in Civil Procedure Challenge.
Our system explores a prompt-based solution using GPT4 to reason over legal arguments.
Overall, our system results in a Macro F1 of .8095 on the validation dataset and .7315 (5th out of 21 teams) on the final test set.
arXiv Detail & Related papers (2024-04-02T13:55:05Z) - NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource Languages through Data Enrichment [2.441072488254427]
This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023).
Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, we propose a range of adaptive techniques to handle different question types.
Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field.
arXiv Detail & Related papers (2023-09-11T14:43:45Z) - SemEval 2023 Task 6: LegalEval - Understanding Legal Texts [2.172613863157655]
There is a need for developing NLP-based techniques for processing and automatically understanding legal documents.
The LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about automatically structuring legal documents into semantically coherent units; Task-B (Legal Named Entity Recognition) deals with identifying relevant entities in a legal document; and Task-C (Court Judgement Prediction with Explanation) explores the possibility of automatically predicting the outcome of a legal case.
In each of the sub-tasks, the proposed systems outperformed the baselines; however, there is a lot of scope for
arXiv Detail & Related papers (2023-04-19T10:28:32Z) - Understanding Translationese in Cross-Lingual Summarization [106.69566000567598]
Cross-lingual summarization (CLS) aims at generating a concise summary in a different target language.
To collect large-scale CLS data, existing datasets typically involve translation in their creation.
In this paper, we first confirm that different approaches of constructing CLS datasets will lead to different degrees of translationese.
arXiv Detail & Related papers (2022-12-14T13:41:49Z) - Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog [67.20796950016735]
Multi2WOZ dataset spans four typologically diverse languages: Chinese, German, Arabic, and Russian.
We introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks.
Our experiments show that, in most setups, the best performance entails the combination of (i) conversational specialization in the target language and (ii) few-shot transfer for the concrete TOD task.
arXiv Detail & Related papers (2022-05-20T18:35:38Z) - EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification [49.77911492230467]
We formalise the three authentication tasks and their evaluation protocols.
We present EVI, a challenging spoken multilingual dataset with 5,506 dialogues in English, Polish, and French.
arXiv Detail & Related papers (2022-04-28T13:39:24Z) - Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval [51.60862829942932]
We present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
For sentence-level CLIR, we demonstrate that state-of-the-art performance can be achieved.
However, the peak performance is not met using the general-purpose multilingual text encoders off-the-shelf, but rather relying on their variants that have been further specialized for sentence understanding tasks.
arXiv Detail & Related papers (2021-01-21T00:15:38Z) - XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization [128.37244072182506]
The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark evaluates the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)