NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource
Languages through Data Enrichment
- URL: http://arxiv.org/abs/2309.05500v1
- Date: Mon, 11 Sep 2023 14:43:45 GMT
- Title: NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource
Languages through Data Enrichment
- Authors: Hai-Long Nguyen, Dieu-Quynh Nguyen, Hoang-Trung Nguyen, Thu-Trang
Pham, Huu-Dong Nguyen, Thach-Anh Nguyen, Thi-Hai-Yen Vuong, Ha-Thanh Nguyen
- Abstract summary: This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023).
Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, we propose a range of adaptive techniques to handle different question types.
Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field.
- Score: 2.441072488254427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, natural language processing has gained significant
popularity in various sectors, including the legal domain. This paper presents
NeCo Team's solutions to the Vietnamese text processing tasks provided in the
Automated Legal Question Answering Competition 2023 (ALQAC 2023), focusing on
legal domain knowledge acquisition for low-resource languages through data
enrichment. Our methods for the legal document retrieval task employ a
combination of similarity ranking and deep learning models, while for the
second task, which requires extracting an answer from a relevant legal article
in response to a question, we propose a range of adaptive techniques to handle
different question types. Our approaches achieve outstanding results on both
tasks of the competition, demonstrating the potential benefits and
effectiveness of question answering systems in the legal field, particularly
for low-resource languages.
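As an illustration of the retrieval setup described above (similarity ranking combined with a deep model), the sketch below shortlists articles with BM25 and re-ranks the shortlist with a cross-encoder. It is a minimal, hypothetical example: the libraries, model name, and toy corpus are assumptions, not the NeCo team's actual pipeline.

```python
# Hypothetical two-stage legal document retrieval: BM25 shortlist + cross-encoder re-ranking.
from rank_bm25 import BM25Okapi
from sentence_transformers import CrossEncoder

# Toy corpus of legal articles (in practice, Vietnamese statute articles).
articles = [
    "Article 1. The penalty for theft is determined by the value of the stolen property.",
    "Article 2. A contract is valid when both parties have full legal capacity.",
    "Article 3. The statute of limitations for civil claims is three years.",
]

# Stage 1: lexical similarity ranking with BM25 over whitespace tokens.
bm25 = BM25Okapi([a.lower().split() for a in articles])
query = "How long is the statute of limitations for a civil claim?"
scores = bm25.get_scores(query.lower().split())
shortlist = sorted(range(len(articles)), key=lambda i: scores[i], reverse=True)[:2]

# Stage 2: re-rank the shortlist with a multilingual cross-encoder (placeholder model name).
reranker = CrossEncoder("cross-encoder/mmarco-mMiniLMv2-L12-H384-v1")
pairs = [(query, articles[i]) for i in shortlist]
rerank_scores = reranker.predict(pairs)
best = shortlist[max(range(len(pairs)), key=lambda j: rerank_scores[j])]
print("Most relevant article:", articles[best])
```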
Related papers
- ALKAFI-LLAMA3: Fine-Tuning LLMs for Precise Legal Understanding in Palestine [0.0]
This study addresses the challenges of adapting Large Language Models to the Palestinian legal domain.
Political instability, fragmented legal frameworks, and limited AI resources hinder effective machine-learning applications.
We present a fine-tuned model based on a quantized version of Llama-3.2-1B-Instruct, trained on a synthetic data set derived from Palestinian legal texts.
arXiv Detail & Related papers (2024-12-19T11:55:51Z)
- PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from Related Example Banks [57.86928556668849]
Large Language Models (LLMs) have recently demonstrated impressive few-shot learning capabilities through in-context learning (ICL).
ICL performance is highly dependent on the choice of few-shot demonstrations, making the selection of optimal examples a persistent research challenge.
In this work, we propose PromptRefine, a novel Alternating Minimization approach for example selection that improves ICL performance on low-resource Indic languages.
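For context, the sketch below shows a plain embedding-similarity baseline for selecting few-shot ICL demonstrations from an example bank. It is not PromptRefine's alternating-minimization method; the encoder model and example bank are illustrative assumptions.

```python
# Hypothetical nearest-neighbour demonstration selection for in-context learning.
from sentence_transformers import SentenceTransformer, util

# Toy example bank of (input, output) demonstrations (placeholder content).
example_bank = [
    ("Translate 'good morning' into Hindi.", "suprabhat"),
    ("Translate 'thank you' into Bengali.", "dhonnobad"),
    ("What is the capital of India?", "New Delhi"),
]

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # placeholder model
bank_embeddings = encoder.encode([q for q, _ in example_bank], convert_to_tensor=True)

def select_demonstrations(query: str, k: int = 2):
    """Return the k bank examples most similar to the query, for use as ICL demonstrations."""
    query_embedding = encoder.encode(query, convert_to_tensor=True)
    similarities = util.cos_sim(query_embedding, bank_embeddings)[0]
    top = similarities.argsort(descending=True)[:k]
    return [example_bank[i] for i in top.tolist()]

print(select_demonstrations("Translate 'hello' into Hindi."))
```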
arXiv Detail & Related papers (2024-12-07T17:51:31Z)
- Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges [4.548047308860141]
Natural Language Processing is revolutionizing the way legal professionals and laypersons operate in the legal field.
This survey follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework, reviewing 148 studies, with a final selection of 127 after manual filtering.
It explores foundational concepts related to Natural Language Processing in the legal domain.
arXiv Detail & Related papers (2024-10-25T01:17:02Z)
- Augmenting Legal Decision Support Systems with LLM-based NLI for Analyzing Social Media Evidence [0.0]
This paper presents our system description and error analysis for our entry to the NLLP 2024 shared task on Legal Natural Language Inference (L-NLI).
The task required classifying the relationship between a review and a complaint as entailed, contradicted, or neutral.
Our system emerged as the winning submission, outperforming other entries by a substantial margin and demonstrating the effectiveness of our approach to legal text analysis.
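To make the task format concrete, the sketch below classifies a premise-hypothesis pair into the three L-NLI labels with a generic off-the-shelf NLI model. It is not the winning system described above; the model name and example texts are illustrative assumptions.

```python
# Hypothetical three-way NLI classification (contradiction / neutral / entailment).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-large-mnli"  # placeholder off-the-shelf NLI model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "The review says the product stopped working after one day."
hypothesis = "The complaint alleges that the product was defective."

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# roberta-large-mnli maps label ids to CONTRADICTION / NEUTRAL / ENTAILMENT.
print(model.config.id2label[int(logits.argmax(dim=-1))])
```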
arXiv Detail & Related papers (2024-10-21T13:20:15Z)
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
- From Multiple-Choice to Extractive QA: A Case Study for English and Arabic [51.13706104333848]
We explore the feasibility of repurposing an existing multilingual dataset for a new NLP task.
We present annotation guidelines and a parallel extractive QA (EQA) dataset for English and Modern Standard Arabic.
We aim to help others adapt our approach for the remaining 120 BELEBELE language variants, many of which are deemed under-resourced.
arXiv Detail & Related papers (2024-04-26T11:46:05Z)
- SemEval 2023 Task 6: LegalEval - Understanding Legal Texts [2.172613863157655]
There is a need for developing NLP-based techniques for processing and automatically understanding legal documents.
The LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about automatically structuring legal documents into semantically coherent units; Task-B (Legal Named Entity Recognition) deals with identifying relevant entities in a legal document; and Task-C (Court Judgement Prediction with Explanation) explores the possibility of automatically predicting the outcome of a legal case.
In each of the sub-tasks, the proposed systems outperformed the baselines; however, there is a lot of scope for improvement.
arXiv Detail & Related papers (2023-04-19T10:28:32Z)
- Miko Team: Deep Learning Approach for Legal Question Answering in ALQAC 2022 [2.242125769416219]
We introduce efficient deep learning-based methods for legal document processing in the Automated Legal Question Answering Competition (ALQAC 2022).
Our method is based on the XLM-RoBERTa model, which is pre-trained on a large unlabeled corpus before being fine-tuned for the specific tasks.
The experimental results show that our method works well on legal information retrieval tasks with limited labeled data.
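As a rough sketch of this kind of setup, the snippet below fine-tunes xlm-roberta-base as a pairwise relevance classifier over (question, article) pairs. The toy data, hyperparameters, and classification framing are assumptions, not the Miko Team's actual configuration.

```python
# Hypothetical fine-tuning of XLM-RoBERTa for question-article relevance classification.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Toy (question, article, is_relevant) triples; in practice these would be
# Vietnamese legal questions paired with statute articles.
train_pairs = [
    ("What is the penalty for theft?", "Article X. Theft of property is punishable by ...", 1),
    ("What is the penalty for theft?", "Article Y. Interest rates in loan contracts ...", 0),
]

model.train()
for question, article, label in train_pairs:
    batch = tokenizer(question, article, return_tensors="pt", truncation=True, max_length=256)
    loss = model(**batch, labels=torch.tensor([label])).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```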
arXiv Detail & Related papers (2022-11-04T00:50:20Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release a Longformer-based pre-trained language model, named Lawformer, for understanding long Chinese legal documents.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
- Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension [86.1617182312817]
We propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision.
The first is a mixed machine reading comprehension task, which translates the question or passage into other languages and builds cross-lingual question-passage pairs.
The second is a language-agnostic knowledge masking task that leverages knowledge phrases mined from the web.
arXiv Detail & Related papers (2020-04-29T10:44:00Z)
- How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence [81.04070052740596]
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain.
This paper introduces the history, the current state, and the future directions of research in LegalAI.
arXiv Detail & Related papers (2020-04-25T14:45:15Z)