NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource
Languages through Data Enrichment
- URL: http://arxiv.org/abs/2309.05500v1
- Date: Mon, 11 Sep 2023 14:43:45 GMT
- Title: NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource
Languages through Data Enrichment
- Authors: Hai-Long Nguyen, Dieu-Quynh Nguyen, Hoang-Trung Nguyen, Thu-Trang
Pham, Huu-Dong Nguyen, Thach-Anh Nguyen, Thi-Hai-Yen Vuong, Ha-Thanh Nguyen
- Abstract summary: This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023).
Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, we propose a range of adaptive techniques to handle different question types.
Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field.
- Score: 2.441072488254427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, natural language processing has gained significant
popularity in various sectors, including the legal domain. This paper presents
NeCo Team's solutions to the Vietnamese text processing tasks provided in the
Automated Legal Question Answering Competition 2023 (ALQAC 2023), focusing on
legal domain knowledge acquisition for low-resource languages through data
enrichment. Our methods for the legal document retrieval task employ a
combination of similarity ranking and deep learning models, while for the
second task, which requires extracting an answer from a relevant legal article
in response to a question, we propose a range of adaptive techniques to handle
different question types. Our approaches achieve outstanding results on both
tasks of the competition, demonstrating the potential benefits and
effectiveness of question answering systems in the legal field, particularly
for low-resource languages.
Related papers
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
- Judgement Citation Retrieval using Contextual Similarity [0.0]
We propose a methodology that combines natural language processing (NLP) and machine learning techniques to enhance the organization and utilization of legal case descriptions.
Our methodology addresses two primary objectives: unsupervised clustering and supervised citation retrieval.
Our methodology achieved an impressive accuracy rate of 90.9%.
arXiv Detail & Related papers (2024-05-28T04:22:28Z)
- Can a Multichoice Dataset be Repurposed for Extractive Question Answering? [52.28197971066953]
We repurposed the Belebele dataset (Bandarkar et al., 2023), which was designed for multiple-choice question answering (MCQA).
We present annotation guidelines and a parallel EQA dataset for English and Modern Standard Arabic (MSA).
Our aim is to enable others to adapt our approach for the 120+ other language variants in Belebele, many of which are deemed under-resourced.
arXiv Detail & Related papers (2024-04-26T11:46:05Z)
- CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information
Retrieval and Entailment Tasks [7.0271825812050555]
This paper outlines our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition.
Our approach involved utilizing appropriate state-of-the-art deep learning methods, designing methods based on domain characteristics observation, and applying meticulous engineering practices and methodologies to the competition.
arXiv Detail & Related papers (2024-01-07T17:23:27Z)
- NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal
Information Processing [1.5593460008414899]
We present the NOWJ team's approach to the COLIEE 2023 Competition, which focuses on advancing legal information processing techniques.
We employ state-of-the-art machine learning models and innovative approaches, such as BERT, Longformer, BM25-ranking algorithm, and multi-task learning models.
arXiv Detail & Related papers (2023-06-08T03:10:49Z)
- SemEval 2023 Task 6: LegalEval - Understanding Legal Texts [2.172613863157655]
There is a need for developing NLP-based techniques for processing and automatically understanding legal documents.
LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about automatically structuring legal documents into semantically coherent units, Task-B (Legal Named Entity Recognition) deals with identifying relevant entities in a legal document, Task-C (Court Judgement Prediction with Explanation) explores the possibility of automatically predicting the outcome of a legal case.
In each of the sub-tasks, the proposed systems outperformed the baselines; however, there is a lot of scope for
arXiv Detail & Related papers (2023-04-19T10:28:32Z)
- Miko Team: Deep Learning Approach for Legal Question Answering in ALQAC
2022 [2.242125769416219]
We introduce efficient deep learning-based methods for legal document processing in the Automated Legal Question Answering Competition (ALQAC 2022).
Our method is based on the XLM-RoBERTa model that is pre-trained from a large amount of unlabeled corpus before fine-tuning to the specific tasks.
The experimental results showed that our method works well in legal information retrieval tasks with limited labeled data.
arXiv Detail & Related papers (2022-11-04T00:50:20Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release a Longformer-based pre-trained language model, named Lawformer, for understanding long Chinese legal documents.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
- Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task
Feasibility in Interactive Visual Environments [54.405920619915655]
We introduce Mobile app Tasks with Iterative Feedback (MoTIF), a dataset with natural language commands for the greatest number of interactive environments to date.
MoTIF is the first to contain natural language requests for interactive environments that are not satisfiable.
We perform initial feasibility classification experiments and only reach an F1 score of 37.3, verifying the need for richer vision-language representations.
arXiv Detail & Related papers (2021-04-17T14:48:02Z)
- Enhancing Answer Boundary Detection for Multilingual Machine Reading
Comprehension [86.1617182312817]
We propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision.
A mixed Machine Reading task, which translates the question or passage to other languages and builds cross-lingual question-passage pairs.
A language-agnostic knowledge masking task by leveraging knowledge phrases mined from web.
arXiv Detail & Related papers (2020-04-29T10:44:00Z)
- How Does NLP Benefit Legal System: A Summary of Legal Artificial
Intelligence [81.04070052740596]
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain.
This paper introduces the history, the current state, and the future directions of research in LegalAI.
arXiv Detail & Related papers (2020-04-25T14:45:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.