CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information
Retrieval and Entailment Tasks
- URL: http://arxiv.org/abs/2401.03551v1
- Date: Sun, 7 Jan 2024 17:23:27 GMT
- Authors: Chau Nguyen, Phuong Nguyen, Thanh Tran, Dat Nguyen, An Trieu, Tin
Pham, Anh Dang, Le-Minh Nguyen
- Abstract summary: This paper outlines our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition.
Our approach involved utilizing appropriate state-of-the-art deep learning methods, designing methods based on domain characteristics observation, and applying meticulous engineering practices and methodologies to the competition.
- Score: 7.0271825812050555
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Competition on Legal Information Extraction/Entailment (COLIEE) is held
annually to encourage advancements in the automatic processing of legal texts.
Processing legal documents is challenging due to the intricate structure and
meaning of legal language. In this paper, we outline our strategies for
tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition. Our
approach involved utilizing appropriate state-of-the-art deep learning methods,
designing methods based on domain characteristics observation, and applying
meticulous engineering practices and methodologies to the competition. As a
result, our performance in these tasks has been outstanding, with first places
in Task 2 and Task 3, and promising results in Task 4. Our source code is
available at https://github.com/Nguyen2015/CAPTAIN-COLIEE2023/tree/coliee2023.
Related papers
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
(2024-06-21T06:19:03Z)
- Little Giants: Exploring the Potential of Small LLMs as Evaluation Metrics in Summarization in the Eval4NLP 2023 Shared Task [53.163534619649866]
This paper focuses on assessing the effectiveness of prompt-based techniques to empower Large Language Models to handle the task of quality estimation.
We conducted systematic experiments with various prompting techniques, including standard prompting, prompts informed by annotator instructions, and innovative chain-of-thought prompting.
Our work reveals that combining these approaches using a "small", open source model (orca_mini_v3_7B) yields competitive results.
(2023-11-01T17:44:35Z)
- NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource Languages through Data Enrichment [2.441072488254427]
This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023).
Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, we propose a range of adaptive techniques to handle different question types.
Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field.
(2023-09-11T14:43:45Z)
- NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal Information Processing [1.5593460008414899]
We present the NOWJ team's approach to the COLIEE 2023 Competition, which focuses on advancing legal information processing techniques.
We employ state-of-the-art machine learning models and innovative approaches, such as BERT, Longformer, BM25-ranking algorithm, and multi-task learning models.
(2023-06-08T03:10:49Z)
- ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images [198.35937007558078]
The competition opened on 30th December, 2022 and closed on 24th March, 2023.
There are 35 participants and 91 valid submissions received for Track 1, and 15 participants and 26 valid submissions received for Track 2.
According to the performance of the submissions, we believe there is still a large gap on the expected information extraction performance for complex and zero-shot scenarios.
(2023-06-05T22:20:52Z)
- THUIR@COLIEE 2023: More Parameters and Legal Knowledge for Legal Case Entailment [16.191450092389722]
This paper describes the approach of the THUIR team at the COLIEE 2023 Legal Case Entailment task.
We try traditional lexical matching methods and pre-trained language models with different sizes.
We placed third in COLIEE 2023.
(2023-05-11T14:11:48Z)
- THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval [16.191450092389722]
This paper summarizes the approach of the championship team THUIR in COLIEE 2023.
To be specific, we design structure-aware pre-trained language models to enhance the understanding of legal cases.
In the end, learning-to-rank methods are employed to merge features with different dimensions.
(2023-05-11T14:08:53Z)
- ICDAR 2023 Competition on Reading the Seal Title [58.866588777012744]
To promote research in this area, we organized the ICDAR 2023 competition on reading the seal title (ReST).
We constructed a dataset of 10,000 real seal data, covering the most common classes of seals, and labeled all seal title texts with text and text contents.
The competition attracted 53 participants from academia and industry including 28 submissions for Task 1 and 25 submissions for Task 2, which demonstrated significant interest in this challenging task.
(2023-04-24T10:01:41Z)
- An Uncommon Task: Participatory Design in Legal AI [64.54460979588075]
We examine a notable yet understudied AI design process in the legal domain that took place over a decade ago.
We show how an interactive simulation methodology allowed computer scientists and lawyers to become co-designers.
(2022-03-08T15:46:52Z)
- JNLP Team: Deep Learning Approaches for Legal Processing Tasks in COLIEE 2021 [1.8700700550095686]
COLIEE is an annual competition in automatic computerized legal text processing.
In this article, we report our methods and experimental results in using deep learning in legal document processing.
(2021-06-25T03:31:12Z)
- Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
(2020-09-28T10:28:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.