Miko Team: Deep Learning Approach for Legal Question Answering in ALQAC 2022
- URL: http://arxiv.org/abs/2211.02200v1
- Date: Fri, 4 Nov 2022 00:50:20 GMT
- Title: Miko Team: Deep Learning Approach for Legal Question Answering in ALQAC 2022
- Authors: Hieu Nguyen Van, Dat Nguyen, Phuong Minh Nguyen and Minh Le Nguyen
- Abstract summary: We introduce efficient deep learning-based methods for legal document processing in the Automated Legal Question Answering Competition (ALQAC 2022).
Our method is based on the XLM-RoBERTa model, which is pre-trained on a large unlabeled corpus before being fine-tuned on the specific tasks.
The experimental results show that our method works well on legal information retrieval tasks with limited labeled data.
- Score: 2.242125769416219
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce efficient deep learning-based methods for legal document
processing, including the Legal Document Retrieval and Legal Question Answering
tasks in the Automated Legal Question Answering Competition (ALQAC 2022). In
this competition, we achieved 1st place in the first task and 3rd place in the
second task. Our method is based on the XLM-RoBERTa model, which is pre-trained
on a large unlabeled corpus before being fine-tuned on the specific tasks. The
experimental results show that our method works well on legal information
retrieval tasks with limited labeled data. In addition, this method can be
applied to other information retrieval tasks in low-resource languages.
Related papers
- Judgement Citation Retrieval using Contextual Similarity [0.0]
We propose a methodology that combines natural language processing (NLP) and machine learning techniques to enhance the organization and utilization of legal case descriptions.
Our methodology addresses two primary objectives: unsupervised clustering and supervised citation retrieval.
Our methodology achieved an accuracy of 90.9%.
arXiv Detail & Related papers (2024-05-28T04:22:28Z)
- PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval [76.50690734636477]
We propose PromptReps, which combines the advantages of both categories: no need for training and the ability to retrieve from the whole corpus.
The retrieval system harnesses both dense text embedding and sparse bag-of-words representations.
arXiv Detail & Related papers (2024-04-29T04:51:30Z)
- CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information Retrieval and Entailment Tasks [7.0271825812050555]
This paper outlines our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition.
Our approach involved using appropriate state-of-the-art deep learning methods, designing methods based on observed domain characteristics, and applying meticulous engineering practices to the competition.
arXiv Detail & Related papers (2024-01-07T17:23:27Z)
- NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource Languages through Data Enrichment [2.441072488254427]
This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023).
Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, we propose a range of adaptive techniques to handle different question types.
Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field.
arXiv Detail & Related papers (2023-09-11T14:43:45Z)
- DAPR: A Benchmark on Document-Aware Passage Retrieval [57.45793782107218]
We propose and name this task Document-Aware Passage Retrieval (DAPR).
While analyzing the errors of the state-of-the-art (SoTA) passage retrievers, we find the major errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z)
- THUIR@COLIEE 2023: More Parameters and Legal Knowledge for Legal Case Entailment [16.191450092389722]
This paper describes the approach of the THUIR team at the COLIEE 2023 Legal Case Entailment task.
We try traditional lexical matching methods and pre-trained language models of different sizes.
We placed third in COLIEE 2023.
arXiv Detail & Related papers (2023-05-11T14:11:48Z)
- Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization [89.04537372465612]
Socratic pretraining is a question-driven, unsupervised pretraining objective designed to improve controllability in summarization tasks.
Our results show that Socratic pretraining cuts task-specific labeled data requirements in half.
arXiv Detail & Related papers (2022-12-20T17:27:10Z)
- Recitation-Augmented Language Models [85.30591349383849]
We show that RECITE is a powerful paradigm for knowledge-intensive NLP tasks.
Specifically, we show that by utilizing recitation as the intermediate step, a recite-and-answer scheme can achieve new state-of-the-art performance.
arXiv Detail & Related papers (2022-10-04T00:49:20Z)
- Questions Are All You Need to Train a Dense Passage Retriever [123.13872383489172]
ART is a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.
It uses a new document-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence documents, and (2) the documents are then used to compute the probability of reconstructing the original question.
arXiv Detail & Related papers (2022-06-21T18:16:31Z)
- JNLP Team: Deep Learning Approaches for Legal Processing Tasks in COLIEE 2021 [1.8700700550095686]
COLIEE is an annual competition in automated legal text processing.
In this article, we report our methods and experimental results for applying deep learning to legal document processing.
arXiv Detail & Related papers (2021-06-25T03:31:12Z)
- Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering [54.541952928070344]
We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question.
We show that very lightweight QA models can do well at this task, but retrieval-based models are faster still.
arXiv Detail & Related papers (2020-09-18T23:39:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.