Court Judgement Labeling on HKLII
- URL: http://arxiv.org/abs/2208.04225v1
- Date: Wed, 3 Aug 2022 06:32:16 GMT
- Title: Court Judgement Labeling on HKLII
- Authors: Yuchen Liu, Ben Kao, Michael MK Cheung, Tien-Hsuan Wu
- Abstract summary: HKLII has served as the repository of legal documents in Hong Kong for a decade.
Our team aims to incorporate NLP techniques into the website to make it more intelligent.
- Score: 17.937279252256594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: HKLII has served as the repository of legal documents in Hong Kong for a
decade. Our team aims to incorporate NLP techniques into the website to make it
more intelligent. As a step toward this goal, the task addressed here is to label each
court judgement with tags. These tags are legally significant: they summarize the
judgement and can guide the user to similar judgements. We introduce a
heuristic system to solve the problem, which starts from Aspect-driven Topic
Modeling and uses Dependency Parsing and Constituency Parsing for phrase
generation. We also construct a legal term tree for Hong Kong and implement a
sentence simplification module to support the system. Finally, we propose a
similar document recommendation algorithm based on the generated tags. It
enables users to find similar documents based on a few selected aspects rather
than the whole passage. Experimental results show that this system outperforms
the compared approaches on this task: it summarizes documents better than a
simple term extraction method, and the recommendation algorithm
is more effective than full-text comparison approaches. We believe that the
system has considerable potential in law as well as in other areas.
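The abstract does not specify how the recommendation algorithm scores documents, so the following is only a minimal sketch of the general idea of comparing judgements on a few selected aspects rather than on full text. It assumes each judgement has already been labeled with tags grouped by aspect, and it swaps in plain Jaccard overlap as the similarity measure. The aspect names, tag values, and the functions `jaccard`, `aspect_similarity`, and `recommend` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical data model: each judgement maps an aspect name to a set of tags.
TagsByAspect = Dict[str, Set[str]]

# Made-up example corpus, for illustration only.
EXAMPLE_CORPUS: Dict[str, TagsByAspect] = {
    "case_a": {"offence": {"theft", "burglary"}, "sentence": {"imprisonment"}},
    "case_b": {"offence": {"theft"}, "sentence": {"fine"}},
    "case_c": {"offence": {"fraud"}, "sentence": {"imprisonment"}},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two tag sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def aspect_similarity(
    query: TagsByAspect, candidate: TagsByAspect, aspects: Iterable[str]
) -> float:
    """Mean Jaccard overlap computed over the selected aspects only."""
    selected = list(aspects)
    if not selected:
        return 0.0
    total = sum(
        jaccard(query.get(name, set()), candidate.get(name, set()))
        for name in selected
    )
    return total / len(selected)


def recommend(
    query_id: str,
    corpus: Dict[str, TagsByAspect],
    aspects: Iterable[str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank every other judgement by similarity on the chosen aspects."""
    selected = list(aspects)
    query = corpus[query_id]
    scored = [
        (doc_id, aspect_similarity(query, tags, selected))
        for doc_id, tags in corpus.items()
        if doc_id != query_id
    ]
    # Highest score first; ties broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]
```

With this toy corpus, `recommend("case_a", EXAMPLE_CORPUS, ["offence"])` ranks `case_b` first, while selecting `["sentence"]` ranks `case_c` first. The point of the sketch is that the ranking depends on which aspects the user selects, which a single full-text similarity score cannot express.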
Related papers
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
- DAPR: A Benchmark on Document-Aware Passage Retrieval [57.45793782107218]
We propose and name this task Document-Aware Passage Retrieval (DAPR).
While analyzing the errors of the State-of-The-Art (SoTA) passage retrievers, we find the major errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z)
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- Rhetorical Role Labeling of Legal Documents using Transformers and Graph Neural Networks [1.290382979353427]
This paper presents the approaches undertaken to perform the task of rhetorical role labelling on Indian Court Judgements as part of SemEval Task 6: understanding legal texts, shared subtask A.
arXiv Detail & Related papers (2023-05-06T17:04:51Z)
- Precise Zero-Shot Dense Retrieval without Relevance Labels [60.457378374671656]
Hypothetical Document Embeddings (HyDE) is a zero-shot dense retrieval system.
We show that HyDE significantly outperforms the state-of-the-art unsupervised dense retriever Contriever.
arXiv Detail & Related papers (2022-12-20T18:09:52Z)
- Text Summarization with Oracle Expectation [88.39032981994535]
Extractive summarization produces summaries by identifying and concatenating the most important sentences in a document.
Most summarization datasets do not come with gold labels indicating whether document sentences are summary-worthy.
We propose a simple yet effective labeling algorithm that creates soft, expectation-based sentence labels.
arXiv Detail & Related papers (2022-09-26T14:10:08Z)
- An Efficient Coarse-to-Fine Facet-Aware Unsupervised Summarization Framework based on Semantic Blocks [27.895044398724664]
We propose an efficient Coarse-to-Fine Facet-Aware Ranking (C2F-FAR) framework for unsupervised long document summarization.
In the coarse-level stage, we propose a new segment algorithm to split the document into facet-aware semantic blocks and then filter insignificant blocks.
In the fine-level stage, we select salient sentences in each block and then extract the final summary from selected sentences.
arXiv Detail & Related papers (2022-08-17T12:18:36Z)
- An Evaluation Framework for Legal Document Summarization [1.9709122688953327]
A law practitioner has to go through numerous lengthy legal case proceedings for their practices of various categories, such as land dispute, corruption, etc.
It is important to summarize these documents, and ensure that summaries contain phrases with intent matching the category of the case.
We propose an automated intent-based summarization metric, which shows a better agreement with human evaluation as compared to other automated metrics like BLEU, ROUGE-L etc.
arXiv Detail & Related papers (2022-05-17T16:42:03Z)
- GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidences in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
- Aspect-based Document Similarity for Research Papers [4.661692753666685]
We extend similarity with aspect information by performing a pairwise document classification task.
We evaluate our aspect-based document similarity for research papers.
Our results show SciBERT as the best performing system.
arXiv Detail & Related papers (2020-10-13T13:51:21Z)
- Building Legal Case Retrieval Systems with Lexical Matching and Summarization using A Pre-Trained Phrase Scoring Model [1.9275428660922076]
We present our method for tackling the legal case retrieval task of the Competition on Legal Information Extraction/Entailment 2019.
Our approach is based on the idea that summarization is important for retrieval.
We have achieved the state-of-the-art result for the task on the benchmark of the competition.
arXiv Detail & Related papers (2020-09-29T15:10:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.