UQLegalAI@COLIEE2025: Advancing Legal Case Retrieval with Large Language Models and Graph Neural Networks
- URL: http://arxiv.org/abs/2505.20743v1
- Date: Tue, 27 May 2025 05:32:50 GMT
- Title: UQLegalAI@COLIEE2025: Advancing Legal Case Retrieval with Large Language Models and Graph Neural Networks
- Authors: Yanran Tang, Ruihong Qiu, Zi Huang,
- Abstract summary: Legal case retrieval plays a pivotal role in the legal domain by facilitating the efficient identification of relevant cases.<n>The Competition on Legal Information Extraction and Entailment (COLIEE) is held annually, offering updated benchmark datasets for evaluation.<n>This paper presents a detailed description of CaseLink, the method employed by UQLegalAI, the second highest team in Task 1 of COLIEE 2025.
- Score: 26.294747463024017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Legal case retrieval plays a pivotal role in the legal domain by facilitating the efficient identification of relevant cases, supporting legal professionals and researchers to propose legal arguments and make informed decision-making. To improve retrieval accuracy, the Competition on Legal Information Extraction and Entailment (COLIEE) is held annually, offering updated benchmark datasets for evaluation. This paper presents a detailed description of CaseLink, the method employed by UQLegalAI, the second highest team in Task 1 of COLIEE 2025. The CaseLink model utilises inductive graph learning and Global Case Graphs to capture the intrinsic case connectivity to improve the accuracy of legal case retrieval. Specifically, a large language model specialized in text embedding is employed to transform legal texts into embeddings, which serve as the feature representations of the nodes in the constructed case graph. A new contrastive objective, incorporating a regularization on the degree of case nodes, is proposed to leverage the information within the case reference relationship for model optimization. The main codebase used in our method is based on an open-sourced repo of CaseLink: https://github.com/yanran-tang/CaseLink.
Related papers
- Co-DETECT: Collaborative Discovery of Edge Cases in Text Classification [89.62851347390959]
Co-DETECT (Collaborative Discovery of Edge cases in TExt ClassificaTion) is a novel mixed-initiative annotation framework.<n>It integrates human expertise with automatic annotation guided by large language models.
arXiv Detail & Related papers (2025-07-07T13:48:54Z) - A Data Science Approach to Calcutta High Court Judgments: An Efficient LLM and RAG-powered Framework for Summarization and Similar Cases Retrieval [2.359291431338925]
This research presents a framework to analyze Calcutta High Court verdicts.<n>By fine-tuning the Pegasus model, we achieve significant improvements in the summarization of legal cases.<n>The RAG-powered framework efficiently retrieves similar cases in response to user queries, offering thorough overviews and summaries.
arXiv Detail & Related papers (2025-06-28T20:24:34Z) - A Reproducibility Study of Graph-Based Legal Case Retrieval [1.6819960041696331]
CaseLink is a graph-based method for legal case retrieval.<n>CaseLink captures higher-order relationships of cases going beyond the stand-alone level of documents.<n>Challenges in reproducing novel results have recently been highlighted.
arXiv Detail & Related papers (2025-04-11T10:04:12Z) - Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs [67.54302101989542]
Legal case retrieval aims to provide similar cases as references for a given fact description.
Existing works mainly focus on case-to-case retrieval using lengthy queries.
Data scale is insufficient to satisfy the training requirements of existing data-hungry neural models.
arXiv Detail & Related papers (2024-10-09T06:26:39Z) - CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation [44.67578050648625]
We transform a large open-source legal corpus into a dataset supporting information retrieval (IR) and retrieval-augmented generation (RAG)
This dataset CLERC is constructed for training and evaluating models on their ability to (1) find corresponding citations for a given piece of legal analysis and to (2) compile the text of these citations into a cogent analysis that supports a reasoning goal.
arXiv Detail & Related papers (2024-06-24T23:57:57Z) - CaseGNN++: Graph Contrastive Learning for Legal Case Retrieval with Graph Augmentation [25.574138465986977]
Legal case retrieval (LCR) is a specialised information retrieval task that aims to find relevant cases to a given query case.
CaseGNN++ is proposed to simultaneously leverage the edge information and additional label data to discover the latent potential of LCR models.
arXiv Detail & Related papers (2024-05-20T05:16:52Z) - DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z) - Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval [18.058942674792604]
We propose a novel few-shot workflow tailored to the relevant judgment of legal cases.
By comparing the relevance judgments of LLMs and human experts, we empirically show that we can obtain reliable relevance judgments.
arXiv Detail & Related papers (2024-03-27T09:46:56Z) - Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model
Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z) - CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case
Encoding [15.685369142294693]
CaseEncoder is a legal document encoder that leverages fine-grained legal knowledge in both the data sampling and pre-training phases.
CaseEncoder significantly outperforms both existing general pre-training models and legal-specific pre-training models in zero-shot legal case retrieval.
arXiv Detail & Related papers (2023-05-09T12:40:19Z) - SAILER: Structure-aware Pre-trained Language Model for Legal Case
Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.