Legal Search in Case Law and Statute Law
- URL: http://arxiv.org/abs/2108.10127v1
- Date: Mon, 23 Aug 2021 12:51:24 GMT
- Title: Legal Search in Case Law and Statute Law
- Authors: Julien Rossi, Evangelos Kanoulas
- Abstract summary: We describe a method to identify document pairwise relevance in the context of a typical legal document collection.
We review the usage of generalized language models, including supervised and unsupervised learning.
- Score: 12.697393184074457
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work we describe a method to identify document pairwise relevance in
the context of a typical legal document collection: limited resources, long
queries and long documents. We review the usage of generalized language models,
including supervised and unsupervised learning. We observe how our method,
while using text summaries, outperforms existing baselines based on full text,
and motivate potential improvement directions for future work.
Related papers
- Unified Multi-Modal Interleaved Document Representation for Information Retrieval [57.65409208879344]
We produce more comprehensive and nuanced document representations by holistically embedding documents interleaved with different modalities.
Specifically, we achieve this by leveraging the capability of recent vision-language models that enable the processing and integration of text, images, and tables into a unified format and representation.
arXiv Detail & Related papers (2024-10-03T17:49:09Z)
- Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval [18.058942674792604]
We propose a novel few-shot workflow tailored to relevance judgments of legal cases.
By comparing the relevance judgments of LLMs and human experts, we empirically show that we can obtain reliable relevance judgments.
arXiv Detail & Related papers (2024-03-27T09:46:56Z)
- A Deep Learning-Based System for Automatic Case Summarization [2.9141777969894966]
This paper presents a deep learning-based system for efficient automatic case summarization.
The system offers both supervised and unsupervised methods to generate concise and relevant summaries of lengthy legal case documents.
Future work will focus on refining summarization techniques and exploring the application of our methods to other types of legal texts.
arXiv Detail & Related papers (2023-12-13T01:18:10Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- Unified Pretraining Framework for Document Understanding [52.224359498792836]
We present UDoc, a new unified pretraining framework for document understanding.
UDoc is designed to support most document understanding tasks, extending the Transformer to take multimodal embeddings as input.
An important feature of UDoc is that it learns a generic representation by making use of three self-supervised losses.
arXiv Detail & Related papers (2022-04-22T21:47:04Z)
- An Evaluation Dataset for Legal Word Embedding: A Case Study On Chinese Codex [3.1854529627213273]
Word embedding is a modern distributed word representation approach widely used in many natural language processing tasks.
This paper proposes establishing a Legal Analogical Reasoning Questions Set (LARQS) of 1,134 questions from the 2,388 Chinese Codex corpus using five kinds of legal relations.
arXiv Detail & Related papers (2022-03-29T01:26:26Z)
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English [15.026117429782996]
We introduce the Legal General Language Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks.
We also provide an evaluation and analysis of several generic and legal-oriented models demonstrating that the latter consistently offer performance improvements across multiple tasks.
arXiv Detail & Related papers (2021-10-03T10:50:51Z)
- Summarizing Text on Any Aspects: A Knowledge-Informed Weakly-Supervised Approach [89.56158561087209]
We study summarizing on arbitrary aspects relevant to the document.
Due to the lack of supervision data, we develop a new weak supervision construction method and an aspect modeling scheme.
Experiments show our approach achieves performance boosts on summarizing both real and synthetic documents.
arXiv Detail & Related papers (2020-10-14T03:20:46Z)
- Multilevel Text Alignment with Cross-Document Attention [59.76351805607481]
Existing alignment methods operate at a single, predefined level.
We propose a new learning approach that equips previously established hierarchical attention encoders for representing documents with a cross-document attention component.
arXiv Detail & Related papers (2020-10-03T02:52:28Z)
- Building Legal Case Retrieval Systems with Lexical Matching and Summarization using A Pre-Trained Phrase Scoring Model [1.9275428660922076]
We present our method for tackling the legal case retrieval task of the Competition on Legal Information Extraction/Entailment 2019.
Our approach is based on the idea that summarization is important for retrieval.
We have achieved the state-of-the-art result for the task on the benchmark of the competition.
arXiv Detail & Related papers (2020-09-29T15:10:59Z)
- A Survey on Contextual Embeddings [48.04732268018772]
Contextual embeddings assign each word a representation based on its context, capturing uses of words across varied contexts and encoding knowledge that transfers across languages.
We review existing contextual embedding models, cross-lingual polyglot pre-training, the application of contextual embeddings in downstream tasks, model compression, and model analyses.
arXiv Detail & Related papers (2020-03-16T15:22:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.