Building Legal Case Retrieval Systems with Lexical Matching and
Summarization using A Pre-Trained Phrase Scoring Model
- URL: http://arxiv.org/abs/2009.14083v1
- Date: Tue, 29 Sep 2020 15:10:59 GMT
- Title: Building Legal Case Retrieval Systems with Lexical Matching and
Summarization using A Pre-Trained Phrase Scoring Model
- Authors: Vu Tran and Minh Le Nguyen and Ken Satoh
- Abstract summary: We present our method for tackling the legal case retrieval task of the Competition on Legal Information Extraction/Entailment 2019.
Our approach is based on the idea that summarization is important for retrieval.
We have achieved the state-of-the-art result for the task on the benchmark of the competition.
- Score: 1.9275428660922076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present our method for tackling the legal case retrieval task of the
Competition on Legal Information Extraction/Entailment 2019. Our approach is
based on the idea that summarization is important for retrieval. On the one hand,
we adopt a summarization-based model called encoded summarization, which encodes
a given document into a continuous vector space that embeds the summary
properties of the document. We train the document representation model on the
resources of COLIEE 2018. On the other hand, we extract lexical
features from different parts of a given query and its candidates. We observe
that by comparing different parts of the query and its candidates, we can
achieve better performance. Furthermore, the combination of the lexical
features with latent features by the summarization-based method achieves even
better performance. We have achieved the state-of-the-art result for the task
on the benchmark of the competition.
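The combination of lexical and latent features described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the fixed weighted sum, the bag-of-words cosine used as the lexical feature, and the names `lexical_score`, `cosine`, `rank_candidates`, and `weight` are all assumptions made for illustration, and the document vectors stand in for whatever a summarization-based encoder would produce.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(query, candidate):
    """Cosine similarity between bag-of-words term-count vectors."""
    q, c = Counter(tokenize(query)), Counter(tokenize(candidate))
    dot = sum(q[t] * c[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in c.values())
    )
    return dot / norm if norm else 0.0


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(query_text, query_vec, candidates, weight=0.5):
    """Rank candidate cases by a weighted sum of lexical and latent similarity.

    `candidates` is a list of (case_id, text, vector) tuples. The vectors are
    hypothetical placeholders for encoder outputs; `weight` is an assumed
    hyperparameter, not a value taken from the paper.
    """
    scored = []
    for case_id, text, vec in candidates:
        score = weight * lexical_score(query_text, text) + (1 - weight) * cosine(
            query_vec, vec
        )
        scored.append((score, case_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [case_id for _, case_id in scored]
```

In practice the two feature families would more likely be fed to a learned ranker than summed with a hand-picked weight; the sketch only shows how scores from both sources can contribute to one ranking.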
Related papers
- Thesis: Document Summarization with applications to Keyword extraction and Image Retrieval [0.0]
We propose a set of submodular functions for opinion summarization.
Opinion summarization has built in it the tasks of summarization and sentiment detection.
Our functions generate summaries such that there is a good correlation between document sentiment and summary sentiment, along with a good ROUGE score.
arXiv Detail & Related papers (2024-05-20T21:27:18Z)
- Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs).
Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy.
At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z)
- Investigating Consistency in Query-Based Meeting Summarization: A Comparative Study of Different Embedding Methods [0.0]
Text summarization is one of the best-known applications in the field of Natural Language Processing (NLP).
It aims to automatically generate a summary containing the important information in a given context.
In this paper, we are inspired by "QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization" proposed by Microsoft.
We also propose our Locater model, designed to extract relevant spans based on a given transcript and query, which are then summarized by the Summarizer model.
arXiv Detail & Related papers (2024-02-10T08:25:30Z)
- UnifieR: A Unified Retriever for Large-Scale Retrieval [84.61239936314597]
Large-scale retrieval aims to recall relevant documents from a huge collection given a query.
Recent retrieval methods based on pre-trained language models (PLMs) can be coarsely categorized into either dense-vector or lexicon-based paradigms.
We propose a new learning framework, UnifieR, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability.
arXiv Detail & Related papers (2022-05-23T11:01:59Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document, where the top level captures the long-range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- CODER: An efficient framework for improving retrieval through COntextualized Document Embedding Reranking [11.635294568328625]
We present a framework for improving the performance of a wide class of retrieval models at minimal computational cost.
It utilizes precomputed document representations extracted by a base dense retrieval method.
It incurs a negligible computational overhead on top of any first-stage method at run time, allowing it to be easily combined with any state-of-the-art dense retrieval method.
arXiv Detail & Related papers (2021-12-16T10:25:26Z)
- iFacetSum: Coreference-based Interactive Faceted Summarization for Multi-Document Exploration [63.272359227081836]
iFacetSum integrates interactive summarization together with faceted search.
Fine-grained facets are automatically produced based on cross-document coreference pipelines.
arXiv Detail & Related papers (2021-09-23T20:01:11Z)
- RetrievalSum: A Retrieval Enhanced Framework for Abstractive Summarization [25.434558112121778]
We propose a novel retrieval enhanced abstractive summarization framework consisting of a dense Retriever and a Summarizer.
We validate our method on a wide range of summarization datasets across multiple domains and two backbone models: BERT and BART.
Results show that our framework obtains a significant improvement of 1.38 to 4.66 in ROUGE-1 score when compared with the powerful pre-trained models.
arXiv Detail & Related papers (2021-09-16T12:52:48Z)
- Legal Search in Case Law and Statute Law [12.697393184074457]
We describe a method to identify document pairwise relevance in the context of a typical legal document collection.
We review the usage of generalized language models, including supervised and unsupervised learning.
arXiv Detail & Related papers (2021-08-23T12:51:24Z)
- Text Summarization with Latent Queries [60.468323530248945]
We introduce LaQSum, the first unified text summarization system that learns Latent Queries from documents for abstractive summarization with any existing query forms.
Under a deep generative framework, our system jointly optimizes a latent query model and a conditional language model, allowing users to plug-and-play queries of any type at test time.
Our system robustly outperforms strong comparison systems across summarization benchmarks with different query types, document settings, and target domains.
arXiv Detail & Related papers (2021-05-31T21:14:58Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)