Citation Recommendation on Scholarly Legal Articles
- URL: http://arxiv.org/abs/2311.05902v1
- Date: Fri, 10 Nov 2023 07:11:55 GMT
- Title: Citation Recommendation on Scholarly Legal Articles
- Authors: Doğukan Arslan, Saadet Sena Erdoğan and Gülşen Eryiğit
- Abstract summary: Citation recommendation is used within the legal domain to identify supporting arguments.
BM25 is a strong benchmark for the legal citation recommendation task.
Fine-tuning leads to considerable performance increases in pre-trained models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Citation recommendation is the task of finding appropriate citations based on
a given piece of text. The proposed datasets for this task consist mainly of
several scientific fields, lacking some core ones, such as law. Furthermore,
citation recommendation is used within the legal domain to identify supporting
arguments, utilizing non-scholarly legal articles. In order to alleviate the
limitations of existing studies, we gather the first scholarly legal dataset
for the task of citation recommendation. Also, we conduct experiments with
state-of-the-art models and compare their performance on this dataset. The
study suggests that, while BM25 is a strong benchmark for the legal citation
recommendation task, the most effective method involves implementing a two-step
process that entails pre-fetching with BM25+, followed by re-ranking with
SciNCL, which enhances the performance of the baseline from 0.26 to 0.30
MAP@10. Moreover, fine-tuning leads to considerable performance increases in
pre-trained models, which shows the importance of including legal articles in
the training data of these models.
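The two-step pipeline described above (lexical pre-fetching followed by dense re-ranking) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the BM25 scorer stands in for BM25+, the toy vectors stand in for SciNCL embeddings, and the AP@k function shows how the reported MAP@10 figures (0.26 vs. 0.30) would be computed per query before averaging.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with classic BM25
    (a stand-in for the BM25+ pre-fetcher in the paper)."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    df = Counter()
    for toks in tokenized:
        for term in set(toks):
            df[term] += 1
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(s)
    return scores

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv) if nu and nv else 0.0

def two_step_retrieve(query, docs, query_vec, doc_vecs, prefetch_k=100):
    """Step 1: pre-fetch top candidates by BM25.
    Step 2: re-rank those candidates by embedding similarity
    (SciNCL embeddings in the paper; toy vectors here)."""
    lexical = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)),
                        key=lambda i: lexical[i], reverse=True)[:prefetch_k]
    return sorted(candidates,
                  key=lambda i: cosine(query_vec, doc_vecs[i]), reverse=True)

def average_precision_at_k(ranking, relevant, k=10):
    """AP@k for one query; MAP@10 is this value averaged over all queries."""
    hits, precision_sum = 0, 0.0
    for rank, doc_id in enumerate(ranking[:k], start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / min(len(relevant), k) if relevant else 0.0
```

A re-ranker only helps when the pre-fetcher already places the relevant citation among its candidates, which is why the lexical and dense steps are complementary: BM25 supplies high recall cheaply, and the embedding model refines the ordering.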
Related papers
- From Words to Worth: Newborn Article Impact Prediction with LLM [69.41680520058418]
This paper introduces a promising approach, leveraging the capabilities of fine-tuned LLMs to predict the future impact of newborn articles.
A comprehensive dataset has been constructed and released for fine-tuning the LLM, containing over 12,000 entries with corresponding titles, abstracts, and TNCSI_SP.
arXiv Detail & Related papers (2024-08-07T17:52:02Z)
- CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation [44.67578050648625]
We transform a large open-source legal corpus into a dataset supporting information retrieval (IR) and retrieval-augmented generation (RAG)
This dataset CLERC is constructed for training and evaluating models on their ability to (1) find corresponding citations for a given piece of legal analysis and to (2) compile the text of these citations into a cogent analysis that supports a reasoning goal.
arXiv Detail & Related papers (2024-06-24T23:57:57Z)
- ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation [31.259805200946175]
We introduce the evidence-grounded local citation recommendation task, where the target latent space comprises evidence spans for recommending specific papers.
Unlike past formulations that simply output recommendations, ILCiteR retrieves ranked lists of evidence span and recommended paper pairs.
We contribute a novel dataset for the evidence-grounded local citation recommendation task and demonstrate the efficacy of our proposed conditional neural rank-ensembling approach for re-ranking evidence spans.
arXiv Detail & Related papers (2024-03-13T17:38:05Z)
- Combining topic modelling and citation network analysis to study case law from the European Court of Human Rights on the right to respect for private and family life [0.0]
This paper focuses on case law from the European Court of Human Rights on Article 8 of the European Convention of Human Rights.
We demonstrate and compare the potential of topic modelling and citation network analysis to find and organize case law on Article 8.
We evaluate the effectiveness of the combined method on a manually collected and annotated dataset of Article 8 case law on evictions.
arXiv Detail & Related papers (2024-01-19T14:30:35Z)
- SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables [68.76415918462418]
We present SCITAB, a challenging evaluation dataset consisting of 1.2K expert-verified scientific claims.
Through extensive evaluations, we demonstrate that SCITAB poses a significant challenge to state-of-the-art models.
Our analysis uncovers several unique challenges posed by SCITAB, including table grounding, claim ambiguity, and compositional reasoning.
arXiv Detail & Related papers (2023-05-22T16:13:50Z)
- Tag-Aware Document Representation for Research Paper Recommendation [68.8204255655161]
We propose a hybrid approach that leverages deep semantic representation of research papers based on social tags assigned by users.
The proposed model is effective in recommending research papers even when the rating data is very sparse.
arXiv Detail & Related papers (2022-09-08T09:13:07Z)
- Yes-Yes-Yes: Donation-based Peer Reviewing Data Collection for ACL Rolling Review and Beyond [58.71736531356398]
We present an in-depth discussion of peer reviewing data, outline the ethical and legal desiderata for peer reviewing data collection, and propose the first continuous, donation-based data collection workflow.
We report on the ongoing implementation of this workflow at the ACL Rolling Review and deliver the first insights obtained with the newly collected data.
arXiv Detail & Related papers (2022-01-27T11:02:43Z)
- Evaluating Document Representations for Content-based Legal Literature Recommendations [6.4815284696225905]
Legal recommender systems are typically evaluated in small-scale user studies without any publicly available benchmark datasets.
We evaluate text-based (e.g., fastText, Transformers), citation-based (e.g., DeepWalk, Poincaré), and hybrid methods.
Our experiments show that document representations from averaged fastText word vectors (trained on legal corpora) yield the best results.
arXiv Detail & Related papers (2021-04-28T15:48:19Z)
- Learning Fine-grained Fact-Article Correspondence in Legal Cases [19.606628325747938]
We create a corpus with manually annotated fact-article correspondences.
We parse articles into premise-conclusion pairs using a random forest.
Our best system reaches an F1 score of 96.3%, making it of great potential for practical use.
arXiv Detail & Related papers (2021-04-21T19:06:58Z)
- Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific papers summarization by utilizing their citation graph.
We construct a novel scientific papers summarization dataset Semantic Scholar Network (SSN) which contains 141K research papers in different domains.
Our model achieves competitive performance compared with pretrained models.
arXiv Detail & Related papers (2021-04-07T11:13:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.