Learning Neural Textual Representations for Citation Recommendation
- URL: http://arxiv.org/abs/2007.04070v1
- Date: Wed, 8 Jul 2020 12:38:50 GMT
- Title: Learning Neural Textual Representations for Citation Recommendation
- Authors: Binh Thanh Kieu, Inigo Jauregi Unanue, Son Bao Pham, Hieu Xuan Phan,
Massimo Piccardi
- Abstract summary: We propose a novel approach to citation recommendation using a deep sequential representation of the documents (Sentence-BERT) cascaded with Siamese and triplet networks in a submodular scoring function.
To the best of our knowledge, this is the first approach to combine deep representations and submodular selection for a task of citation recommendation.
- Score: 7.227232362460348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid growth of the scientific literature, manually selecting
appropriate citations for a paper is becoming increasingly challenging and
time-consuming. While several approaches for automated citation recommendation
have been proposed in recent years, effective document representations for
citation recommendation are still elusive to a large extent. For this reason,
in this paper we propose a novel approach to citation recommendation which
leverages a deep sequential representation of the documents (Sentence-BERT)
cascaded with Siamese and triplet networks in a submodular scoring function. To
the best of our knowledge, this is the first approach to combine deep
representations and submodular selection for a task of citation recommendation.
Experiments have been carried out using a popular benchmark dataset - the ACL
Anthology Network corpus - and evaluated against baselines and a
state-of-the-art approach using metrics such as the MRR and F1-at-k score. The
results show that the proposed approach has been able to outperform all the
compared approaches in every measured metric.
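The ingredients named in the abstract (a triplet objective over document embeddings, a submodular selection step, and the MRR and F1-at-k metrics) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the scoring function, so the relevance-plus-facility-location objective below is an assumption chosen because it is a standard monotone submodular function. The toy vectors stand in for Sentence-BERT embeddings, and all function and parameter names (`greedy_select`, `diversity_weight`, and so on) are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector,
                 margin: float = 0.5) -> float:
    """Hinge triplet loss on cosine distance: the cited document (positive)
    should be closer to the anchor than the uncited one (negative)."""
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def greedy_select(query: Vector, candidates: Dict[str, Vector], k: int,
                  diversity_weight: float = 1.0) -> List[str]:
    """Greedy maximisation of an ASSUMED submodular objective:
    f(S) = sum_{j in S} rel(query, j) + w * sum_i max_{j in S} sim(i, j)."""
    ids = list(candidates)
    relevance = {j: max(0.0, cosine(query, candidates[j])) for j in ids}
    coverage = {i: 0.0 for i in ids}  # best similarity of i to the selected set
    selected: List[str] = []
    while len(selected) < min(k, len(ids)):
        best_id, best_gain = None, -1.0
        for j in ids:
            if j in selected:
                continue
            # Marginal gain of adding j: its relevance plus new coverage.
            gain = relevance[j] + diversity_weight * sum(
                max(0.0, max(0.0, cosine(candidates[i], candidates[j])) - coverage[i])
                for i in ids)
            if gain > best_gain:  # strict '>' keeps the first candidate on ties
                best_id, best_gain = j, gain
        selected.append(best_id)
        for i in ids:
            sim = max(0.0, cosine(candidates[i], candidates[best_id]))
            coverage[i] = max(coverage[i], sim)
    return selected


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranking, relevant set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def f1_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Harmonic mean of precision and recall over the top-k recommendations."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    if hits == 0:
        return 0.0
    precision, recall = hits / k, hits / len(relevant)
    return 2 * precision * recall / (precision + recall)
```

With a query vector `[1.0, 0.2]` and candidates `a = [1, 0]`, `b = [1, 0]`, `c = [0, 1]`, calling `greedy_select(query, candidates, k=2)` picks `a` and then `c`: once `a` is selected, the duplicate `b` adds no new coverage, which is the diminishing-returns behaviour that motivates a submodular objective over plain top-k ranking.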
Related papers
- Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
RAG has been widely adopted to enhance Large Language Models (LLMs).
Attributed Text Generation (ATG), which provides citations to support the model's responses in RAG, has attracted growing attention.
This paper proposes a fine-grained ATG method called ReClaim (Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
- ALiiCE: Evaluating Positional Fine-grained Citation Generation [54.19617927314975]
We propose ALiiCE, the first automatic evaluation framework for fine-grained citation generation.
Our framework first parses the sentence claim into atomic claims via dependency analysis and then calculates citation quality at the atomic claim level.
We evaluate the positional fine-grained citation generation performance of several Large Language Models on two long-form QA datasets.
arXiv Detail & Related papers (2024-06-19T09:16:14Z)
- ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation [31.259805200946175]
We introduce the evidence-grounded local citation recommendation task, where the target latent space comprises evidence spans for recommending specific papers.
Unlike past formulations that simply output recommendations, ILCiteR retrieves ranked lists of evidence span and recommended paper pairs.
We contribute a novel dataset for the evidence-grounded local citation recommendation task and demonstrate the efficacy of our proposed conditional neural rank-ensembling approach for re-ranking evidence spans.
arXiv Detail & Related papers (2024-03-13T17:38:05Z)
- Tag-Aware Document Representation for Research Paper Recommendation [68.8204255655161]
We propose a hybrid approach that leverages deep semantic representation of research papers based on social tags assigned by users.
The proposed model is effective in recommending research papers even when the rating data is very sparse.
arXiv Detail & Related papers (2022-09-08T09:13:07Z)
- Recommending Multiple Positive Citations for Manuscript via Content-Dependent Modeling and Multi-Positive Triplet [6.7854900381386845]
We propose a novel scientific paper model for citation recommendation, namely the Multi-Positive BERT Model for Citation Recommendation (MP-BERT4CR).
The proposed approach has the following advantages: First, the proposed multi-positive objectives are effective at recommending multiple positive candidates.
MP-BERT4CR is also more effective than prior work at retrieving the full list of co-citations and historically low-frequency co-citation pairs.
arXiv Detail & Related papers (2021-11-25T04:09:31Z)
- Evaluating Document Representations for Content-based Legal Literature Recommendations [6.4815284696225905]
Legal recommender systems are typically evaluated in small-scale user studies without any publicly available benchmark datasets.
We evaluate text-based (e.g., fastText, Transformers), citation-based (e.g., DeepWalk, Poincaré), and hybrid methods.
Our experiments show that document representations from averaged fastText word vectors (trained on legal corpora) yield the best results.
arXiv Detail & Related papers (2021-04-28T15:48:19Z)
- Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific papers summarization by utilizing their citation graph.
We construct a novel scientific paper summarization dataset, Semantic Scholar Network (SSN), which contains 141K research papers in different domains.
Our model can achieve competitive performance when compared with the pretrained models.
arXiv Detail & Related papers (2021-04-07T11:13:35Z)
- Virtual Proximity Citation (VCP): A Supervised Deep Learning Method to Relate Uncited Papers On Grounds of Citation Proximity [0.0]
This paper discusses the Virtual Citation Proximity (VCP) approach.
The actual distance between two citations in a document is used as ground truth.
This can be used to calculate the relatedness between two documents as if they had been cited in proximity, even if the documents are uncited.
arXiv Detail & Related papers (2020-09-25T12:24:00Z)
- SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression [61.97200991151141]
SummPip is an unsupervised method for multi-document summarization.
We convert the original documents to a sentence graph, taking both linguistic and deep representation into account.
We then apply spectral clustering to obtain multiple clusters of sentences, and finally compress each cluster to generate the final summary.
arXiv Detail & Related papers (2020-07-17T13:01:15Z)
- Context-Based Quotation Recommendation [60.93257124507105]
We propose a novel context-aware quote recommendation system.
It generates a ranked list of quotable paragraphs and spans of tokens from a given source document.
We conduct experiments on a collection of speech transcripts and associated news articles.
arXiv Detail & Related papers (2020-05-17T17:49:53Z)
- HybridCite: A Hybrid Model for Context-Aware Citation Recommendation [0.0]
We develop citation recommendation approaches based on embeddings, topic modeling, and information retrieval techniques.
We combine, for the first time to the best of our knowledge, the best-performing algorithms into a semi-genetic hybrid recommender system.
Our evaluation results show that a hybrid model containing embedding and information retrieval-based components outperforms its individual components and further algorithms by a large margin.
arXiv Detail & Related papers (2020-02-15T16:19:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.