When Large Language Models Meet Citation: A Survey
- URL: http://arxiv.org/abs/2309.09727v1
- Date: Mon, 18 Sep 2023 12:48:48 GMT
- Authors: Yang Zhang, Yufei Wang, Kai Wang, Quan Z. Sheng, Lina Yao, Adnan
Mahmood, Wei Emma Zhang and Rongying Zhao
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Citations in scholarly work serve the essential purpose of acknowledging and
crediting the original sources of knowledge that have been incorporated or
referenced. Depending on their surrounding textual context, these citations are
used for different motivations and purposes. Large Language Models (LLMs) could
help capture this fine-grained citation information via the corresponding
textual context, thereby enabling a better understanding of the literature.
Furthermore, citations also establish connections among scientific papers,
providing high-quality inter-document relationships and human-constructed
knowledge. Such information could be incorporated into LLM pre-training to
improve the text representations of LLMs. Therefore, in this
paper, we offer a preliminary review of the mutually beneficial relationship
between LLMs and citation analysis. Specifically, we review the application of
LLMs for in-text citation analysis tasks, including citation classification,
citation-based summarization, and citation recommendation. We then summarize
the research pertinent to leveraging citation linkage knowledge to improve the
text representations of LLMs via citation prediction, network structure
information, and inter-document relationships. Finally, we provide an overview
of these contemporary methods and put forth promising avenues for further
investigation at the intersection of LLMs and citation analysis.
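To make one of the surveyed tasks concrete, in-text citation classification can be framed as labeling the textual context around a citation marker with its function. The sketch below is a toy keyword baseline; the label set and cue words are illustrative assumptions, not taken from the survey, and the reviewed methods would replace this heuristic with an LLM or a fine-tuned classifier.

```python
# Toy in-text citation classification: assign a coarse function label
# (method, comparison, background) to the text surrounding a citation
# marker. A hand-written baseline for illustration only.

CUE_WORDS = {
    "method": ["we use", "we adopt", "following", "based on", "we apply"],
    "comparison": ["outperforms", "compared to", "in contrast", "unlike"],
}

def classify_citation_context(context: str) -> str:
    """Return a coarse citation-function label for a citation context."""
    text = context.lower()
    for label, cues in CUE_WORDS.items():
        # First matching cue word decides the label.
        if any(cue in text for cue in cues):
            return label
    return "background"  # default: citation provides background knowledge

if __name__ == "__main__":
    examples = [
        "We adopt the attention mechanism of [12] in our encoder.",
        "Our model outperforms the baseline of [7] on all benchmarks.",
        "Citation analysis has a long history in scientometrics [3].",
    ]
    for ctx in examples:
        print(classify_citation_context(ctx), "<-", ctx)
```

An LLM-based variant would replace the keyword lookup with a prompt asking the model to choose a label given the citation context, which is what makes the fine-grained context information discussed above accessible at scale.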
Related papers
- Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation (arXiv, 2024-07-01)
  Retrieval-Augmented Generation (RAG) has been widely adopted to enhance Large Language Models (LLMs). Attributed Text Generation (ATG), which provides citations to support the model's responses in RAG, has attracted growing attention. This paper proposes a fine-grained ATG method called ReClaim (Refer & Claim), which alternates the generation of references and answers step by step.
- Verifiable Generation with Subsentence-Level Fine-Grained Citations (arXiv, 2024-06-10)
  Verifiable generation requires large language models to cite the source documents supporting their outputs. Previous work mainly targets sentence-level citations, lacking specificity about which parts of a sentence are backed by the cited sources. This work studies verifiable generation with subsentence-level fine-grained citations to more precisely locate the generated content supported by the cited sources.
- Context-Enhanced Language Models for Generating Multi-Paper Citations (arXiv, 2024-04-22)
  We propose a method that leverages Large Language Models (LLMs) to generate multi-citation sentences. Our approach takes a single source paper and a collection of target papers, culminating in a coherent paragraph of multi-sentence citation text.
- Source-Aware Training Enables Knowledge Attribution in Language Models (arXiv, 2024-04-01)
  Large language models (LLMs) learn a vast amount of knowledge during pretraining, but they are often oblivious to the source(s) of that knowledge. We investigate the problem of intrinsic source citation, where LLMs are required to cite the pretraining source supporting a generated response. Our training recipe enables faithful attribution to the pretraining data without a substantial impact on the model's quality compared to standard pretraining.
- Generative Context-aware Fine-tuning of Self-supervised Speech Models (arXiv, 2023-12-15)
  We study the use of context information generated by large language models (LLMs) and propose an approach to distill this information during fine-tuning of self-supervised speech models. We evaluate the proposed approach using the SLUE and Libri-light benchmarks on several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
- Effective Large Language Model Adaptation for Improved Grounding and Citation Generation (arXiv, 2023-11-16)
  This paper focuses on improving large language models (LLMs) by grounding their responses in retrieved passages and providing citations. We propose a new framework, AGREE, that improves grounding from a holistic perspective. Our framework tunes LLMs to self-ground the claims in their responses and provide accurate citations to retrieved documents.
- Citation: A Key to Building Responsible and Accountable Large Language Models (arXiv, 2023-07-05)
  Large Language Models (LLMs) bring transformative benefits alongside unique challenges, including intellectual property (IP) and ethical concerns. This position paper explores a novel angle to mitigate these risks, drawing parallels between LLMs and established web systems.
- Enabling Large Language Models to Generate Text with Citations (arXiv, 2023-05-24)
  Large language models (LLMs) have emerged as a widely used tool for information seeking. Our aim is to allow LLMs to generate text with citations, improving their factual correctness and verifiability. We propose ALCE, the first benchmark for Automatic LLMs' Citation Evaluation.
- Inline Citation Classification using Peripheral Context and Time-evolving Augmentation (arXiv, 2023-03-01)
  We propose a new dataset, named 3Cext, which provides discourse information using the cited sentences. We also propose PeriCite, a Transformer-based deep neural network that fuses peripheral sentences and domain knowledge.
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction (arXiv, 2021-06-03)
  We use the citation graph of referential links between citing and cited papers, and observe a sizable improvement in end-to-end information extraction over the state of the art.
This list is automatically generated from the titles and abstracts of the papers in this site.