When Large Language Models Meet Citation: A Survey
- URL: http://arxiv.org/abs/2309.09727v1
- Date: Mon, 18 Sep 2023 12:48:48 GMT
- Title: When Large Language Models Meet Citation: A Survey
- Authors: Yang Zhang, Yufei Wang, Kai Wang, Quan Z. Sheng, Lina Yao, Adnan Mahmood, Wei Emma Zhang and Rongying Zhao
- Abstract summary: Large Language Models (LLMs) could be helpful in capturing fine-grained citation information via the corresponding textual context.
Citations also establish connections among scientific papers, providing high-quality inter-document relationships.
We review the application of LLMs for in-text citation analysis tasks, including citation classification, citation-based summarization, and citation recommendation.
- Score: 37.01594297337486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Citations in scholarly work serve the essential purpose of acknowledging and
crediting the original sources of knowledge that have been incorporated or
referenced. Depending on their surrounding textual context, these citations are
used for different motivations and purposes. Large Language Models (LLMs) could
be helpful in capturing this fine-grained citation information via the
corresponding textual context, thereby enabling a better understanding of
the literature. Furthermore, these citations also establish connections among
scientific papers, providing high-quality inter-document relationships and
human-constructed knowledge. Such information could be incorporated into LLM
pre-training and improve the text representation in LLMs. Therefore, in this
paper, we offer a preliminary review of the mutually beneficial relationship
between LLMs and citation analysis. Specifically, we review the application of
LLMs for in-text citation analysis tasks, including citation classification,
citation-based summarization, and citation recommendation. We then summarize
the research pertinent to leveraging citation linkage knowledge to improve text
representations of LLMs via citation prediction, network structure information,
and inter-document relationship. We finally provide an overview of these
contemporary methods and put forth potential promising avenues in combining
LLMs and citation analysis for further investigation.
Related papers
- HLM-Cite: Hybrid Language Model Workflow for Text-based Scientific Citation Prediction [14.731720495144112]
We introduce the novel concept of core citation, which identifies the critical references that go beyond superficial mentions.
We propose HLM-Cite, a Hybrid Language Model workflow for citation prediction.
We evaluate HLM-Cite across 19 scientific fields, demonstrating a 17.6% performance improvement compared with SOTA methods.
arXiv Detail & Related papers (2024-10-10T10:46:06Z)
- Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
Retrieval-augmented generation (RAG) has been widely adopted to enhance Large Language Models (LLMs).
Attributed Text Generation (ATG) has attracted growing attention, which provides citations to support the model's responses in RAG.
This paper proposes a fine-grained ATG method called ReClaim (Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
- Verifiable Generation with Subsentence-Level Fine-Grained Citations [13.931548733211436]
Verifiable generation requires large language models to cite source documents supporting their outputs.
Previous work mainly targets the generation of sentence-level citations, lacking specificity about which parts of a sentence are backed by the cited sources.
This work studies verifiable generation with subsentence-level fine-grained citations for more precise location of generated content supported by the cited sources.
arXiv Detail & Related papers (2024-06-10T09:32:37Z)
- Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias [1.7812428873698407]
Citation practices are crucial in shaping the structure of scientific knowledge, yet they are often influenced by contemporary norms and biases.
The emergence of Large Language Models (LLMs) introduces a new dynamic to these practices.
Here, we analyze these characteristics in an experiment using a dataset from AAAI, NeurIPS, ICML, and ICLR.
arXiv Detail & Related papers (2024-05-24T17:34:32Z)
- Context-Enhanced Language Models for Generating Multi-Paper Citations [35.80247519023821]
We propose a method that leverages Large Language Models (LLMs) to generate multi-citation sentences.
Our approach involves a single source paper and a collection of target papers, culminating in a coherent paragraph containing multi-sentence citation text.
arXiv Detail & Related papers (2024-04-22T04:30:36Z)
- Source-Aware Training Enables Knowledge Attribution in Language Models [81.13048060332775]
Intrinsic source citation can enhance transparency, interpretability, and verifiability.
Our training recipe can enable faithful attribution to the pretraining data without a substantial impact on the model's perplexity.
arXiv Detail & Related papers (2024-04-01T09:39:38Z)
- Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of context information generated by generative large language models (LLMs).
We propose an approach to distill the generated information during fine-tuning of self-supervised speech models.
We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z)
- Effective Large Language Model Adaptation for Improved Grounding and Citation Generation [48.07830615309543]
This paper focuses on improving large language models (LLMs) by grounding their responses in retrieved passages and by providing citations.
We propose a new framework, AGREE, that improves the grounding from a holistic perspective.
Our framework tunes LLMs to self-ground the claims in their responses and provide accurate citations to retrieved documents.
arXiv Detail & Related papers (2023-11-16T03:22:25Z)
- Enabling Large Language Models to Generate Text with Citations [37.64884969997378]
Large language models (LLMs) have emerged as a widely used tool for information seeking.
Our aim is to allow LLMs to generate text with citations, improving their factual correctness and verifiability.
We propose ALCE, the first benchmark for Automatic LLMs' Citation Evaluation.
arXiv Detail & Related papers (2023-05-24T01:53:49Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)