An Anchor Learning Approach for Citation Field Learning
- URL: http://arxiv.org/abs/2309.03559v2
- Date: Thu, 14 Dec 2023 12:58:11 GMT
- Title: An Anchor Learning Approach for Citation Field Learning
- Authors: Zilin Yuan, Borun Chen, Yimeng Dai, Yinghui Li, Hai-Tao Zheng, Rui Zhang
- Abstract summary: We propose a novel algorithm, CIFAL, to boost citation field learning performance.
Experiments demonstrate that CIFAL outperforms state-of-the-art methods in citation field learning.
- Score: 23.507104046870186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Citation field learning is the task of segmenting a citation string into
fields of interest such as author, title, and venue. Extracting these fields from
citations is crucial for citation indexing, researcher profile analysis, and related
applications. User-generated resources such as academic homepages and curricula vitae
provide rich citation field information. However, extracting fields from these
resources is challenging due to inconsistent citation styles, incomplete sentence
syntax, and insufficient training data. To address these challenges, we propose a
novel algorithm, CIFAL (citation field learning by anchor learning), to boost
citation field learning performance. CIFAL leverages anchor learning, which is
model-agnostic and applicable to any pre-trained language model, to capture citation
patterns from data in different citation styles. Experiments demonstrate that CIFAL
outperforms state-of-the-art methods in citation field learning, achieving a 2.68%
improvement in field-level F1-score. Extensive analysis of the results further
confirms the effectiveness of CIFAL, both quantitatively and qualitatively.
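For context, citation field learning is commonly cast as token-level sequence labeling: each token in the citation string receives a BIO tag for a field such as author, title, or venue. The sketch below illustrates that task framing with a generic pre-trained language model; it is a minimal illustration only, not the CIFAL anchor-learning algorithm, and the label set, model checkpoint, and example citation are assumptions.

```python
# Minimal sketch: citation field learning as BIO token classification.
# Illustrates the task setup only; NOT the paper's anchor-learning
# method (CIFAL). Label set, checkpoint, and example are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical BIO label set covering three citation fields.
LABELS = ["O",
          "B-AUTHOR", "I-AUTHOR",
          "B-TITLE", "I-TITLE",
          "B-VENUE", "I-VENUE"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=len(LABELS))  # classification head is untrained

citation = "J. Smith and A. Lee. Parsing Citations. In Proc. of ACL, 2020."
enc = tokenizer(citation, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits          # shape: (1, seq_len, num_labels)
pred_ids = logits.argmax(-1).squeeze(0)

tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"].squeeze(0))
for tok, pid in zip(tokens, pred_ids):
    print(f"{tok:12s} {LABELS[pid]}")     # per-token field tag (random here, since untrained)
```

The field-level F1 quoted above is typically computed over whole field spans: a prediction counts as correct only if the entire span (e.g., the complete author list) matches the gold segmentation, a stricter criterion than token-level accuracy.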
Related papers
- ALiiCE: Evaluating Positional Fine-grained Citation Generation [54.19617927314975]
We propose ALiiCE, the first automatic evaluation framework for fine-grained citation generation.
Our framework first parses the sentence claim into atomic claims via dependency analysis and then calculates citation quality at the atomic claim level.
We evaluate the positional fine-grained citation generation performance of several Large Language Models on two long-form QA datasets.
arXiv Detail & Related papers (2024-06-19T09:16:14Z)
- Context-Enhanced Language Models for Generating Multi-Paper Citations [35.80247519023821]
We propose a method that leverages Large Language Models (LLMs) to generate multi-citation sentences.
Our approach involves a single source paper and a collection of target papers, culminating in a coherent paragraph containing multi-sentence citation text.
arXiv Detail & Related papers (2024-04-22T04:30:36Z)
- Source-Aware Training Enables Knowledge Attribution in Language Models [81.13048060332775]
Intrinsic source citation can enhance transparency, interpretability, and verifiability.
Our training recipe can enable faithful attribution to the pretraining data without a substantial impact on the model's perplexity.
arXiv Detail & Related papers (2024-04-01T09:39:38Z)
- WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations [34.99831757956635]
We formulate the task of attributed query-focused summarization (AQFS) and present WebCiteS, a Chinese dataset featuring 7k human-annotated summaries with citations.
We tackle these issues by developing detailed metrics and enabling the automatic evaluator to decompose the sentences into sub-claims for fine-grained verification.
arXiv Detail & Related papers (2024-03-04T07:06:41Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods (a rough prompt-construction sketch appears after this list).
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning [76.43827771613127]
In this paper, we investigate task-specific preferences between pairs of input texts as a new alternative way for such auxiliary data annotation.
We propose a novel multi-task learning framework, called prefer-to-classify (P2C), which can enjoy the cooperative effect of learning both the given classification task and the auxiliary preferences.
arXiv Detail & Related papers (2023-06-08T04:04:47Z)
- Inline Citation Classification using Peripheral Context and Time-evolving Augmentation [23.88211560188731]
We propose a new dataset, named 3Cext, which provides discourse information using the cited sentences.
We propose PeriCite, a Transformer-based deep neural network that fuses peripheral sentences and domain knowledge.
arXiv Detail & Related papers (2023-03-01T09:11:07Z)
- Deep Graph Learning for Anomalous Citation Detection [55.81334139806342]
We propose a novel deep graph learning model, namely GLAD (Graph Learning for Anomaly Detection), to identify anomalies in citation networks.
Within the GLAD framework, we propose an algorithm called CPU (Citation PUrpose) to discover the purpose of citation based on citation texts.
arXiv Detail & Related papers (2022-02-23T09:05:28Z)
- Knowledge-Rich BERT Embeddings for Readability Assessment [0.0]
We propose an alternative way of utilizing the information-rich embeddings of BERT models through a joint-learning method.
Results show that the proposed method outperforms classical approaches in readability assessment using English and Filipino datasets.
arXiv Detail & Related papers (2021-06-15T07:37:48Z)
- Context-Based Quotation Recommendation [60.93257124507105]
We propose a novel context-aware quote recommendation system.
It generates a ranked list of quotable paragraphs and spans of tokens from a given source document.
We conduct experiments on a collection of speech transcripts and associated news articles.
arXiv Detail & Related papers (2020-05-17T17:49:53Z)
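As a rough illustration of the contrastive demonstration idea in the C-ICL entry above, the snippet below assembles a few-shot extraction prompt that pairs a correct demonstration with a deliberately incorrect one. The task, template, and examples are invented for illustration and are not taken from the C-ICL paper.

```python
# Hypothetical sketch of contrastive in-context learning (c-ICL style):
# the prompt shows the model both a correct and an incorrect extraction
# so it can learn from the contrast. All examples are invented.
CORRECT = {
    "text": "Marie Curie won the Nobel Prize in 1911.",
    "extraction": "(Marie Curie, won, Nobel Prize)",
}
INCORRECT = {
    "text": "Marie Curie won the Nobel Prize in 1911.",
    "extraction": "(Nobel Prize, won, 1911)",  # wrong: arguments scrambled
}

def build_contrastive_prompt(query_text: str) -> str:
    """Assemble a few-shot prompt with one positive and one negative demo."""
    parts = [
        "Extract (subject, relation, object) triples from the text.",
        f"Text: {CORRECT['text']}",
        f"Correct extraction: {CORRECT['extraction']}",
        f"Incorrect extraction (avoid this): {INCORRECT['extraction']}",
        f"Text: {query_text}",
        "Correct extraction:",
    ]
    return "\n".join(parts)

print(build_contrastive_prompt("Alan Turing proposed the Turing test in 1950."))
```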