CORWA: A Citation-Oriented Related Work Annotation Dataset
- URL: http://arxiv.org/abs/2205.03512v1
- Date: Sat, 7 May 2022 00:23:46 GMT
- Title: CORWA: A Citation-Oriented Related Work Annotation Dataset
- Authors: Xiangci Li, Biswadip Mandal, Jessica Ouyang
- Abstract summary: In natural language processing, literature reviews are usually presented in the "Related Work" section.
We train a strong baseline model that automatically tags the CORWA labels on massive unlabeled related work section texts.
We suggest a novel framework for human-in-the-loop, iterative, abstractive related work generation.
- Score: 4.740962650068886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Academic research is an exploratory activity to discover new
solutions to problems. By nature, research papers include literature reviews to
distinguish their novelty from prior work. In natural language processing,
this literature review is usually presented in the "Related Work" section.
The task of related work generation aims to automatically generate the related
work section given the rest of the research paper and a list of papers to cite.
Prior work on this task has focused on the sentence as the basic unit of
generation, neglecting the fact that related work sections consist of
variable-length text fragments derived from different information sources. As a first
step toward a linguistically-motivated related work generation framework, we
present a Citation Oriented Related Work Annotation (CORWA) dataset that labels
different types of citation text fragments from different information sources.
We train a strong baseline model that automatically tags the CORWA labels on
massive unlabeled related work section texts. We further suggest a novel
framework for human-in-the-loop, iterative, abstractive related work
generation.
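Since the paper's headline contribution is a baseline that tags citation text fragments in related work sections, a small illustration may help. Below is a minimal sketch of span-level fragment tagging framed as transformer token classification; the BIO label names, the SciBERT checkpoint, and the example sentence are illustrative assumptions, not the authors' released tag set or model.

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Hypothetical fragment-type labels in BIO form; placeholders standing in
# for a CORWA-style annotation scheme, not the official label set.
LABELS = ["O",
          "B-Narrative", "I-Narrative",
          "B-Reflective", "I-Reflective",
          "B-Transition", "I-Transition"]

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "allenai/scibert_scivocab_uncased", num_labels=len(LABELS))

sentence = ("Smith et al. (2020) proposed a span-based tagger, "
            "which we extend to variable-length citation fragments.")
inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, seq_len, num_labels)
pred_ids = logits.argmax(dim=-1).squeeze(0)  # one label id per subword token

# The classification head above is freshly initialized, so its predictions
# are arbitrary until the model is fine-tuned on fragment-level annotations.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"].squeeze(0))
for token, label_id in zip(tokens, pred_ids.tolist()):
    print(f"{token}\t{LABELS[label_id]}")
```

Contiguous tokens sharing a label would form the variable-length fragments the abstract describes; a trained tagger of this shape could then label massive unlabeled related work section texts at scale.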
Related papers
- CiteBench: A benchmark for Scientific Citation Text Generation [69.37571393032026]
CiteBench is a benchmark that unifies diverse citation text generation datasets to enable standardized evaluation.
We make the code for CiteBench publicly available at https://github.com/UKPLab/citebench.
arXiv Detail & Related papers (2022-12-19T16:10:56Z)
- Target-aware Abstractive Related Work Generation with Contrastive Learning [48.02845973891943]
The related work section is an important component of a scientific paper, which highlights the contribution of the target paper in the context of the reference papers.
Most of the existing related work section generation methods rely on extracting off-the-shelf sentences.
We propose an abstractive target-aware related work generator (TAG), which can generate related work sections consisting of new sentences.
arXiv Detail & Related papers (2022-05-26T13:20:51Z)
- Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review [52.359007622096684]
Peer review is a key component of the publishing process in most fields of science.
Existing NLP studies focus on the analysis of individual texts, but editorial
assistance often requires modeling the interaction between pairs of texts.
arXiv Detail & Related papers (2022-04-22T16:39:38Z)
- Automatic Related Work Generation: A Meta Study [5.025654873456755]
In natural language processing, a literature review is usually presented in the "Related Work" section.
The task of automatic related work generation aims to automatically generate the "Related Work" section.
We conduct a meta-study to compare the existing literature on related work generation from the perspectives of problem formulation, dataset collection, methodological approach, performance evaluation, and future prospects.
arXiv Detail & Related papers (2022-01-06T01:16:38Z)
- MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting [13.493267499658527]
We release MultiCite, a new dataset of 12,653 citation contexts from over 1,200 computational linguistics papers.
We show how our dataset, while still usable for training classic citation context analysis (CCA) models, also supports the development of new types of models for CCA beyond fixed-width text classification (see the sketch after this list).
arXiv Detail & Related papers (2021-07-01T12:54:23Z)
- Generating Related Work [37.161925758727456]
We model generating related work sections while being cognisant of the motivation behind citing papers.
Our model outperforms several strong state-of-the-art summarization and multi-document summarization models.
arXiv Detail & Related papers (2021-04-18T00:19:37Z)
- What's New? Summarizing Contributions in Scientific Literature [85.95906677964815]
We introduce a new task of disentangled paper summarization, which seeks to generate separate summaries for the paper contributions and the context of the work.
We extend the S2ORC corpus of academic articles by adding disentangled "contribution" and "context" reference labels.
We propose a comprehensive automatic evaluation protocol which reports the relevance, novelty, and disentanglement of generated outputs.
arXiv Detail & Related papers (2020-11-06T02:23:01Z)
- Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches across various text generation tasks such as storytelling, summarization, and translation.
We present an abstraction of the key techniques with respect to learning paradigms, pretraining, modeling approaches, and decoding, along with the outstanding challenges in each.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)
- From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information [77.89755281215079]
Text summarization is the research area that aims to create a short, condensed version of an original document.
In real-world applications, however, most data is not in plain text format.
This paper surveys these new summarization tasks and the approaches developed for them in real-world applications.
arXiv Detail & Related papers (2020-05-10T14:59:36Z)
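As a concrete illustration of the MultiCite entry's point above, that realistic citation context analysis needs multi-sentence contexts and multiple labels per context, here is a minimal sketch of a multi-label classifier over a multi-sentence citation context. The intent label names, the SciBERT checkpoint, and the 0.5 threshold are illustrative assumptions, not MultiCite's released scheme or model.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical citation intent labels; placeholders, not MultiCite's scheme.
INTENTS = ["background", "motivation", "uses", "extends",
           "similarities", "differences", "future_work"]

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "allenai/scibert_scivocab_uncased",
    num_labels=len(INTENTS),
    # Configures a BCE-with-logits training loss: each intent becomes an
    # independent yes/no decision instead of one softmax over all labels.
    problem_type="multi_label_classification")

# A realistic citation context can span several sentences and express
# several intents at once.
context = ("We adopt the tagging scheme of [1]. Unlike [1], however, "
           "we generate citation sentences abstractively.")
inputs = tokenizer(context, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits).squeeze(0)

# Independent thresholding lets several intents fire for one context; the
# untrained head here gives arbitrary scores until fine-tuned.
predicted = [label for label, p in zip(INTENTS, probs.tolist()) if p > 0.5]
print(predicted)
```

Because each label is thresholded independently, a single context can be tagged with, for example, both "uses" and "differences", which a fixed-width, single-label classifier cannot express.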