Can We Automate Scientific Reviewing?
- URL: http://arxiv.org/abs/2102.00176v1
- Date: Sat, 30 Jan 2021 07:16:53 GMT
- Title: Can We Automate Scientific Reviewing?
- Authors: Weizhe Yuan and Pengfei Liu and Graham Neubig
- Abstract summary: We discuss the possibility of using state-of-the-art natural language processing (NLP) models to generate first-pass peer reviews for scientific papers.
We collect a dataset of papers in the machine learning domain, annotate them with different aspects of content covered in each review, and train targeted summarization models that take in papers to generate reviews.
Comprehensive experimental results show that system-generated reviews tend to touch upon more aspects of the paper than human-written reviews.
- Score: 89.50052670307434
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The rapid development of science and technology has been accompanied by an
exponential growth in peer-reviewed scientific publications. At the same time,
the review of each paper is a laborious process that must be carried out by
subject matter experts. Thus, providing high-quality reviews of this growing
number of papers is a significant challenge. In this work, we ask the question
"can we automate scientific reviewing?", discussing the possibility of using
state-of-the-art natural language processing (NLP) models to generate
first-pass peer reviews for scientific papers. Arguably the most difficult part
of this is defining what a "good" review is in the first place, so we first
discuss possible evaluation measures for such reviews. We then collect a
dataset of papers in the machine learning domain, annotate them with different
aspects of content covered in each review, and train targeted summarization
models that take in papers to generate reviews. Comprehensive experimental
results show that system-generated reviews tend to touch upon more aspects of
the paper than human-written reviews, but the generated text can suffer from
lower constructiveness for all aspects except the explanation of the core ideas
of the papers, which are largely factually correct. We finally summarize eight
challenges in the pursuit of a good review generation system together with
potential solutions, which, hopefully, will inspire more future research on
this subject. We make all code and the dataset publicly available at
https://github.com/neulab/ReviewAdvisor, as well as a ReviewAdvisor system:
http://review.nlpedia.ai/.
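To make the setup concrete, here is a minimal sketch, assuming an off-the-shelf abstractive summarizer from Hugging Face transformers rather than the authors' trained, aspect-aware ReviewAdvisor models: a first-pass review is drafted as a summary of the paper's opening text. The checkpoint name, truncation length, and generation settings are illustrative assumptions, not details from the paper.
```python
# A minimal sketch, NOT the ReviewAdvisor models from the paper:
# draft a first-pass "review" by abstractively summarizing the opening
# of a paper with an off-the-shelf checkpoint.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def draft_review(paper_text: str) -> str:
    """Return a rough review-style summary of the paper's leading text."""
    # BART's encoder sees roughly 1024 tokens, so keep only the opening
    # chunk; the paper's actual system conditions generation on annotated
    # review aspects (e.g., clarity, novelty) extracted from the full text.
    chunk = paper_text[:2500]
    out = summarizer(chunk, max_length=200, min_length=60, do_sample=False)
    return out[0]["summary_text"]

if __name__ == "__main__":
    sample_paper = "We propose ..."  # replace with the full paper text
    print(draft_review(sample_paper))
```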
Related papers
- Automated Focused Feedback Generation for Scientific Writing Assistance [6.559560602099439]
SWIF$2$T is a Scientific WrIting Focused Feedback Tool designed to generate specific, actionable, and coherent comments that identify weaknesses in a scientific paper and/or propose revisions to it.
We compile a dataset of 300 peer reviews citing weaknesses in scientific papers and conduct human evaluation.
The results demonstrate that SWIF$2$T's feedback is superior in specificity, reading comprehension, and overall helpfulness compared to other approaches.
arXiv Detail & Related papers (2024-05-30T20:56:41Z)
- What Can Natural Language Processing Do for Peer Review? [173.8912784451817]
In modern science, peer review is widely used, yet it is hard, time-consuming, and prone to error.
Since the artifacts involved in peer review are largely text-based, Natural Language Processing has great potential to improve reviewing.
We detail each step of the process from manuscript submission to camera-ready revision, and discuss the associated challenges and opportunities for NLP assistance.
arXiv Detail & Related papers (2024-05-10T16:06:43Z)
- A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence [58.6354685593418]
This paper proposes several article-level, field-normalized, and large language model-empowered bibliometric indicators to evaluate reviews.
The newly emerging AI-generated literature reviews are also appraised.
This work offers insights into the current challenges of literature reviews and envisions future directions for their development.
arXiv Detail & Related papers (2024-02-20T11:28:50Z)
- Scientific Opinion Summarization: Paper Meta-review Generation Dataset, Methods, and Evaluation [55.00687185394986]
We propose the task of scientific opinion summarization, where research paper reviews are synthesized into meta-reviews.
We introduce the ORSUM dataset covering 15,062 paper meta-reviews and 57,536 paper reviews from 47 conferences.
Our experiments show that (1) human-written summaries do not always satisfy all necessary criteria, such as depth of discussion and identification of consensus and controversy for the specific domain, and (2) combining task decomposition with iterative self-refinement shows strong potential for improving the generated summaries.
arXiv Detail & Related papers (2023-05-24T02:33:35Z)
- MOPRD: A multidisciplinary open peer review dataset [12.808751859133064]
Open peer review is a growing trend in academic publishing, yet most existing peer review datasets do not cover the whole peer review process.
We construct MOPRD, a multidisciplinary open peer review dataset.
arXiv Detail & Related papers (2022-12-09T16:35:14Z)
- Generating Summaries for Scientific Paper Review [29.12631698162247]
The increase in submissions to top venues in machine learning and NLP has placed an excessive burden on reviewers.
An automatic system for assisting with the reviewing process could help ameliorate the problem.
In this paper, we explore automatic review summary generation for scientific papers.
arXiv Detail & Related papers (2021-09-28T21:43:53Z)
- Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast the assessment of submissions as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
arXiv Detail & Related papers (2021-09-02T19:41:47Z)
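As a generic illustration of preference learning for ranking (a Bradley-Terry sketch, not the evaluation framework proposed in the entry above), pairwise preferences between papers, for instance derived from reviewer scores, can be fit to a latent quality score per paper and sorted into a ranking; the paper IDs and preference pairs below are toy data.
```python
# A minimal Bradley-Terry sketch for ranking papers from pairwise
# preferences; the pairs below are invented toy data, not real reviews.
import math
from collections import defaultdict

def bradley_terry(pairs, n_iters=500, lr=0.05):
    """Estimate a latent log-quality score per paper from (winner, loser) pairs."""
    scores = defaultdict(float)  # log-strength per paper, initialized to 0
    for _ in range(n_iters):
        grads = defaultdict(float)
        for winner, loser in pairs:
            # P(winner preferred over loser) under the Bradley-Terry model
            p = 1.0 / (1.0 + math.exp(scores[loser] - scores[winner]))
            grads[winner] += 1.0 - p  # gradient of the log-likelihood
            grads[loser] -= 1.0 - p
        for paper, g in grads.items():
            scores[paper] += lr * g
    return dict(scores)

# Toy preferences, e.g. "paperA scored higher than paperB with its reviewers"
pairs = [("paperA", "paperB"), ("paperA", "paperC"), ("paperB", "paperC")]
ranking = sorted(bradley_terry(pairs).items(), key=lambda kv: -kv[1])
print(ranking)  # paperA should come first on this toy data
```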
- Polarity in the Classroom: A Case Study Leveraging Peer Sentiment Toward Scalable Assessment [4.588028371034406]
Accurately grading open-ended assignments in large or massive open online courses (MOOCs) is non-trivial.
In this work, we detail the process by which we create our domain-dependent lexicon and aspect-informed review form.
We end by analyzing validity and discussing conclusions from our corpus of over 6800 peer reviews from nine courses.
arXiv Detail & Related papers (2021-08-02T15:45:11Z)
- ReviewRobot: Explainable Paper Review Generation based on Knowledge Synthesis [62.76038841302741]
We build ReviewRobot, a novel system that automatically assigns a review score and writes comments for multiple categories such as novelty and meaningful comparison.
Experimental results show that our review score predictor reaches 71.4%-100% accuracy.
Human assessment by domain experts shows that 41.7%-70.5% of the comments generated by ReviewRobot are valid and constructive, and better than human-written ones 20% of the time.
arXiv Detail & Related papers (2020-10-13T02:17:58Z)
- Understanding Peer Review of Software Engineering Papers [5.744593856232663]
We aim to understand how reviewers, including those who have won awards for reviewing, perform their reviews of software engineering papers.
The most important features of papers that result in positive reviews are clear and supported validation, an interesting problem, and novelty.
Authors should make the contribution of the work very clear in their paper.
arXiv Detail & Related papers (2020-09-02T17:31:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.