Reviewer2: Optimizing Review Generation Through Prompt Generation
- URL: http://arxiv.org/abs/2402.10886v1
- Date: Fri, 16 Feb 2024 18:43:10 GMT
- Title: Reviewer2: Optimizing Review Generation Through Prompt Generation
- Authors: Zhaolin Gao, Kianté Brantley, Thorsten Joachims
- Abstract summary: We propose an efficient two-stage review generation framework called Reviewer2.
Unlike prior work, this approach explicitly models the distribution of possible aspects that the review may address.
We generate a large-scale review dataset of 27k papers and 99k reviews that we annotate with aspect prompts.
- Score: 27.379753994272875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent developments in LLMs offer new opportunities for assisting authors in
improving their work. In this paper, we envision a use case where authors can
receive LLM-generated reviews that uncover weak points in the current draft.
While initial methods for automated review generation already exist, these
methods tend to produce reviews that lack detail, and they do not cover the
range of opinions that human reviewers produce. To address this shortcoming, we
propose an efficient two-stage review generation framework called Reviewer2.
Unlike prior work, this approach explicitly models the distribution of possible
aspects that the review may address. We show that this leads to more detailed
reviews that better cover the range of aspects that human reviewers identify in
the draft. As part of the research, we generate a large-scale review dataset of
27k papers and 99k reviews that we annotate with aspect prompts, which we make
available as a resource for future research.
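To illustrate the two-stage idea, here is a minimal sketch in Python, assuming a generic generate(prompt) LLM call (hypothetical; the paper's actual models, prompts, and sampling procedure differ):

```python
from typing import Callable, List

def reviewer2_sketch(paper_text: str,
                     generate: Callable[[str], str],
                     num_reviews: int = 3) -> List[str]:
    """Two-stage review generation: sample an aspect prompt first,
    then condition the review on the sampled aspects."""
    reviews = []
    for _ in range(num_reviews):
        # Stage 1: model the distribution of aspects a review might
        # cover by asking the LLM to propose an aspect prompt.
        aspect_prompt = generate(
            f"List the aspects a peer review of this paper could "
            f"focus on:\n\n{paper_text}"
        )
        # Stage 2: generate a review conditioned on those aspects.
        reviews.append(generate(
            f"Write a detailed peer review of the paper below, "
            f"focusing on these aspects:\n{aspect_prompt}\n\n{paper_text}"
        ))
    return reviews
```

Sampling a fresh aspect prompt for each review is what lets the generated reviews cover a range of aspects rather than collapsing to a single generic critique.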
Related papers
- AI-Driven Review Systems: Evaluating LLMs in Scalable and Bias-Aware Academic Reviews [18.50142644126276]
We evaluate how well automatic paper reviews align with human reviews using an arena of human preferences built from pairwise comparisons.
We fine-tune an LLM to predict which review humans will prefer in a head-to-head battle between LLMs.
We make reviews of publicly available arXiv and open-access Nature journal papers available online, along with a free service that helps authors review, revise, and improve their research papers.
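A common way to model pairwise preferences is a Bradley-Terry formulation over scalar quality scores; the sketch below is an illustration under that assumption, not the paper's exact training objective:

```python
import math

def preference_probability(score_a: float, score_b: float) -> float:
    """Bradley-Terry-style probability that review A beats review B
    in a pairwise comparison, given scalar quality scores."""
    return 1.0 / (1.0 + math.exp(score_b - score_a))

# Scores would come from a fine-tuned preference model; these values
# are illustrative only.
print(preference_probability(2.1, 1.4))  # ~0.67, so A is preferred
```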
arXiv Detail & Related papers (2024-08-19T19:10:38Z)
- Review-LLM: Harnessing Large Language Models for Personalized Review Generation [8.898103706804616]
Large Language Models (LLMs) have shown superior text modeling and generation abilities.
We propose Review-LLM, which customizes LLMs for personalized review generation.
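A hypothetical sketch of how a user's review history could be folded into the generation prompt (all names are illustrative, not the paper's API):

```python
def build_personalized_prompt(user_reviews, item_title, rating):
    """Fold a user's recent review history into the prompt so the
    generated review reflects their style and preferences."""
    history = "\n".join(f"- {r}" for r in user_reviews[-5:])
    return (
        "Past reviews by this user:\n"
        f"{history}\n\n"
        f"Write a review in the same style for '{item_title}', "
        f"given the user's rating of {rating}/5."
    )

print(build_personalized_prompt(
    ["Great battery life.", "Shipping was slow but quality is solid."],
    "noise-cancelling headphones", 4))
```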
arXiv Detail & Related papers (2024-07-10T09:22:19Z)
- LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing [106.45895712717612]
Large language models (LLMs) have shown remarkable versatility in various generative tasks.
This study focuses on how LLMs can assist NLP researchers.
To our knowledge, this is the first work to provide such a comprehensive analysis.
arXiv Detail & Related papers (2024-06-24T01:30:22Z)
- A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence [58.6354685593418]
This paper proposes several article-level, field-normalized, and large language model-empowered bibliometric indicators to evaluate reviews.
The newly emerging AI-generated literature reviews are also appraised.
This work offers insights into the current challenges of literature reviews and envisions future directions for their development.
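Field normalization typically divides an article's citation count by the average for comparable articles in the same field and year; a minimal illustration of that idea (not necessarily the paper's exact indicator):

```python
def field_normalized_citations(citations, field_citations):
    """Article-level field normalization: the article's citations
    divided by the mean citations of same-field, same-year articles."""
    if not field_citations:
        return 0.0
    baseline = sum(field_citations) / len(field_citations)
    return citations / baseline if baseline else 0.0

# A value above 1.0 means the review is cited more than its field average.
print(field_normalized_citations(42, [10, 25, 30, 15]))  # 2.1
```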
arXiv Detail & Related papers (2024-02-20T11:28:50Z)
- LLMRefine: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback [65.84061725174269]
Recent large language models (LLMs) leverage human feedback to improve their generation quality.
We propose LLMRefine, an inference-time optimization method that refines an LLM's output.
We conduct experiments on three text generation tasks, including machine translation, long-form question answering (QA), and topical summarization.
LLMRefine consistently outperforms all baseline approaches, achieving improvements of up to 1.7 MetricX points on translation tasks, 8.1 ROUGE-L on ASQA, and 2.2 ROUGE-L on topical summarization.
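The core inference-time loop is: generate an output, pinpoint fine-grained errors with a feedback model, and revise. A simplified sketch assuming hypothetical generate and pinpoint_errors callables (the full method also includes a search strategy over candidate revisions):

```python
def llm_refine(source, generate, pinpoint_errors, max_iters=3):
    """Inference-time refinement: iteratively revise an LLM output
    using fine-grained, span-level feedback from an error model."""
    output = generate(source)
    for _ in range(max_iters):
        errors = pinpoint_errors(source, output)  # e.g. [(span, issue), ...]
        if not errors:
            break  # no remaining fine-grained errors; stop refining
        feedback = "; ".join(f"'{span}': {issue}" for span, issue in errors)
        output = generate(
            f"Revise the text to fix these errors ({feedback}):\n{output}"
        )
    return output
```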
arXiv Detail & Related papers (2023-11-15T19:52:11Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose Auto-J, a generative judge with 13B parameters, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses drawn from a large set of real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
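A generative judge differs from a scalar reward model in that it writes a natural-language critique before issuing a verdict; a hypothetical prompt-level sketch (not Auto-J's actual prompt):

```python
def judge_pair(generate, query, response_a, response_b):
    """Ask a generative judge to critique two responses and then
    name the better one, all as generated text."""
    return generate(
        "Critique both candidate responses to the query below, then "
        "output the winner as 'A' or 'B'.\n\n"
        f"Query: {query}\n"
        f"Response A: {response_a}\n"
        f"Response B: {response_b}"
    )
```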
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- Towards Personalized Review Summarization by Modeling Historical Reviews from Customer and Product Separately [59.61932899841944]
Review summarization is a non-trivial task that aims to summarize the main idea of a product review on an e-commerce website.
We propose the Heterogeneous Historical Review aware Review Summarization Model (HHRRS).
We employ a multi-task framework that performs review sentiment classification and summarization jointly.
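Joint multi-task training of this kind usually sums a summarization loss and a sentiment classification loss with a weighting coefficient; a minimal PyTorch-style sketch under that assumption (the weight lam and the tensor shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def multitask_loss(summary_logits, summary_targets,
                   sentiment_logits, sentiment_targets,
                   lam=0.5):
    """Joint objective: token-level summarization loss plus a weighted
    review-sentiment classification loss, trained together."""
    # summary_logits: (batch, seq_len, vocab); summary_targets: (batch, seq_len)
    summ_loss = F.cross_entropy(summary_logits.flatten(0, 1),
                                summary_targets.flatten())
    # sentiment_logits: (batch, num_classes); sentiment_targets: (batch,)
    sent_loss = F.cross_entropy(sentiment_logits, sentiment_targets)
    return summ_loss + lam * sent_loss
```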
arXiv Detail & Related papers (2023-01-27T12:32:55Z)
- Can We Automate Scientific Reviewing? [89.50052670307434]
We discuss the possibility of using state-of-the-art natural language processing (NLP) models to generate first-pass peer reviews for scientific papers.
We collect a dataset of papers in the machine learning domain, annotate them with different aspects of content covered in each review, and train targeted summarization models that take in papers to generate reviews.
Comprehensive experimental results show that system-generated reviews tend to touch upon more aspects of the paper than human-written reviews.
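One simple way to quantify "touching upon more aspects" is aspect coverage, the fraction of a paper's annotated aspects that a review mentions; an illustrative metric, not necessarily the paper's exact measure:

```python
def aspect_coverage(review_aspects, paper_aspects):
    """Fraction of a paper's annotated aspects that a review covers."""
    if not paper_aspects:
        return 0.0
    return len(set(review_aspects) & set(paper_aspects)) / len(paper_aspects)

human  = aspect_coverage({"clarity", "novelty"},
                         {"clarity", "novelty", "soundness", "impact"})
system = aspect_coverage({"clarity", "novelty", "soundness"},
                         {"clarity", "novelty", "soundness", "impact"})
print(human, system)  # 0.5 0.75
```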
arXiv Detail & Related papers (2021-01-30T07:16:53Z)
- How Useful are Reviews for Recommendation? A Critical Review and Potential Improvements [8.471274313213092]
We investigate a growing body of work that seeks to improve recommender systems through the use of review text.
Our initial findings reveal several discrepancies in reported results, partly due to copying results across papers despite changes in experimental settings or data pre-processing.
Further investigation raises a much larger question about the "importance" of user reviews for recommendation.
arXiv Detail & Related papers (2020-05-25T16:30:05Z)