SurveyX: Academic Survey Automation via Large Language Models
- URL: http://arxiv.org/abs/2502.14776v1
- Date: Thu, 20 Feb 2025 17:59:45 GMT
- Title: SurveyX: Academic Survey Automation via Large Language Models
- Authors: Xun Liang, Jiawei Yang, Yezhaohui Wang, Chen Tang, Zifan Zheng, Simin Niu, Shichao Song, Hanyu Wang, Bo Tang, Feiyu Xiong, Keming Mao, Zhiyu Li
- Abstract summary: SurveyX is an efficient and organized system for automated survey generation.
It decomposes the survey composing process into two phases: Preparation and Generation.
It significantly enhances the efficacy of survey composition.
- Score: 23.142476414919653
- License:
- Abstract: Large Language Models (LLMs) have demonstrated exceptional comprehension capabilities and a vast knowledge base, suggesting that LLMs can serve as efficient tools for automated survey generation. However, recent research on automated survey generation remains constrained by critical limitations such as a finite context window, a lack of in-depth content discussion, and the absence of systematic evaluation frameworks. Inspired by human writing processes, we propose SurveyX, an efficient and organized system for automated survey generation that decomposes the survey composing process into two phases: the Preparation and Generation phases. By innovatively introducing online reference retrieval, a pre-processing method called AttributeTree, and a re-polishing process, SurveyX significantly enhances the efficacy of survey composition. Experimental evaluation results show that SurveyX outperforms existing automated survey generation systems in content quality (0.259 improvement) and citation quality (1.76 enhancement), approaching human expert performance across multiple evaluation dimensions. Examples of surveys generated by SurveyX are available at www.surveyx.cn.
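As a rough illustration of the two-phase workflow the abstract describes (retrieval and AttributeTree pre-processing in Preparation, drafting and re-polishing in Generation), here is a minimal Python sketch. All class and function names are hypothetical; the paper does not publish this code.

```python
# Hypothetical sketch of the two-phase survey pipeline described in the abstract.
# None of these names come from the SurveyX codebase; they only illustrate the flow.
from dataclasses import dataclass, field


@dataclass
class Reference:
    title: str
    abstract: str
    attribute_tree: dict = field(default_factory=dict)  # filled by pre-processing


def retrieve_references(topic: str) -> list[Reference]:
    """Preparation phase, step 1: online reference retrieval (stubbed here)."""
    return [Reference(title=f"Paper on {topic}", abstract="...")]


def build_attribute_tree(ref: Reference) -> Reference:
    """Preparation phase, step 2: distill each reference into a structured
    'AttributeTree' of key attributes (problem, method, results)."""
    ref.attribute_tree = {"problem": "...", "method": "...", "results": "..."}
    return ref


def generate_survey(topic: str, refs: list[Reference]) -> str:
    """Generation phase: draft the survey from the attribute trees."""
    body = "\n".join(str(r.attribute_tree) for r in refs)
    return f"# Survey on {topic}\n{body}"


def repolish(draft: str) -> str:
    """Generation phase: re-polishing pass over the full draft."""
    return draft.strip()


def surveyx_pipeline(topic: str) -> str:
    refs = [build_attribute_tree(r) for r in retrieve_references(topic)]
    return repolish(generate_survey(topic, refs))


if __name__ == "__main__":
    print(surveyx_pipeline("retrieval-augmented generation"))
```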
Related papers
- Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations [49.908708778200115]
We are the first to specialize large language models (LLMs) for simulating survey response distributions.
As a testbed, we use country-level results from two global cultural surveys.
We devise a fine-tuning method based on first-token probabilities to minimize divergence between predicted and actual response distributions.
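Loosely illustrating the idea of matching first-token probabilities to an observed response distribution, the toy sketch below compares a softmax over made-up first-token logits with a made-up survey distribution via KL divergence; it is an assumption-laden illustration, not the paper's fine-tuning code.

```python
import math

# Hypothetical example: compare a model's first-token probabilities for the
# answer options of one survey question against the observed response
# distribution, using KL divergence as the quantity to minimize.

def softmax(logits: list[float]) -> list[float]:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p: list[float], q: list[float], eps: float = 1e-9) -> float:
    """KL(p || q): divergence of the predicted distribution q from the target p."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Logits the model assigns to the first token of each option ("A", "B", "C", "D");
# the numbers are made up for illustration.
first_token_logits = [2.1, 0.3, -0.5, 1.0]
predicted = softmax(first_token_logits)

# Observed country-level response distribution for the same question (made up).
observed = [0.40, 0.25, 0.10, 0.25]

print(f"KL(observed || predicted) = {kl_divergence(observed, predicted):.4f}")
```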
arXiv Detail & Related papers (2025-02-10T21:59:27Z)
- AGENT-CQ: Automatic Generation and Evaluation of Clarifying Questions for Conversational Search with LLMs [53.6200736559742]
AGENT-CQ consists of two stages: a generation stage and an evaluation stage.
CrowdLLM simulates human crowdsourcing judgments to assess generated questions and answers.
Experiments on the ClariQ dataset demonstrate CrowdLLM's effectiveness in evaluating question and answer quality.
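As a rough, hypothetical sketch of what LLM-simulated crowdsourcing judgments might look like in code: `call_llm`, the prompt wording, and the 1-5 rating scale are illustrative assumptions, not AGENT-CQ's actual interface.

```python
# Ask an LLM "judge" for several independent ratings of a clarifying question
# and aggregate them, mimicking crowdsourced annotation. `call_llm` is a
# stand-in for any chat-completion API.
from statistics import mean

JUDGE_PROMPT = (
    "You are simulating a crowd worker. Rate how clear and useful the following "
    "clarifying question is for the given search query, on a 1-5 scale. "
    "Reply with a single integer.\n\nQuery: {query}\nClarifying question: {question}"
)

def call_llm(prompt: str) -> str:  # stub; replace with a real API call
    return "4"

def crowd_rating(query: str, question: str, n_workers: int = 5) -> float:
    votes = []
    for _ in range(n_workers):
        reply = call_llm(JUDGE_PROMPT.format(query=query, question=question))
        votes.append(int(reply.strip()))
    return mean(votes)

print(crowd_rating("best laptop", "Do you have a preferred budget or brand?"))
```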
arXiv Detail & Related papers (2024-10-25T17:06:27Z)
- An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation [29.81362106367831]
Existing evaluation methods often suffer from high costs, limited test formats, the need for human references, and systematic evaluation biases.
In contrast to previous studies that rely on human annotations, Auto-PRE selects evaluators automatically based on their inherent traits.
Experimental results indicate our Auto-PRE achieves state-of-the-art performance at a lower cost.
arXiv Detail & Related papers (2024-10-16T06:06:06Z)
- RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.4501863547618]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.
With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance.
Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z)
- AutoSurvey: Large Language Models Can Automatically Write Surveys [77.0458309675818]
This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys.
Traditional survey paper creation faces challenges due to the vast volume and complexity of information.
Our contributions include a comprehensive solution to the survey problem, a reliable evaluation method, and experimental validation demonstrating AutoSurvey's effectiveness.
arXiv Detail & Related papers (2024-06-10T12:56:06Z)
- Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
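One common way to estimate distribution-level precision and recall for generated text is k-nearest-neighbour support estimation in an embedding space; the toy sketch below illustrates that general idea and is not necessarily the exact estimator used in this paper.

```python
# Precision ~ fraction of generated embeddings that fall inside the estimated
# support of real text; recall ~ the symmetric quantity. All data here is toy.
import numpy as np

def knn_radii(points: np.ndarray, k: int = 3) -> np.ndarray:
    """Distance from each point to its k-th nearest neighbour within `points`."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, k]  # column 0 is the zero distance to itself

def coverage(queries: np.ndarray, support: np.ndarray, radii: np.ndarray) -> float:
    """Fraction of `queries` falling inside any ball around `support` points."""
    d = np.linalg.norm(queries[:, None, :] - support[None, :, :], axis=-1)
    return float(np.mean((d <= radii[None, :]).any(axis=1)))

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 8))        # embeddings of human-written text (toy)
generated = rng.normal(size=(200, 8))   # embeddings of model outputs (toy)

precision = coverage(generated, real, knn_radii(real))    # quality proxy
recall = coverage(real, generated, knn_radii(generated))  # diversity proxy
print(f"precision={precision:.2f} recall={recall:.2f}")
```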
arXiv Detail & Related papers (2024-02-16T13:53:26Z)
- Emerging Results on Automated Support for Searching and Selecting Evidence for Systematic Literature Review Updates [1.1153433121962064]
We present emerging results on an automated approach to support searching and selecting studies for SLR updates in Software Engineering.
We developed an automated tool prototype to perform the snowballing search technique and support selecting relevant studies for SLR updates using Machine Learning (ML) algorithms.
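A hedged sketch of the study-selection step this describes: train a simple text classifier on the original review's included/excluded studies, then score candidates found by snowballing. The library choice (scikit-learn) and the toy data are assumptions for illustration, not the authors' tool.

```python
# Rank snowballing candidates by predicted probability of being relevant,
# using a TF-IDF + logistic regression pipeline as an illustrative ML choice.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Titles/abstracts from the original review, labelled included (1) / excluded (0).
train_texts = [
    "test case prioritisation using reinforcement learning",
    "a survey of agile practices in small companies",
    "machine learning for regression test selection",
    "qualitative study of developer onboarding",
]
train_labels = [1, 0, 1, 0]

# Candidate studies found by backward/forward snowballing from included papers.
candidates = [
    "deep learning approaches to test suite reduction",
    "interview study on remote pair programming",
]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_texts, train_labels)

for text, prob in zip(candidates, clf.predict_proba(candidates)[:, 1]):
    print(f"{prob:.2f}  {text}")
```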
arXiv Detail & Related papers (2024-02-07T23:39:20Z)
- PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models [72.57329554067195]
ProxyQA is an innovative framework dedicated to assessing long-form text generation.
It comprises in-depth human-curated meta-questions spanning various domains, each accompanied by specific proxy-questions with pre-annotated answers.
It assesses the generated content's quality through the evaluator's accuracy in addressing the proxy-questions.
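For illustration only, a minimal sketch of the proxy-question scoring loop described above; `ask_evaluator`, the answer-matching rule, and the example questions are hypothetical stand-ins rather than ProxyQA's implementation.

```python
# Score a generated long-form text by how many proxy-questions an evaluator
# model can answer correctly when given only that text as context.

proxy_questions = [
    {"question": "Which year was the transformer architecture introduced?", "answer": "2017"},
    {"question": "What mechanism replaces recurrence in transformers?", "answer": "self-attention"},
]

def ask_evaluator(context: str, question: str) -> str:  # stub for an LLM call
    return "2017" if "year" in question else "self-attention"

def proxyqa_score(generated_text: str, items: list[dict]) -> float:
    correct = 0
    for item in items:
        reply = ask_evaluator(generated_text, item["question"])
        correct += int(item["answer"].lower() in reply.lower())
    return correct / len(items)

print(proxyqa_score("...a long generated survey about transformers...", proxy_questions))
```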
arXiv Detail & Related papers (2024-01-26T18:12:25Z)
- CSMeD: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews [10.207938863784829]
We introduce CSMeD, a meta-dataset consolidating nine publicly released collections.
CSMeD serves as a comprehensive resource for training and evaluating the performance of automated citation screening models.
We introduce CSMeD-FT, a new dataset designed explicitly for evaluating the full text publication screening task.
arXiv Detail & Related papers (2023-11-21T09:36:11Z)