FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article
- URL: http://arxiv.org/abs/2503.16561v1
- Date: Thu, 20 Mar 2025 06:14:02 GMT
- Title: FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article
- Authors: Ibrahim Al Azher, Miftahul Jannat Mokarrama, Zhishuai Guo, Sagnik Ray Choudhury, Hamed Alhoori
- Abstract summary: This study generates future work suggestions from key sections of a scientific article alongside related papers. We experimented with various Large Language Models (LLMs) and integrated Retrieval-Augmented Generation (RAG) to enhance the generation process.
- Score: 6.682911432177815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The future work section of a scientific article outlines potential research directions by identifying the gaps and limitations of the current study. This section serves as a valuable resource for early-career researchers seeking unexplored areas and for experienced researchers looking for new projects or collaborations. In this study, we generate future work suggestions from key sections of a scientific article alongside related papers, and we analyze how these trends have evolved. We experimented with various Large Language Models (LLMs) and integrated Retrieval-Augmented Generation (RAG) to enhance the generation process. We incorporate an LLM feedback mechanism to improve the quality of the generated content and propose an LLM-as-a-judge approach for evaluation. Our results demonstrate that the RAG-based approach with LLM feedback outperforms other methods, as assessed through both qualitative and quantitative metrics. Moreover, we conduct a human evaluation to assess the LLM as an extractor and as a judge. The code and dataset for this project are available on HuggingFace.
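The abstract describes a pipeline with three moving parts: retrieve related papers, generate future work suggestions with RAG, refine the draft via an LLM feedback loop, and score the output with an LLM-as-a-judge. The sketch below shows one way such a pipeline could be wired together. It is not the authors' released code: the TF-IDF retriever, the `gpt-4o` model name, and all prompt wordings are illustrative assumptions.

```python
# Minimal sketch of an LLM-RAG future-work generator with an LLM feedback
# loop and LLM-as-a-judge scoring. NOT the authors' released code: the
# TF-IDF retriever, model name, and prompts are illustrative assumptions.
from openai import OpenAI
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Return the k corpus abstracts most similar to the query."""
    vec = TfidfVectorizer(stop_words="english")
    mat = vec.fit_transform([query] + corpus)
    sims = cosine_similarity(mat[0:1], mat[1:]).ravel()
    return [corpus[i] for i in sims.argsort()[::-1][:k]]


def ask(prompt: str) -> str:
    """Single-turn call to a chat model."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumption; any chat model would do
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def generate_future_work(key_sections: str, corpus: list[str]) -> str:
    # RAG step: ground the generation in retrieved related abstracts.
    context = "\n\n".join(retrieve(key_sections, corpus))
    draft = ask(
        "Using the paper excerpts and related abstracts below, suggest "
        f"future work directions.\n\nPaper:\n{key_sections}\n\n"
        f"Related work:\n{context}"
    )
    # Feedback step: have an LLM critique the draft, then revise it.
    critique = ask(f"Critique these future work suggestions:\n{draft}")
    return ask(
        "Revise the suggestions using the critique.\n\n"
        f"Suggestions:\n{draft}\n\nCritique:\n{critique}"
    )


def judge(generated: str, reference: str) -> str:
    # LLM-as-a-judge: rate the generated section against the reference.
    return ask(
        "On a 1-5 scale, rate how well the generated future work section "
        "matches the reference, and explain briefly.\n\n"
        f"Generated:\n{generated}\n\nReference:\n{reference}"
    )
```

Only a single critique-and-revise round is shown for the feedback loop; in practice it could iterate, for example until the judge's score stops improving.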
Related papers
- From Code to Courtroom: LLMs as the New Software Judges [29.77858458399232]
Large Language Models (LLMs) have been increasingly used to automate software engineering tasks such as code generation and summarization. Human evaluation, while effective, is very costly and time-consuming. In response, the LLM-as-a-Judge paradigm, which employs LLMs for automated evaluation, has emerged.
arXiv Detail & Related papers (2025-03-04T03:48:23Z)
- LLM4SR: A Survey on Large Language Models for Scientific Research [15.533076347375207]
Large Language Models (LLMs) offer unprecedented support across various stages of the research cycle.
This paper presents the first systematic survey dedicated to exploring how LLMs are revolutionizing the scientific research process.
arXiv Detail & Related papers (2025-01-08T06:44:02Z)
- MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs [97.94579295913606]
Multimodal Large Language Models (MLLMs) have garnered increased attention from both industry and academia.
In the development process, evaluation is critical since it provides intuitive feedback and guidance on improving models.
This work aims to offer researchers an easy grasp of how to effectively evaluate MLLMs according to different needs and to inspire better evaluation methods.
arXiv Detail & Related papers (2024-11-22T18:59:54Z)
- IdeaBench: Benchmarking Large Language Models for Research Idea Generation [19.66218274796796]
Large Language Models (LLMs) have transformed how people interact with artificial intelligence (AI) systems.
We propose IdeaBench, a benchmark system that includes a comprehensive dataset and an evaluation framework.
Our dataset comprises titles and abstracts from a diverse range of influential papers, along with their referenced works.
Our evaluation framework is a two-stage process: first, GPT-4o ranks ideas based on user-specified quality indicators such as novelty and feasibility, enabling scalable personalization.
arXiv Detail & Related papers (2024-10-31T17:04:59Z)
- Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents [64.64280477958283]
An exponential increase in scientific literature makes it challenging for researchers to stay current with recent advances and identify meaningful research directions.
Recent developments in large language models (LLMs) suggest a promising avenue for automating the generation of novel research ideas.
We propose a Chain-of-Ideas (CoI) agent, an LLM-based agent that organizes relevant literature in a chain structure to effectively mirror the progressive development of a research domain.
arXiv Detail & Related papers (2024-10-17T03:26:37Z)
- HumanEvo: An Evolution-aware Benchmark for More Realistic Evaluation of Repository-level Code Generation [36.1669124651617]
We conduct an empirical study to understand Large Language Models' code generation performance in settings that reflect the evolving nature of software development. We use an evolution-aware, repository-level code generation dataset, namely HumanEvo, equipped with an automated execution-based evaluation tool. We find that previous evolution-ignoring evaluation methods inflate the measured performance of LLMs, with overestimations ranging from 10.0% to 61.1%.
arXiv Detail & Related papers (2024-06-11T03:19:18Z)
- DnA-Eval: Enhancing Large Language Model Evaluation through Decomposition and Aggregation [75.81096662788254]
Large Language Models (LLMs) are scalable and economical evaluators.
The question of how reliable these evaluators are has emerged as a crucial research question.
We propose Decompose and Aggregate, which breaks down the evaluation process into different stages based on pedagogical practices.
arXiv Detail & Related papers (2024-05-24T08:12:30Z)
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is an AI-based system for the ideation and operationalization of novel work. ResearchAgent automatically defines novel problems, proposes methods, and designs experiments, while iteratively refining them. We experimentally validate ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
- Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks.
Despite their notable performance, these models are prone to certain limitations, such as misunderstanding human instructions, generating potentially biased content, or producing factually incorrect information.
This survey presents a comprehensive overview of these alignment technologies.
arXiv Detail & Related papers (2023-07-24T17:44:58Z)
- A Comprehensive Overview of Large Language Models [68.22178313875618]
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks.
This article provides an overview of the existing literature on a broad range of LLM-related concepts.
arXiv Detail & Related papers (2023-07-12T20:01:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.