Related papers: CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support

CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support

URL: http://arxiv.org/abs/2407.16148v1
Date: Tue, 23 Jul 2024 03:18:00 GMT
Title: CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support
Authors: Chao-Chun Hsu, Erin Bransom, Jenna Sparks, Bailey Kuehl, Chenhao Tan, David Wadden, Lucy Lu Wang, Aakanksha Naik,
Abstract summary: Literature review requires researchers to synthesize a large amount of information and is increasingly challenging as the scientific literature expands. In this work, we investigate the potential of LLMs for producing hierarchical organizations of scientific studies to assist researchers with literature review.
Score: 31.327873791724326
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Literature review requires researchers to synthesize a large amount of information and is increasingly challenging as the scientific literature expands. In this work, we investigate the potential of LLMs for producing hierarchical organizations of scientific studies to assist researchers with literature review. We define hierarchical organizations as tree structures where nodes refer to topical categories and every node is linked to the studies assigned to that category. Our naive LLM-based pipeline for hierarchy generation from a set of studies produces promising yet imperfect hierarchies, motivating us to collect CHIME, an expert-curated dataset for this task focused on biomedicine. Given the challenging and time-consuming nature of building hierarchies from scratch, we use a human-in-the-loop process in which experts correct errors (both links between categories and study assignment) in LLM-generated hierarchies. CHIME contains 2,174 LLM-generated hierarchies covering 472 topics, and expert-corrected hierarchies for a subset of 100 topics. Expert corrections allow us to quantify LLM performance, and we find that while they are quite good at generating and organizing categories, their assignment of studies to categories could be improved. We attempt to train a corrector model with human feedback which improves study assignment by 12.6 F1 points. We release our dataset and models to encourage research on developing better assistive tools for literature review.

Related papers

Science Hierarchography: Hierarchical Organization of Science Literature [20.182213614072836]
We motivate SCIENCE HARCHOGRAPHY, the goal of organizing scientific literature into a high-quality hierarchical structure. We develop a range of algorithms to achieve the goals of SCIENCE HIERARCHOGRAPHY. Results show that this structured approach enhances interpretability, supports trend discovery, and offers an alternative pathway for exploring scientific literature beyond traditional search methods.
arXiv Detail & Related papers (2025-04-18T17:59:29Z)
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition [67.26124739345332]
Large language models (LLMs) have demonstrated potential in assisting scientific research, yet their ability to discover high-quality research hypotheses remains unexamined. We introduce the first large-scale benchmark for evaluating LLMs with a near-sufficient set of sub-tasks of scientific discovery. We develop an automated framework that extracts critical components - research questions, background surveys, inspirations, and hypotheses - from scientific papers.
arXiv Detail & Related papers (2025-03-27T08:09:15Z)
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents [64.64280477958283]
An exponential increase in scientific literature makes it challenging for researchers to stay current with recent advances and identify meaningful research directions. Recent developments in large language models(LLMs) suggest a promising avenue for automating the generation of novel research ideas. We propose a Chain-of-Ideas(CoI) agent, an LLM-based agent that organizes relevant literature in a chain structure to effectively mirror the progressive development in a research domain.
arXiv Detail & Related papers (2024-10-17T03:26:37Z)
Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored. We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches. We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
HiReview: Hierarchical Taxonomy-Driven Automatic Literature Review Generation [15.188580557890942]
HiReview is a novel framework for hierarchical taxonomy-driven automatic literature review generation. Extensive experiments demonstrate that HiReview significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-10-02T13:02:03Z)
LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing [106.45895712717612]
Large language models (LLMs) have shown remarkable versatility in various generative tasks. This study focuses on the topic of LLMs assist NLP Researchers. To our knowledge, this is the first work to provide such a comprehensive analysis.
arXiv Detail & Related papers (2024-06-24T01:30:22Z)
SeRTS: Self-Rewarding Tree Search for Biomedical Retrieval-Augmented Generation [50.26966969163348]
Large Language Models (LLMs) have shown great potential in the biomedical domain with the advancement of retrieval-augmented generation (RAG) Existing retrieval-augmented approaches face challenges in addressing diverse queries and documents, particularly for medical knowledge queries. We propose Self-Rewarding Tree Search (SeRTS) based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm.
arXiv Detail & Related papers (2024-06-17T06:48:31Z)
ResearchArena: Benchmarking Large Language Models' Ability to Collect and Organize Information as Research Agents [21.17856299966841]
This study introduces ResearchArena, a benchmark designed to evaluate large language models (LLMs) in conducting academic surveys. To support these opportunities, we construct an environment of 12M full-text academic papers and 7.9K survey papers.
arXiv Detail & Related papers (2024-06-13T03:26:30Z)
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature [80.49349719239584]
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks. SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields.
arXiv Detail & Related papers (2024-06-10T21:22:08Z)
Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph [18.41743815836192]
We propose using Large Language Models (LLMs) to automatically suggest properties for structured science summaries. Our study performs a comprehensive comparative analysis between ORKG's manually curated properties and those generated by the aforementioned state-of-the-art LLMs. Overall, LLMs show potential as recommendation systems for structuring science, but further finetuning is recommended to improve their alignment with scientific tasks and mimicry of human expertise.
arXiv Detail & Related papers (2024-05-03T14:03:04Z)
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is an AI-based system for ideation and operationalization of novel work. ResearchAgent automatically defines novel problems, proposes methods and designs experiments, while iteratively refining them. We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
A Bibliometric Review of Large Language Models Research from 2017 to 2023 [1.4190701053683017]
Large language models (LLMs) are language models that have demonstrated outstanding performance across a range of natural language processing (NLP) tasks. This paper serves as a roadmap for researchers, practitioners, and policymakers to navigate the current landscape of LLMs research.
arXiv Detail & Related papers (2023-04-03T21:46:41Z)
Provable Hierarchy-Based Meta-Reinforcement Learning [50.17896588738377]
We analyze HRL in the meta-RL setting, where learner learns latent hierarchical structure during meta-training for use in a downstream task. We provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy. Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.
arXiv Detail & Related papers (2021-10-18T17:56:02Z)
COVID-19 Literature Topic-Based Search via Hierarchical NMF [29.04869940568828]
A dataset of COVID-19-related scientific literature is compiled. hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure.
arXiv Detail & Related papers (2020-09-07T05:45:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.