AiReview: An Open Platform for Accelerating Systematic Reviews with LLMs
- URL: http://arxiv.org/abs/2504.04193v1
- Date: Sat, 05 Apr 2025 14:55:43 GMT
- Title: AiReview: An Open Platform for Accelerating Systematic Reviews with LLMs
- Authors: Xinyu Mao, Teerapong Leelanupab, Martin Potthast, Harrisen Scells, Guido Zuccon,
- Abstract summary: AiReview is a novel platform for LLM-assisted systematic review creation.<n>It is the first of its kind to bridge the gap between cutting-edge LLM-assisted screening methods and those that create medical systematic reviews.
- Score: 43.1999161587789
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Systematic reviews are fundamental to evidence-based medicine. Creating one is time-consuming and labour-intensive, mainly due to the need to screen, or assess, many studies for inclusion in the review. Several tools have been developed to streamline this process, mostly relying on traditional machine learning methods. Large language models (LLMs) have shown potential in further accelerating the screening process. However, no tool currently allows end users to directly leverage LLMs for screening or facilitates systematic and transparent usage of LLM-assisted screening methods. This paper introduces (i) an extensible framework for applying LLMs to systematic review tasks, particularly title and abstract screening, and (ii) a web-based interface for LLM-assisted screening. Together, these elements form AiReview-a novel platform for LLM-assisted systematic review creation. AiReview is the first of its kind to bridge the gap between cutting-edge LLM-assisted screening methods and those that create medical systematic reviews. The tool is available at https://aireview.ielab.io. The source code is also open sourced at https://github.com/ielab/ai-review.
Related papers
- Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation [44.58099275559231]
Large language models (LLMs) are increasingly integral to information retrieval (IR), powering ranking, evaluation, and AI-assisted content creation.
This paper synthesizes existing research and presents novel experiment designs that explore how LLM-based rankers and assistants influence LLM-based judges.
arXiv Detail & Related papers (2025-03-24T19:24:40Z) - LitLLMs, LLMs for Literature Review: Are we there yet? [15.785989492351684]
This paper explores the zero-shot abilities of recent Large Language Models in assisting with the writing of literature reviews based on an abstract.<n>For retrieval, we introduce a novel two-step search strategy that first uses an LLM to extract meaningful keywords from the abstract of a paper.<n>In the generation phase, we propose a two-step approach that first outlines a plan for the review and then executes steps in the plan to generate the actual review.
arXiv Detail & Related papers (2024-12-15T01:12:26Z) - Aligning Language Models with Demonstrated Feedback [58.834937450242975]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors.
We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
arXiv Detail & Related papers (2024-06-02T23:13:56Z) - Get my drift? Catching LLM Task Drift with Activation Deltas [55.75645403965326]
Task drift allows attackers to exfiltrate data or influence the LLM's output for other users.<n>We show that a simple linear classifier can detect drift with near-perfect ROC AUC on an out-of-distribution test set.<n>We observe that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions.
arXiv Detail & Related papers (2024-06-02T16:53:21Z) - The Promise and Challenges of Using LLMs to Accelerate the Screening Process of Systematic Reviews [7.030989629685138]
Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners.
We performed an experiment where humans screened titles and abstracts for 20 papers with both original and simplified abstracts from a prior Systematic Review.
We also studied if different prompting techniques (Zero-shot (ZS), One-shot (OS), Few-shot (FS), and Few-shot with Chain-of-Thought (FS-CoT)) improve the screening performance of LLMs.
arXiv Detail & Related papers (2024-04-24T05:53:20Z) - ST-LLM: Large Language Models Are Effective Temporal Learners [58.79456373423189]
Large Language Models (LLMs) have showcased impressive capabilities in text comprehension and generation.
How to effectively encode and understand videos in video-based dialogue systems remains to be solved.
We propose ST-LLM, an effective video-LLM baseline with spatial-temporal sequence modeling inside LLM.
arXiv Detail & Related papers (2024-03-30T10:11:26Z) - LitLLM: A Toolkit for Scientific Literature Review [15.785989492351684]
We propose a toolkit that operates on Retrieval Augmented Generation (RAG) principles.<n>Our system first initiates a web search to retrieve relevant papers.<n>Second, the system re-ranks the retrieved papers based on the user-provided abstract.<n>Third, the related work section is generated based on the re-ranked results and the abstract.
arXiv Detail & Related papers (2024-02-02T02:41:28Z) - Video Understanding with Large Language Models: A Survey [97.29126722004949]
Given the remarkable capabilities of large language models (LLMs) in language and multimodal tasks, this survey provides a detailed overview of recent advancements in video understanding.
The emergent capabilities Vid-LLMs are surprisingly advanced, particularly their ability for open-ended multi-granularity reasoning.
This survey presents a comprehensive study of the tasks, datasets, benchmarks, and evaluation methodologies for Vid-LLMs.
arXiv Detail & Related papers (2023-12-29T01:56:17Z) - LM-Polygraph: Uncertainty Estimation for Language Models [71.21409522341482]
Uncertainty estimation (UE) methods are one path to safer, more responsible, and more effective use of large language models (LLMs)
We introduce LM-Polygraph, a framework with implementations of a battery of state-of-the-art UE methods for LLMs in text generation tasks, with unified program interfaces in Python.
It introduces an extendable benchmark for consistent evaluation of UE techniques by researchers, and a demo web application that enriches the standard chat dialog with confidence scores.
arXiv Detail & Related papers (2023-11-13T15:08:59Z) - Calculating Originality of LLM Assisted Source Code [0.0]
We propose a neural network-based tool to determine the original effort (and LLM's contribution) put by students in writing source codes.
Our tool is motivated by minimum description length measures like Kolmogorov complexity.
arXiv Detail & Related papers (2023-07-10T11:30:46Z) - Low-code LLM: Graphical User Interface over Large Language Models [115.08718239772107]
This paper introduces a novel human-LLM interaction framework, Low-code LLM.
It incorporates six types of simple low-code visual programming interactions to achieve more controllable and stable responses.
We highlight three advantages of the low-code LLM: user-friendly interaction, controllable generation, and wide applicability.
arXiv Detail & Related papers (2023-04-17T09:27:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.