LatteReview: A Multi-Agent Framework for Systematic Review Automation Using Large Language Models
- URL: http://arxiv.org/abs/2501.05468v1
- Date: Sun, 05 Jan 2025 17:53:00 GMT
- Title: LatteReview: A Multi-Agent Framework for Systematic Review Automation Using Large Language Models
- Authors: Pouria Rouzrokh, Moein Shariatnia,
- Abstract summary: LatteReview is a Python-based framework that leverages large language models (LLMs) and multi-agent systems to automate key elements of the systematic review process.
The framework supports features such as Retrieval-Augmented Generation (RAG) for incorporating external context, multimodal reviews, Pydantic-based validation for structured inputs and outputs, and asynchronous programming for handling large-scale datasets.
- Score: 0.0
- Abstract: Systematic literature reviews and meta-analyses are essential for synthesizing research insights, but they remain time-intensive and labor-intensive due to the iterative processes of screening, evaluation, and data extraction. This paper introduces and evaluates LatteReview, a Python-based framework that leverages large language models (LLMs) and multi-agent systems to automate key elements of the systematic review process. Designed to streamline workflows while maintaining rigor, LatteReview utilizes modular agents for tasks such as title and abstract screening, relevance scoring, and structured data extraction. These agents operate within orchestrated workflows, supporting sequential and parallel review rounds, dynamic decision-making, and iterative refinement based on user feedback. LatteReview's architecture integrates LLM providers, enabling compatibility with both cloud-based and locally hosted models. The framework supports features such as Retrieval-Augmented Generation (RAG) for incorporating external context, multimodal reviews, Pydantic-based validation for structured inputs and outputs, and asynchronous programming for handling large-scale datasets. The framework is available on the GitHub repository, with detailed documentation and an installable package.
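The abstract describes agents that run in parallel and sequential review rounds, exchange validated structured outputs, and rely on asynchronous execution. The following is a minimal sketch of that pattern, not LatteReview's actual API. Every name here (`ScreeningResult`, `keyword_reviewer`, `screen`) is hypothetical, a standard-library dataclass stands in for Pydantic validation, and simple keyword matching stands in for an LLM call.

```python
import asyncio
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ScreeningResult:
    """Structured reviewer output (hypothetical stand-in for a Pydantic model)."""
    reviewer: str
    score: int  # 1 = clearly exclude ... 5 = clearly include
    reasoning: str

    def __post_init__(self):
        # Reject out-of-range scores so malformed agent output fails early.
        if not 1 <= self.score <= 5:
            raise ValueError(f"score must be in 1..5, got {self.score}")


async def keyword_reviewer(name, keywords, title, abstract):
    """Hypothetical agent: keyword matching replaces the LLM call."""
    await asyncio.sleep(0)  # placeholder for awaiting a model response
    text = f"{title} {abstract}".lower()
    hits = sum(kw in text for kw in keywords)
    return ScreeningResult(name, min(5, 1 + 2 * hits), f"{hits} keyword hit(s)")


async def screen(title, abstract):
    # Round 1: two junior reviewers score the item in parallel.
    juniors = await asyncio.gather(
        keyword_reviewer("junior_a", ["systematic review", "llm"], title, abstract),
        keyword_reviewer("junior_b", ["screening", "meta-analysis"], title, abstract),
    )
    # Round 2: a senior reviewer runs only if the juniors disagree.
    escalated = abs(juniors[0].score - juniors[1].score) >= 2
    if escalated:
        senior = await keyword_reviewer(
            "senior",
            ["systematic review", "llm", "screening", "meta-analysis"],
            title,
            abstract,
        )
        final = senior.score
    else:
        final = round(mean(r.score for r in juniors))
    return {"final_score": final, "escalated": escalated}


if __name__ == "__main__":
    decision = asyncio.run(
        screen("LatteReview", "LLM agents automate systematic review screening.")
    )
    print(decision)
```

Running the senior round only on disagreement keeps the more expensive call off the common path, which is one plausible reading of the "dynamic decision-making" the abstract mentions.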
Related papers
- Dynamic benchmarking framework for LLM-based conversational data capture [0.0]
This paper introduces a benchmarking framework to assess large language models (LLMs).
It integrates generative agent simulation to evaluate performance on key dimensions: information extraction, context awareness, and adaptive engagement.
Results show that adaptive strategies improve data extraction accuracy, especially when handling ambiguous responses.
arXiv Detail & Related papers (2025-02-04T15:47:47Z)
- Large Language Model Based Multi-Agent System Augmented Complex Event Processing Pipeline for Internet of Multimedia Things [8.729059187561761]
This paper presents the development and evaluation of a Large Language Model (LLM) based system framework for complex event processing (CEP).
The primary goal is to create a proof-of-concept that integrates state-of-the-art LLM orchestration frameworks with publish/subscribe (pub/sub) tools to address the integration of LLMs with current CEP systems.
arXiv Detail & Related papers (2025-01-01T17:38:40Z)
- VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation [100.06122876025063]
This paper introduces VisDoMBench, the first comprehensive benchmark designed to evaluate QA systems in multi-document settings.
We propose VisDoMRAG, a novel multimodal Retrieval Augmented Generation (RAG) approach that simultaneously utilizes visual and textual RAG.
arXiv Detail & Related papers (2024-12-14T06:24:55Z)
- SAGEval: The frontiers of Satisfactory Agent based NLG Evaluation for reference-free open-ended text [0.848663031844483]
This paper identifies the need to develop robust evaluation approaches for natural language generation, where references/ground labels do not exist or are not amply available.
We show that the critiquing Agent is able to rectify scores from LLM evaluators, thereby reducing the need for labeled data even for complex NLG evaluation scenarios.
arXiv Detail & Related papers (2024-11-25T04:07:16Z)
- BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains.
BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution.
Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z)
- The Compressor-Retriever Architecture for Language Model OS [20.56093501980724]
This paper explores the concept of using a language model as the core component of an operating system (OS).
A key challenge in realizing such an LM OS is managing the life-long context and ensuring statefulness across sessions.
We introduce compressor-retriever, a model-agnostic architecture designed for life-long context management.
arXiv Detail & Related papers (2024-09-02T23:28:15Z)
- Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions [62.0123588983514]
Large Language Models (LLMs) have demonstrated wide-ranging applications across various fields.
We reformulate the peer-review process as a multi-turn, long-context dialogue, incorporating distinct roles for authors, reviewers, and decision makers.
We construct a comprehensive dataset containing over 26,841 papers with 92,017 reviews collected from multiple sources.
arXiv Detail & Related papers (2024-06-09T08:24:17Z)
- CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems.
Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs).
We propose Cost-Efficient Language Model Alignment (CELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z)
- UltraEval: A Lightweight Platform for Flexible and Comprehensive Evaluation for LLMs [74.1976921342982]
This paper introduces UltraEval, a user-friendly evaluation framework characterized by its lightweight nature, comprehensiveness, modularity, and efficiency.
The resulting composability allows for the free combination of different models, tasks, prompts, benchmarks, and metrics within a unified evaluation workflow.
arXiv Detail & Related papers (2024-04-11T09:17:12Z)
- An Integrated Data Processing Framework for Pretraining Foundation Models [57.47845148721817]
Researchers and practitioners often have to manually curate datasets from different sources.
We propose a data processing framework that integrates a Processing Module and an Analyzing Module.
The proposed framework is easy to use and highly flexible.
arXiv Detail & Related papers (2024-02-26T07:22:51Z)
- BIRCO: A Benchmark of Information Retrieval Tasks with Complex Objectives [2.3420045370973828]
We present the Benchmark of Information Retrieval (IR) tasks with Complex Objectives (BIRCO).
BIRCO evaluates the ability of IR systems to retrieve documents given multi-faceted user objectives.
arXiv Detail & Related papers (2024-02-21T22:22:30Z)