System for systematic literature review using multiple AI agents: Concept and an empirical evaluation
- URL: http://arxiv.org/abs/2403.08399v1
- Date: Wed, 13 Mar 2024 10:27:52 GMT
- Title: System for systematic literature review using multiple AI agents: Concept and an empirical evaluation
- Authors: Abdul Malik Sami, Zeeshan Rasheed, Kai-Kristian Kemell, Muhammad Waseem, Terhi Kilamo, Mika Saari, Anh Nguyen Duc, Kari Systä, Pekka Abrahamsson
- Abstract summary: We introduce a novel multi-AI agent model designed to fully automate the process of conducting Systematic Literature Reviews.
The model operates through a user-friendly interface where researchers input their topic.
It generates a search string used to retrieve relevant academic papers.
The model then autonomously summarizes the abstracts of these papers.
- Score: 5.194208843843004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Systematic Literature Reviews (SLRs) have become the foundation of
evidence-based studies, enabling researchers to identify, classify, and combine
existing studies based on specific research questions. Conducting an SLR is
largely a manual process. In recent years, researchers have made
significant progress in automating certain phases of the SLR process, aiming to
reduce the effort and time needed to carry out high-quality SLRs. However,
there is still a lack of AI agent-based models that automate the entire SLR
process. To this end, we introduce a novel multi-AI agent model designed to
fully automate the process of conducting an SLR. By utilizing the capabilities
of Large Language Models (LLMs), our proposed model streamlines the review
process, enhancing efficiency and accuracy. The model operates through a
user-friendly interface where researchers input their topic, and in response,
the model generates a search string used to retrieve relevant academic papers.
Subsequently, an inclusion and exclusion filtering process is applied, focusing
on titles relevant to the specific research area. The model then autonomously
summarizes the abstracts of these papers, retaining only those directly related
to the field of study. In the final phase, the model conducts a thorough
analysis of the selected papers concerning predefined research questions. We
also evaluated the proposed model by sharing it with ten competent software
engineering researchers for testing and analysis. The researchers expressed
strong satisfaction with the proposed model and provided feedback for further
improvement. The code for this project can be found on the GitHub repository at
https://github.com/GPT-Laboratory/SLR-automation.
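The abstract outlines a staged pipeline: topic, search string, retrieval, title-based inclusion/exclusion filtering, abstract summarization, and analysis against research questions. Below is a minimal sketch of how such a multi-agent pipeline could be wired together; it is not the authors' implementation (see their GitHub repository for that). The `query_llm` helper, the agent prompts, and the use of the public arXiv API for retrieval are all illustrative assumptions.
```python
"""Minimal sketch of the staged multi-agent SLR pipeline described in the
abstract. NOT the authors' implementation; `query_llm`, the agent prompts,
and the arXiv API retrieval step are illustrative assumptions."""
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-style LLM API call."""
    raise NotImplementedError("wire this to your LLM provider")


def search_string_agent(topic: str) -> str:
    # Stage 1: turn the researcher's topic into a boolean search string.
    return query_llm(
        f"Write a concise boolean search string for the topic: {topic}. "
        "Return only the search string."
    )


def retrieve_papers(search_string: str, max_results: int = 25) -> list[dict]:
    # Stage 2: retrieve candidate papers (here via the public arXiv API).
    url = "http://export.arxiv.org/api/query?" + urllib.parse.urlencode(
        {"search_query": f"all:{search_string}", "max_results": max_results}
    )
    ns = {"a": "http://www.w3.org/2005/Atom"}
    root = ET.fromstring(urllib.request.urlopen(url).read())
    return [
        {
            "title": entry.findtext("a:title", "", ns).strip(),
            "abstract": entry.findtext("a:summary", "", ns).strip(),
        }
        for entry in root.findall("a:entry", ns)
    ]


def title_filter_agent(papers: list[dict], topic: str) -> list[dict]:
    # Stage 3: inclusion/exclusion filtering based on titles only.
    return [
        p
        for p in papers
        if "INCLUDE"
        in query_llm(
            f"Topic: {topic}\nTitle: {p['title']}\n"
            "Answer INCLUDE or EXCLUDE based on title relevance."
        ).upper()
    ]


def summarize_agent(papers: list[dict]) -> list[dict]:
    # Stage 4: summarize abstracts of the papers that survived filtering.
    for p in papers:
        p["summary"] = query_llm(f"Summarize in two sentences: {p['abstract']}")
    return papers


def analysis_agent(papers: list[dict], research_questions: list[str]) -> dict:
    # Stage 5: analyze the retained papers against predefined RQs.
    corpus = "\n\n".join(f"{p['title']}: {p['summary']}" for p in papers)
    return {
        rq: query_llm(f"Papers:\n{corpus}\n\nAnswer this research question: {rq}")
        for rq in research_questions
    }
```
A driver would simply chain the stages: `analysis_agent(summarize_agent(title_filter_agent(retrieve_papers(search_string_agent(topic)), topic)), rqs)`.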
Related papers
- CycleResearcher: Improving Automated Research via Automated Review [37.03497673861402]
This paper explores the possibility of using open-source post-trained large language models (LLMs) as autonomous agents capable of performing the full cycle of automated research and review.
To train these models, we develop two new datasets, reflecting real-world machine learning research and peer review dynamics.
On the research side, papers generated by the CycleResearcher model achieved a score of 5.36 in simulated peer review, surpassing the 5.24 preprint-level average from human experts and approaching the 5.69 level of accepted papers.
arXiv Detail & Related papers (2024-10-28T08:10:21Z)
- From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems.
The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness.
This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying these into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z)
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient technique in the machine learning community for combining multiple trained models into one.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
- Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval-enhancement can be extended to a broader spectrum of machine learning (ML) domains.
This work introduces a formal framework for this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature across various ML domains with consistent notation, which is missing from the current literature.
The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
arXiv Detail & Related papers (2024-07-17T20:01:21Z)
- Automatic benchmarking of large multimodal models via iterative experiment programming [71.78089106671581]
We present APEx, the first framework for automatic benchmarking of LMMs.
Given a research question expressed in natural language, APEx leverages a large language model (LLM) and a library of pre-specified tools to generate a set of experiments for the model at hand.
A progressively compiled scientific report drives the testing procedure: based on the current status of the investigation, APEx chooses which experiments to perform and whether the results are sufficient to draw conclusions.
arXiv Detail & Related papers (2024-06-18T06:43:46Z)
- RelevAI-Reviewer: A Benchmark on AI Reviewers for Survey Paper Relevance [0.8089605035945486]
We propose RelevAI-Reviewer, an automatic system that conceptualizes the task of survey paper review as a classification problem.
We introduce a novel dataset comprising 25,164 instances, each containing one prompt and four candidate papers of varying relevance to the prompt.
We develop a machine learning (ML) model capable of determining the relevance of each paper and identifying the most pertinent one (a toy sketch of this ranking setup appears after this list).
arXiv Detail & Related papers (2024-06-13T06:42:32Z)
- Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning [0.9110413356918055]
This research pioneers the use of fine-tuned Large Language Models (LLMs) to automate Systematic Literature Reviews (SLRs).
Our study employed the latest fine-tuning methodologies together with open-sourced LLMs, and demonstrated a practical and efficient approach to automating the final execution stages of an SLR process.
The results maintained high fidelity in factual accuracy in LLM responses, and were validated through the replication of an existing PRISMA-conforming SLR.
arXiv Detail & Related papers (2024-04-08T00:08:29Z)
- Artificial Intelligence for Literature Reviews: Opportunities and Challenges [0.0]
This manuscript presents a comprehensive review of the use of Artificial Intelligence in Systematic Literature Reviews.
An SLR is a rigorous and organised methodology that assesses and integrates previous research on a given topic.
We examine 21 leading SLR tools using a framework that combines 23 traditional features with 11 AI features.
arXiv Detail & Related papers (2024-02-13T16:05:51Z)
- Emerging Results on Automated Support for Searching and Selecting Evidence for Systematic Literature Review Updates [1.1153433121962064]
We present emerging results on an automated approach to support searching and selecting studies for SLR updates in Software Engineering.
We developed an automated tool prototype to perform the snowballing search technique and to support selecting relevant studies for SLR updates using Machine Learning (ML) algorithms (a minimal snowballing loop is sketched after this list).
arXiv Detail & Related papers (2024-02-07T23:39:20Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose Auto-J, a generative judge with 13B parameters, designed to address the challenges of evaluating LLM alignment.
Our model is trained on user queries and LLM-generated responses under massive real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
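RelevAI-Reviewer (above) frames survey-paper relevance as a classification problem over prompt/candidate pairs. The following toy sketch, assuming a TF-IDF plus logistic-regression model and placeholder training data, shows the shape of that setup; it is not the benchmark's actual model.
```python
"""Toy sketch of survey relevance ranking as classification, inspired by
(not reproducing) RelevAI-Reviewer. The TF-IDF + logistic-regression model
and the tiny training set below are placeholder assumptions."""
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each example pairs "prompt [SEP] candidate abstract" with a binary
# relevance label; a real system would train on an annotated corpus.
train_texts = [
    "LLM agents for SLR [SEP] We automate literature reviews with agents.",
    "LLM agents for SLR [SEP] We study protein folding with transformers.",
    "model merging survey [SEP] We review weight averaging and merging.",
    "model merging survey [SEP] We benchmark image classifiers on CIFAR.",
]
train_labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_texts, train_labels)


def most_relevant(prompt: str, candidates: list[str]) -> str:
    # Score each candidate's probability of being relevant; keep the best.
    scores = clf.predict_proba([f"{prompt} [SEP] {c}" for c in candidates])[:, 1]
    return candidates[scores.argmax()]
```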
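The SLR-update entry above relies on the snowballing search technique. As a minimal sketch, assuming hypothetical `get_references`, `get_citations`, and `is_relevant` hooks (in practice a citation API and a trained ML classifier), an iterated backward/forward snowballing loop could look like this:
```python
"""Illustrative iterated snowballing loop; the paper's actual tool is not
shown here. `get_references`, `get_citations`, and `is_relevant` are
hypothetical hooks supplied by the caller."""
from collections import deque


def snowball(seeds, get_references, get_citations, is_relevant, max_rounds=3):
    selected = set(seeds)
    frontier = deque(seeds)
    for _ in range(max_rounds):
        next_frontier = deque()
        while frontier:
            paper = frontier.popleft()
            # Backward snowballing follows the paper's reference list;
            # forward snowballing follows papers that cite it.
            for candidate in get_references(paper) + get_citations(paper):
                if candidate not in selected and is_relevant(candidate):
                    selected.add(candidate)
                    next_frontier.append(candidate)
        frontier = next_frontier
    return selected
```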