Efficient Systematic Reviews: Literature Filtering with Transformers & Transfer Learning
- URL: http://arxiv.org/abs/2405.20354v2
- Date: Thu, 10 Oct 2024 23:20:34 GMT
- Title: Efficient Systematic Reviews: Literature Filtering with Transformers & Transfer Learning
- Authors: John Hawkins, David Tivey,
- Abstract summary: It comes with an increasing burden in terms of the time required to identify the important articles of research for a given topic.
We develop a method for building a general-purpose filtering system that matches a research question, posed as a natural language description of the required content.
Our results demonstrate that transformer models, pre-trained on biomedical literature, and then fine tuned for the specific task, offer a promising solution to this problem.
- Score: 0.0
- License:
- Abstract: Identifying critical research within the growing body of academic work is an intrinsic aspect of conducting quality research. Systematic review processes used in evidence-based medicine formalise this as a procedure that must be followed in a research program. However, it comes with an increasing burden in terms of the time required to identify the important articles of research for a given topic. In this work, we develop a method for building a general-purpose filtering system that matches a research question, posed as a natural language description of the required content, against a candidate set of articles obtained via the application of broad search terms. Our results demonstrate that transformer models, pre-trained on biomedical literature, and then fine tuned for the specific task, offer a promising solution to this problem. The model can remove large volumes of irrelevant articles for most research questions. Furthermore, analysis of the specific research questions in our training data suggest natural avenues for further improvement.
Related papers
- Automating Bibliometric Analysis with Sentence Transformers and Retrieval-Augmented Generation (RAG): A Pilot Study in Semantic and Contextual Search for Customized Literature Characterization for High-Impact Urban Research [2.1728621449144763]
Bibliometric analysis is essential for understanding research trends, scope, and impact in urban science.
Traditional methods, relying on keyword searches, often fail to uncover valuable insights not explicitly stated in article titles or keywords.
We leverage Generative AI models, specifically transformers and Retrieval-Augmented Generation (RAG), to automate and enhance bibliometric analysis.
arXiv Detail & Related papers (2024-10-08T05:13:27Z) - LLAssist: Simple Tools for Automating Literature Review Using Large Language Models [0.0]
LLAssist is an open-source tool designed to streamline literature reviews in academic research.
It uses Large Language Models (LLMs) and Natural Language Processing (NLP) techniques to automate key aspects of the review process.
arXiv Detail & Related papers (2024-07-19T02:48:54Z) - ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is a large language model-powered research idea writing agent.
It generates problems, methods, and experiment designs while iteratively refining them based on scientific literature.
We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z) - SurveyAgent: A Conversational System for Personalized and Efficient Research Survey [50.04283471107001]
This paper introduces SurveyAgent, a novel conversational system designed to provide personalized and efficient research survey assistance to researchers.
SurveyAgent integrates three key modules: Knowledge Management for organizing papers, Recommendation for discovering relevant literature, and Query Answering for engaging with content on a deeper level.
Our evaluation demonstrates SurveyAgent's effectiveness in streamlining research activities, showcasing its capability to facilitate how researchers interact with scientific literature.
arXiv Detail & Related papers (2024-04-09T15:01:51Z) - System for systematic literature review using multiple AI agents:
Concept and an empirical evaluation [5.194208843843004]
We introduce a novel multi-AI agent model designed to fully automate the process of conducting Systematic Literature Reviews.
The model operates through a user-friendly interface where researchers input their topic.
It generates a search string used to retrieve relevant academic papers.
The model then autonomously summarizes the abstracts of these papers.
arXiv Detail & Related papers (2024-03-13T10:27:52Z) - An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z) - De-identification of clinical free text using natural language
processing: A systematic review of current approaches [48.343430343213896]
Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process.
Our study aims to provide systematic evidence on how the de-identification of clinical free text has evolved in the last thirteen years.
arXiv Detail & Related papers (2023-11-28T13:20:41Z) - Application of Transformers based methods in Electronic Medical Records:
A Systematic Literature Review [77.34726150561087]
This work presents a systematic literature review of state-of-the-art advances using transformer-based methods on electronic medical records (EMRs) in different NLP tasks.
arXiv Detail & Related papers (2023-04-05T22:19:42Z) - Research Trends and Applications of Data Augmentation Algorithms [77.34726150561087]
We identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in data augmentation literature.
We expect readers to understand the potential of data augmentation, as well as identify future research directions and open questions within data augmentation research.
arXiv Detail & Related papers (2022-07-18T11:38:32Z) - Best Practices and Scoring System on Reviewing A.I. based Medical
Imaging Papers: Part 1 Classification [0.9428556282541211]
The Machine Learning Education Sub-Committee of SIIM has identified a knowledge gap and a serious need to establish guidelines for reviewing these studies.
This first entry in the series focuses on the task of image classification.
The goal of this series is to provide resources to help improve the review process for A.I.-based medical imaging papers.
arXiv Detail & Related papers (2022-02-03T21:46:59Z) - The CSO Classifier: Ontology-Driven Detection of Research Topics in
Scholarly Articles [0.0]
We present a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO)
The CSO takes as input the metadata associated with a research paper (title, abstract, keywords) and returns a selection of research concepts drawn from the ontology.
The approach was evaluated on a gold standard of manually annotated articles yielding a significant improvement over alternative methods.
arXiv Detail & Related papers (2021-04-02T09:02:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.