Scaling Systematic Literature Reviews with Machine Learning Pipelines
- URL: http://arxiv.org/abs/2010.04665v1
- Date: Fri, 9 Oct 2020 16:19:42 GMT
- Title: Scaling Systematic Literature Reviews with Machine Learning Pipelines
- Authors: Seraphina Goldfarb-Tarrant, Alexander Robertson, Jasmina Lazic,
Theodora Tsouloufi, Louise Donnison, Karen Smyth
- Abstract summary: Systematic reviews entail the extraction of data from scientific documents.
We construct a pipeline that automates each of these aspects, and experiment with many human-time vs. system quality trade-offs.
We find that we can get surprising accuracy and generalisability of the whole pipeline system with only 2 weeks of human-expert annotation.
- Score: 57.82662094602138
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Systematic reviews, which entail the extraction of data from large numbers of
scientific documents, are an ideal avenue for the application of machine
learning. They are vital to many fields of science and philanthropy, but are
very time-consuming and require experts. Yet the three main stages of a
systematic review are easily done automatically: searching for documents can be
done via APIs and scrapers, selection of relevant documents can be done via
binary classification, and extraction of data can be done via
sequence-labelling classification. Despite the promise of automation for this
field, little research exists that examines the various ways to automate each
of these tasks. We construct a pipeline that automates each of these aspects,
and experiment with many human-time vs. system quality trade-offs. We test the
ability of classifiers to work well on small amounts of data and to generalise
to data from countries not represented in the training data. We test different
types of data extraction with varying difficulty in annotation, and five
different neural architectures to do the extraction. We find that we can get
surprising accuracy and generalisability of the whole pipeline system with only
2 weeks of human-expert annotation, which is only 15% of the time it takes to
do the whole review manually and can be repeated and extended to new data with
no additional effort.
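As a concrete illustration of the selection stage described in the abstract, the sketch below frames document screening as binary relevance classification over titles and abstracts. It is a minimal sketch with made-up documents, not the authors' pipeline; the extraction stage would additionally need a sequence-labelling model (e.g. a BiLSTM-CRF or transformer token classifier), which is not shown here.

```python
# Minimal sketch of the document-selection stage only: TF-IDF features plus a
# logistic regression classifier that labels each abstract as include/exclude.
# The documents and labels below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

abstracts = [
    "Randomised trial of deworming on school attendance in Kenya",
    "A survey of deep learning architectures for image classification",
    "Cash transfer effects on child nutrition: a cluster randomised trial",
    "GPU kernels for sparse matrix multiplication",
]
labels = [1, 0, 1, 0]  # 1 = relevant to the review, 0 = not relevant

screener = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
screener.fit(abstracts, labels)

# Rank newly retrieved search results by predicted probability of relevance.
new_docs = ["Impact of school feeding programmes on enrolment"]
print(screener.predict_proba(new_docs)[:, 1])
```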
Related papers
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that unsupervised analysis allows a computer to predict negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z)
- A novel evaluation methodology for supervised Feature Ranking algorithms [0.0]
This paper proposes a new evaluation methodology for Feature Rankers.
By making use of synthetic datasets, feature importance scores can be known beforehand, allowing more systematic evaluation.
To facilitate large-scale experimentation using the new methodology, a benchmarking framework was built in Python, called fseval.
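As a rough illustration of the evaluation idea described above (not the fseval framework itself), the sketch below builds a synthetic dataset whose informative features are known by construction and scores a simple ranker against that ground truth.

```python
# Sketch of evaluating a feature ranker on synthetic data with known ground
# truth (illustrative only; not the fseval API). With shuffle=False,
# make_classification puts the informative features at indices 0..4.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           n_redundant=0, shuffle=False, random_state=0)

scores = mutual_info_classif(X, y, random_state=0)  # the "ranker" under test
top5 = set(np.argsort(scores)[-5:])                 # its top-5 picks
recall_at_5 = len(top5 & set(range(5))) / 5         # agreement with ground truth
print(f"Recall@5 of the known informative features: {recall_at_5:.2f}")
```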
arXiv Detail & Related papers (2022-07-09T12:00:36Z)
- Questions Are All You Need to Train a Dense Passage Retriever [123.13872383489172]
ART is a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.
It uses a new document-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence documents, and (2) the documents are then used to compute the probability of reconstructing the original question.
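A rough sketch of the reconstruction-scoring idea in step (2), using an off-the-shelf T5 model rather than ART's actual training code: each retrieved document is scored by how likely the model considers the original question given that document, and the scores are normalised into a soft relevance distribution. The question and documents below are made up.

```python
# Sketch of question-reconstruction scoring with a generic T5 model
# (illustrative only; not the ART implementation).
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
model.eval()

def reconstruction_log_prob(document: str, question: str) -> float:
    inputs = tokenizer(document, return_tensors="pt", truncation=True)
    labels = tokenizer(question, return_tensors="pt", truncation=True).input_ids
    with torch.no_grad():
        out = model(**inputs, labels=labels)
    # out.loss is the mean negative log-likelihood per target token.
    return -out.loss.item() * labels.shape[1]

question = "What is the capital of France?"  # made-up example
docs = ["Paris is the capital and largest city of France.",
        "The Nile is the longest river in Africa."]
scores = torch.tensor([reconstruction_log_prob(d, question) for d in docs])
print(torch.softmax(scores, dim=0))  # soft relevance distribution over the documents
```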
arXiv Detail & Related papers (2022-06-21T18:16:31Z)
- Transfer Learning for Autonomous Chatter Detection in Machining [0.9281671380673306]
Large-amplitude chatter vibrations are one of the most important phenomena in machining processes.
Three challenges can be identified in applying machine learning for chatter detection at large in industry.
These three challenges can be grouped under the umbrella of transfer learning.
arXiv Detail & Related papers (2022-04-11T20:46:06Z)
- Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else? [93.91375268580806]
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms.
Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications.
By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system.
arXiv Detail & Related papers (2021-11-09T13:30:34Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Applications of Machine Learning in Document Digitisation [0.0]
We advocate the use of modern machine learning techniques to automate the digitisation process.
We give an overview of the potential for applying machine digitisation for data collection through two illustrative applications.
The first demonstrates that unsupervised layout classification applied to raw scans of nurse journals can be used to construct a treatment indicator.
The second application uses attention-based neural networks for handwritten text recognition in order to transcribe age and birth and death dates from a large collection of Danish death certificates.
arXiv Detail & Related papers (2021-02-05T15:35:28Z)
- ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension [53.037401638264235]
We present an evaluation server, ORB, that reports performance on seven diverse reading comprehension datasets.
The evaluation server places no restrictions on how models are trained, so it is a suitable test bed for exploring training paradigms and representation learning.
arXiv Detail & Related papers (2019-12-29T07:27:23Z)