Modeling Disagreement in Automatic Data Labelling for Semi-Supervised
Learning in Clinical Natural Language Processing
- URL: http://arxiv.org/abs/2205.14761v1
- Date: Sun, 29 May 2022 20:20:49 GMT
- Authors: Hongshu Liu, Nabeel Seedat, Julia Ive
- Abstract summary: We investigate the quality of uncertainty estimates from a range of current state-of-the-art predictive models applied to the problem of observation detection in radiology reports.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational models providing accurate estimates of their uncertainty are
crucial for risk management associated with decision making in healthcare
contexts. This is especially true since many state-of-the-art systems are
trained using data that has been labelled automatically (self-supervised
mode) and tend to overfit. In this work, we investigate the quality of
uncertainty estimates from a range of current state-of-the-art predictive
models applied to the problem of observation detection in radiology reports.
This problem remains understudied for Natural Language Processing in the
healthcare domain. We demonstrate that Gaussian Processes (GPs) provide
superior performance in quantifying the risks of 3 uncertainty labels based on
the negative log predictive probability (NLPP) evaluation metric and mean
maximum predicted confidence levels (MMPCL), whilst retaining strong predictive
performance.
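
The abstract names the two evaluation metrics without defining them. As a rough illustration only, the sketch below assumes NLPP is the average negative log probability a model assigns to the true label, and MMPCL is the average of the highest class probability per example. These definitions, the function names `nlpp` and `mmpcl`, and the toy probabilities are assumptions made for this example and are not taken from the paper.

```python
import math
from typing import Sequence


def nlpp(probs: Sequence[Sequence[float]], labels: Sequence[int],
         eps: float = 1e-12) -> float:
    """Assumed definition: mean of -log p(true label) over all examples.

    Lower is better; confident wrong predictions are penalised heavily.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to eps so a zero probability does not raise a math error.
        total += -math.log(max(p[y], eps))
    return total / len(labels)


def mmpcl(probs: Sequence[Sequence[float]]) -> float:
    """Assumed definition: mean of the maximum predicted probability.

    Measures how confident the model is on average, regardless of
    whether its top prediction is correct.
    """
    return sum(max(p) for p in probs) / len(probs)


# Toy predictions over three hypothetical classes (illustrative values only).
toy_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]
toy_labels = [0, 1, 2]

print("NLPP:", nlpp(toy_probs, toy_labels))
print("MMPCL:", mmpcl(toy_probs))
```

Under these assumed definitions, the two numbers are read together: a model with low NLPP has placed high probability on the correct labels, while MMPCL computed on a subset of hard or mislabelled examples indicates whether the model stays overconfident there.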