Multi-Scored Sleep Databases: How to Exploit the Multiple-Labels in
Automated Sleep Scoring
- URL: http://arxiv.org/abs/2207.01910v2
- Date: Wed, 6 Jul 2022 08:24:39 GMT
- Title: Multi-Scored Sleep Databases: How to Exploit the Multiple-Labels in
Automated Sleep Scoring
- Authors: Luigi Fiorillo, Davide Pedroncelli, Valentina Agostini, Paolo Favaro,
Francesca Dalia Faraci
- Abstract summary: We exploit the label smoothing technique together with a soft-consensus distribution to incorporate the knowledge of multiple scorers into the training procedure of the model.
We introduce the averaged cosine similarity metric to quantify the similarity between the hypnodensity graph generated by the models trained with LSSC and the hypnodensity graph generated by the scorer consensus.
- Score: 19.24428734909019
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Study Objectives: Inter-scorer variability in scoring polysomnograms is a
well-known problem. Most of the existing automated sleep scoring systems are
trained using labels annotated by a single scorer, whose subjective evaluation
is transferred to the model. When annotations from two or more scorers are
available, the scoring models are usually trained on the scorer consensus. The
averaged scorers' subjectivity is transferred into the model, losing
information about the internal variability among the different scorers. In this
study, we aim to incorporate the knowledge of multiple physicians into the
training procedure. The goal is to optimize model training by exploiting the
full information that can be extracted from the consensus of a group of
scorers.
Methods: We train two lightweight deep learning based models on three
different multi-scored databases. We exploit the label smoothing technique
together with a soft-consensus distribution (LSSC) to incorporate the knowledge
of multiple scorers into the training procedure of the model. We introduce the
averaged cosine similarity (ACS) metric to quantify the similarity between the
hypnodensity graph generated by the models trained with LSSC and the
hypnodensity graph generated by the scorer consensus.
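As a rough illustration of the LSSC idea described above, the training targets could be built by blending the one-hot majority consensus with the empirical distribution of the scorers' votes per epoch. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the smoothing weight `alpha`, and the exact blending scheme are hypothetical.

```python
import numpy as np

def lssc_targets(scorer_labels, n_classes=5, alpha=0.1):
    """Hypothetical sketch of label smoothing with soft consensus (LSSC).

    scorer_labels: int array of shape (n_scorers, n_epochs), one sleep-stage
    label per scorer per 30-s epoch. Returns (n_epochs, n_classes) training
    targets that blend the one-hot majority consensus with the empirical
    per-epoch distribution of scorer votes (the soft consensus).
    """
    n_scorers, n_epochs = scorer_labels.shape
    # Soft consensus: fraction of scorers voting for each stage, per epoch.
    soft = np.zeros((n_epochs, n_classes))
    for s in range(n_scorers):
        soft[np.arange(n_epochs), scorer_labels[s]] += 1.0 / n_scorers
    # One-hot majority-vote consensus.
    onehot = np.eye(n_classes)[soft.argmax(axis=1)]
    # Label smoothing toward the soft-consensus distribution instead of
    # toward a uniform distribution, as in classic label smoothing.
    return (1.0 - alpha) * onehot + alpha * soft
```

Each target row remains a valid probability distribution, so it can be used directly with a standard cross-entropy loss.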
Results: The performance of the models improves on all the databases when we
train the models with our LSSC. We found an increase in ACS (up to 6.4%)
between the hypnodensity graph generated by the models trained with LSSC and
the hypnodensity graph generated by the consensus.
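The ACS values reported above could plausibly be computed as the per-epoch cosine similarity between the two hypnodensity graphs, averaged over all epochs. This sketch is an assumption about the exact formulation, not the paper's code.

```python
import numpy as np

def averaged_cosine_similarity(h_model, h_consensus):
    """Assumed ACS formulation: cosine similarity between the class-probability
    rows of two hypnodensity graphs, one row per 30-s epoch, averaged over
    epochs. Both inputs have shape (n_epochs, n_classes)."""
    num = np.sum(h_model * h_consensus, axis=1)
    den = np.linalg.norm(h_model, axis=1) * np.linalg.norm(h_consensus, axis=1)
    return float(np.mean(num / den))
```

Identical hypnodensity graphs yield an ACS of 1.0, and fully disagreeing (orthogonal) per-epoch distributions yield 0.0, so a 6.4% increase reflects rows that point more consistently in the consensus direction.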
Conclusions: Our approach enables a model to better adapt to the consensus of
the group of scorers. Future work will focus on further investigation of
different scoring architectures.
Related papers
- Deep Unlearn: Benchmarking Machine Unlearning [7.450700594277741]
Machine unlearning (MU) aims to remove the influence of particular data points from the learnable parameters of a trained machine learning model.
This paper investigates 18 state-of-the-art MU methods across various benchmark datasets and models.
arXiv Detail & Related papers (2024-10-02T06:41:58Z)
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z)
- Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions [2.277447144331876]
We investigate a collection of models that account for the individual preferences and tendencies of each human scorer in the automated scoring task.
We conduct quantitative experiments and case studies to analyze the individual preferences and tendencies of scorers.
arXiv Detail & Related papers (2023-06-01T15:22:05Z)
- Towards Similarity-Aware Time-Series Classification [51.2400839966489]
We study time-series classification (TSC), a fundamental task of time-series data mining.
We propose Similarity-Aware Time-Series Classification (SimTSC), a framework that models similarity information with graph neural networks (GNNs).
arXiv Detail & Related papers (2022-01-05T02:14:57Z)
- Resilience from Diversity: Population-based approach to harden models against adversarial attacks [0.0]
This work introduces a model that is resilient to adversarial attacks.
Our model leverages a well-established principle from the biological sciences: population diversity produces resilience against environmental changes.
A Counter-Linked Model (CLM) consists of submodels of the same architecture where a periodic random similarity examination is conducted.
arXiv Detail & Related papers (2021-11-19T15:22:21Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Evaluation Toolkit For Robustness Testing Of Automatic Essay Scoring Systems [64.4896118325552]
We evaluate the current state-of-the-art AES models using a model adversarial evaluation scheme and associated metrics.
We find that AES models are highly overstable. Even heavy modifications (as much as 25%) with content unrelated to the topic of the questions do not decrease the score produced by the models.
arXiv Detail & Related papers (2020-07-14T03:49:43Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
- AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses [97.50616524350123]
We build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering.
The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch.
The second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level.
The third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal.
arXiv Detail & Related papers (2020-01-15T18:32:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.