Active Transfer Prototypical Network: An Efficient Labeling Algorithm
for Time-Series Data
- URL: http://arxiv.org/abs/2209.14199v1
- Date: Wed, 28 Sep 2022 16:14:40 GMT
- Title: Active Transfer Prototypical Network: An Efficient Labeling Algorithm
for Time-Series Data
- Authors: Yuqicheng Zhu, Mohamed-Ali Tnani, Timo Jahnz, Klaus Diepold
- Abstract summary: This paper proposes a novel Few-Shot Learning (FSL)-based AL framework, which addresses the trade-off problem by incorporating a Prototypical Network (ProtoNet) in the AL iterations.
This framework was validated on the UCI HAR/HAPT dataset and a real-world braking maneuver dataset.
The learning performance significantly surpasses traditional AL algorithms on both datasets, achieving 90% classification accuracy with 10% and 5% labeling effort, respectively.
- Score: 1.7205106391379026
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The paucity of labeled data is a typical challenge in the automotive
industry. Annotating time-series measurements requires solid domain knowledge
and in-depth exploratory data analysis, which implies a high labeling effort.
Conventional Active Learning (AL) addresses this issue by actively querying the
most informative instances based on the estimated classification probability
and retraining the model iteratively. However, the learning efficiency strongly
relies on the initial model, resulting in a trade-off between the size of the
initial dataset and the number of queries. This paper proposes a novel Few-Shot
Learning (FSL)-based AL framework, which addresses the trade-off problem by
incorporating a Prototypical Network (ProtoNet) in the AL iterations. The
results show an improvement, on the one hand, in the robustness to the initial
model and, on the other hand, in the learning efficiency of the ProtoNet
through the active selection of the support set in each iteration. This
framework was validated on the UCI HAR/HAPT dataset and a real-world braking
maneuver dataset. The learning performance significantly surpasses traditional
AL algorithms on both datasets, achieving 90% classification accuracy with 10%
and 5% labeling effort, respectively.
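A minimal sketch of how such a ProtoNet-in-the-AL-loop could be realized (illustrative only; the encoder `embed`, the annotator callback `label_batch`, and the entropy-based query rule are assumptions, not the paper's released code):

```python
# Sketch: class prototypes are the mean embeddings of the actively selected
# support set; queries are classified by nearest prototype; the most uncertain
# unlabeled window is sent to the annotator next.
import numpy as np

def prototypes(embeddings, labels):
    """Mean embedding per class (the ProtoNet prototypes)."""
    classes = np.unique(labels)
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def class_probs(query_emb, protos):
    """Softmax over negative Euclidean distances to each prototype."""
    d = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    logits = -d - (-d).max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def active_protonet_labeling(embed, windows, label_batch, budget, seed_idx, seed_y):
    """embed: pretrained encoder mapping raw time-series windows to embeddings (assumed fixed here).
    label_batch: annotator/oracle callable returning the label of a requested index."""
    labeled_idx, labels = list(seed_idx), list(seed_y)
    emb_all = embed(windows)
    while len(labeled_idx) < budget:
        _, protos = prototypes(emb_all[labeled_idx], np.array(labels))
        probs = class_probs(emb_all, protos)
        entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
        entropy[labeled_idx] = -np.inf              # never re-query labeled windows
        query = int(entropy.argmax())               # most uncertain instance
        labeled_idx.append(query)
        labels.append(label_batch(query))
    return labeled_idx, labels
```

In this reading, each newly labeled window joins the support set, so the prototypes (and hence the decision boundaries) are refined in every iteration without retraining a full classifier.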
Related papers
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- Towards Efficient Active Learning in NLP via Pretrained Representations [1.90365714903665]
Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications.
We drastically expedite this process by using pretrained representations of LLMs within the active learning loop.
Our strategy yields similar performance to fine-tuning all the way through the active learning loop but is orders of magnitude less computationally expensive.
arXiv Detail & Related papers (2024-02-23T21:28:59Z)
- Active Learning Guided by Efficient Surrogate Learners [25.52920030051264]
Re-training a deep learning model each time a single data point receives a new label is impractical.
We introduce a new active learning algorithm that harnesses the power of a Gaussian process surrogate in conjunction with the neural network principal learner.
Our proposed model adeptly updates the surrogate learner for every new data instance, enabling it to emulate and capitalize on the continuous learning dynamics of the neural network.
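A surrogate-guided query step of this kind might look roughly like the sketch below (assumptions: the GP is fit on the principal network's embeddings and observed per-sample losses; `embed`, `labeled_loss`, and the loss-plus-uncertainty ranking are illustrative, not the paper's code):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def rank_pool_by_surrogate(embed, labeled_x, labeled_loss, pool_x):
    """Fit a cheap GP surrogate on the principal network's embeddings and
    per-sample losses, then rank unlabeled pool points by expected loss
    plus predictive uncertainty."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
    gp.fit(embed(labeled_x), labeled_loss)          # updated per new label, no NN retraining
    mean, std = gp.predict(embed(pool_x), return_std=True)
    return np.argsort(-(mean + std))                # most promising queries first
```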
arXiv Detail & Related papers (2023-01-07T01:35:25Z)
- Active Learning Framework to Automate Network Traffic Classification [0.0]
The paper presents a novel Active Learning Framework (ALF) to address this problem.
ALF provides components that can be used to deploy an active learning loop and maintain an ALF instance that continuously evolves a dataset and ML model.
The resulting solution is deployable for IP flow-based analysis of high-speed (100 Gb/s) networks.
arXiv Detail & Related papers (2022-10-26T10:15:18Z)
- Active Learning at the ImageNet Scale [43.595076693347835]
In this work, we study a combination of active learning (AL) and self-supervised pretraining (SSP) on ImageNet.
We find that performance on small toy datasets is not representative of performance on ImageNet due to the class-imbalanced samples selected by an active learner.
We propose Balanced Selection (BASE), a simple, scalable AL algorithm that outperforms random sampling consistently.
arXiv Detail & Related papers (2021-11-25T02:48:51Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based AL strategies are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We establish new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
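One common way to realize such uncertainty-aware self-training is Monte-Carlo dropout filtering of pseudo-labels, sketched below (a hypothetical illustration; the Keras-style `model(x, training=True)` call and the thresholds are assumptions, not the paper's exact method):

```python
import numpy as np

def mc_dropout_probs(model, x, n_passes=10):
    """Average class probabilities over stochastic forward passes.
    Assumes a Keras-style callable where training=True keeps dropout active."""
    draws = np.stack([np.asarray(model(x, training=True)) for _ in range(n_passes)])
    return draws.mean(axis=0), draws.std(axis=0)

def select_pseudo_labels(model, unlabeled_x, min_conf=0.9, max_std=0.05):
    """Keep only pseudo-labels the model is both confident and stable about."""
    mean_p, std_p = mc_dropout_probs(model, unlabeled_x)
    preds = mean_p.argmax(axis=1)
    keep = (mean_p.max(axis=1) >= min_conf) & (std_p.max(axis=1) <= max_std)
    return np.flatnonzero(keep), preds[keep]
```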
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
Since training with such an enlarged dataset is time-consuming, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Fase-AL -- Adaptation of Fast Adaptive Stacking of Ensembles for Supporting Active Learning [0.0]
This work presents the FASE-AL algorithm, which induces classification models from unlabeled instances using Active Learning.
The algorithm achieves promising results in terms of the percentage of correctly classified instances.
arXiv Detail & Related papers (2020-01-30T17:25:47Z)