Data augmentation on-the-fly and active learning in data stream classification
- URL: http://arxiv.org/abs/2210.06873v1
- Date: Thu, 13 Oct 2022 09:57:08 GMT
- Title: Data augmentation on-the-fly and active learning in data stream classification
- Authors: Kleanthis Malialis and Dimitris Papatheodoulou and Stylianos Filippou and Christos G. Panayiotou and Marios M. Polycarpou
- Abstract summary: There is an emerging need for predictive models to be trained on-the-fly.
With Augmented Queues, learning models have access to more labelled data without the need to increase the active learning budget.
Augmented Queues significantly improves performance in terms of both learning quality and speed.
- Score: 9.367903089535684
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: There is an emerging need for predictive models to be trained on-the-fly,
since in numerous machine learning applications data are arriving in an online
fashion. A critical challenge encountered is that of limited availability of
ground truth information (e.g., labels in classification tasks) as new data are
observed one-by-one online, while another significant challenge is that of
class imbalance. This work introduces the novel Augmented Queues method, which
addresses this dual problem by synergistically combining online active
learning, data augmentation, and a multi-queue memory that maintains separate and
balanced queues for each class. We perform an extensive experimental study
using image and time-series augmentations, in which we examine the roles of the
active learning budget, memory size, imbalance level, and neural network type.
We demonstrate two major advantages of Augmented Queues. First, it does not
reserve additional memory space, as the generation of synthetic data occurs only
at training time. Second, learning models have access to more labelled data
without the need to increase the active learning budget and/or the original
memory size. Learning on-the-fly poses major challenges which, typically,
hinder the deployment of learning models. Augmented Queues significantly
improves the performance in terms of learning quality and speed. Our code is
made publicly available.
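The abstract describes the mechanism precisely enough to sketch it: per-class balanced queues, a labelling budget for online active learning, and augmentation applied only when a training batch is drawn. The Python sketch below is illustrative only; the class and parameter names (`AugmentedQueues`, `augment_fn`, `query_strategy`, `oracle`) are ours, not the authors' released code.
```python
import random
from collections import deque

class AugmentedQueues:
    """Illustrative sketch (not the authors' code): per-class balanced queues,
    a labelling budget for online active learning, and augmentation applied
    only when a training batch is drawn, so no synthetic data is stored."""

    def __init__(self, num_classes, queue_size, budget, augment_fn):
        # One bounded FIFO queue per class keeps the memory balanced.
        self.queues = [deque(maxlen=queue_size) for _ in range(num_classes)]
        self.budget = budget          # fraction of stream instances we may label
        self.spent = 0                # labels requested so far
        self.seen = 0                 # instances observed so far
        self.augment_fn = augment_fn  # e.g. an image or time-series augmentation

    def observe(self, x, query_strategy, oracle):
        """Process one unlabelled instance arriving from the stream."""
        self.seen += 1
        within_budget = self.spent / self.seen < self.budget
        if within_budget and query_strategy(x):  # e.g. uncertainty-based query
            y = oracle(x)             # ask the annotator/oracle for the label
            self.spent += 1
            self.queues[y].append(x)  # store only the original example

    def training_batch(self, per_class, num_augmented):
        """Draw a balanced batch; synthetic data is generated on-the-fly here."""
        batch = []
        for y, q in enumerate(self.queues):
            for x in random.sample(list(q), min(per_class, len(q))):
                batch.append((x, y))
                # augmentations exist only inside this batch, never in memory
                batch.extend((self.augment_fn(x), y) for _ in range(num_augmented))
        random.shuffle(batch)
        return batch
```
Because `augment_fn` is called inside `training_batch`, synthetic examples live only for the duration of a batch, which corresponds to the first advantage claimed above; drawing more augmentations per stored example is what gives the model extra labelled data without a larger budget or memory.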
Related papers
- Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal [54.93261535899478]
In real-world applications, such as robotic control of reinforcement learning, the tasks are changing, and new tasks arise in a sequential order.
This situation poses the new challenge of plasticity-stability trade-off for training an agent who can adapt to task changes and retain acquired knowledge.
We propose a rehearsal-based continual diffusion model, called Continual diffuser (CoD), to endow the diffuser with the capabilities of quick adaptation (plasticity) and lasting retention (stability)
arXiv Detail & Related papers (2024-09-04T08:21:47Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes [100.69714600180895]
Offline Q-learning algorithms exhibit strong performance that scales with model capacity.
We train a single policy on 40 games with near-human performance using up to 80 million parameter networks.
Compared to return-conditioned supervised approaches, offline Q-learning scales similarly with model capacity and has better performance, especially when the dataset is suboptimal.
arXiv Detail & Related papers (2022-11-28T08:56:42Z)
- A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z)
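The exemplar-memory mechanism this entry refers to is, at its core, a small rehearsal buffer. The generic sketch below (all names invented; not the Memory Transformer Network itself) shows the idea of keeping a few examples per past class and mixing them into new-task batches.
```python
import random

class ExemplarMemory:
    """Generic rehearsal-buffer sketch: keep a few past examples per class and
    mix them into current-task batches to mitigate catastrophic forgetting."""

    def __init__(self, exemplars_per_class):
        self.exemplars_per_class = exemplars_per_class
        self.store = {}  # class id -> list of stored exemplars

    def add_task(self, examples_by_class):
        # After finishing a task, keep only a small subset of its data.
        for cls, xs in examples_by_class.items():
            k = min(self.exemplars_per_class, len(xs))
            self.store[cls] = random.sample(xs, k)

    def rehearsal_batch(self, current_batch, n_old):
        # Mix stored exemplars of previously seen classes into the new batch.
        old = [(x, cls) for cls, xs in self.store.items() for x in xs]
        return current_batch + random.sample(old, min(n_old, len(old)))
```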
- Nonstationary data stream classification with online active learning and siamese neural networks [11.501721946030779]
There is an emerging need for online learning methods that train predictive models on-the-fly.
A series of open challenges, however, hinder their deployment in practice.
We propose the ActiSiamese algorithm, which addresses these challenges by combining online active learning, siamese networks, and a multi-queue memory.
arXiv Detail & Related papers (2022-10-03T17:16:03Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first approach for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
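The "closed-form updates" mentioned in that entry admit a simple first-order illustration: shift the parameters by the difference between the gradients of the corrected data and the gradients of the data to be unlearned. The sketch below is one common first-order variant written under that assumption, not necessarily the paper's exact (e.g. inverse-Hessian, influence-function-weighted) update; `grad_fn` is a placeholder we introduce for illustration.
```python
import numpy as np

def first_order_unlearn(theta, grad_fn, affected_old, affected_new, tau=1.0):
    """Hedged sketch of a closed-form, gradient-difference parameter update in
    the spirit of influence-function-based unlearning.
    grad_fn(theta, z) returns the loss gradient at a single example z."""
    # Step towards the corrected data and away from the data being unlearned,
    # as if the affected points had been replaced by their corrected versions.
    delta = (sum(grad_fn(theta, z) for z in affected_new)
             - sum(grad_fn(theta, z) for z in affected_old))
    return theta - tau * np.asarray(delta)
```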
- Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data [101.6195176510611]
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online.
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
arXiv Detail & Related papers (2021-08-20T06:17:20Z)
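The test-then-train protocol described in that entry can be written as a short loop. The sketch below is a generic prequential-evaluation skeleton; the `stream`, `train_step`, and `evaluate` callables are placeholders, not the benchmark's API.
```python
def prequential_evaluation(stream, model, train_step, evaluate):
    """Test-then-train: every incoming batch is first used for evaluation,
    then added to the data the model learns from (generic sketch)."""
    history = []
    for x_batch, y_batch in stream:                        # batches arrive in order, once
        history.append(evaluate(model, x_batch, y_batch))  # test first
        train_step(model, x_batch, y_batch)                # then train on the same batch
    return history
```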
- Data-efficient Online Classification with Siamese Networks and Active Learning [11.501721946030779]
We investigate learning from limited labelled, nonstationary and imbalanced data in online classification.
We propose a learning method that synergistically combines siamese neural networks and active learning.
Our study shows that the proposed method is robust to data nonstationarity and imbalance, and significantly outperforms baselines and state-of-the-art algorithms in terms of both learning speed and performance.
arXiv Detail & Related papers (2020-10-04T19:07:19Z)
- Continual Prototype Evolution: Learning Online from Non-Stationary Data Streams [42.525141660788]
We introduce a system to enable learning and prediction at any point in time.
In contrast to the major body of work in continual learning, data streams are processed in an online fashion.
We obtain state-of-the-art performance by a significant margin on eight benchmarks, including three highly imbalanced data streams.
arXiv Detail & Related papers (2020-09-02T09:39:26Z)
- A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks [1.1802674324027231]
Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art results using unlabeled data.
We provide an overview of the taxonomy for self-supervised learning and transfer learning, and highlight some prominent methods for designing pre-training tasks across different domains.
arXiv Detail & Related papers (2020-07-01T22:55:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.