Related papers: Harmonized Gradient Descent for Class Imbalanced Data Stream Online Learning

Harmonized Gradient Descent for Class Imbalanced Data Stream Online Learning

URL: http://arxiv.org/abs/2508.11353v1
Date: Fri, 15 Aug 2025 09:35:13 GMT
Title: Harmonized Gradient Descent for Class Imbalanced Data Stream Online Learning
Authors: Han Zhou, Hongpeng Yin, Xuanhong Deng, Yuyu Huang, Hao Ren,
Abstract summary: We introduce the harmonized gradient descent (HGD) algorithm, which aims to equalize the norms of gradients across different classes.<n>By ensuring the gradient norm balance, HGD mitigates under-fitting for minor classes and achieves balanced online learning.
Score: 10.398611591652031
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Many real-world data are sequentially collected over time and often exhibit skewed class distributions, resulting in imbalanced data streams. While existing approaches have explored several strategies, such as resampling and reweighting, for imbalanced data stream learning, our work distinguishes itself by addressing the imbalance problem through training modification, particularly focusing on gradient descent techniques. We introduce the harmonized gradient descent (HGD) algorithm, which aims to equalize the norms of gradients across different classes. By ensuring the gradient norm balance, HGD mitigates under-fitting for minor classes and achieves balanced online learning. Notably, HGD operates in a streamlined implementation process, requiring no data-buffer, extra parameters, or prior knowledge, making it applicable to any learning models utilizing gradient descent for optimization. Theoretical analysis, based on a few common and mild assumptions, shows that HGD achieves a satisfied sub-linear regret bound. The proposed algorithm are compared with the commonly used online imbalance learning methods under several imbalanced data stream scenarios. Extensive experimental evaluations demonstrate the efficiency and effectiveness of HGD in learning imbalanced data streams.

Related papers

GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning [55.03441672267886]
We propose GradAlign, a gradient-aligned data selection method for reinforcement learning.<n>We evaluate GradAlign across three data regimes: unreliable reward signals, distribution imbalance, and low-utility training corpus.
arXiv Detail & Related papers (2026-02-25T01:54:50Z)
Online-BLS: An Accurate and Efficient Online Broad Learning System for Data Stream Classification [52.251569042852815]
We introduce an online broad learning system framework with closed-form solutions for each online update.<n>We design an effective weight estimation algorithm and an efficient online updating strategy.<n>Our framework is naturally extended to data stream scenarios with concept drift and exceeds state-of-the-art baselines.
arXiv Detail & Related papers (2025-01-28T13:21:59Z)
An Effective Dynamic Gradient Calibration Method for Continual Learning [11.555822066922508]
Continual learning (CL) is a fundamental topic in machine learning, where the goal is to train a model with continuously incoming data and tasks. Due to the memory limit, we cannot store all the historical data, and therefore confront the catastrophic forgetting'' problem. We develop an effective algorithm to calibrate the gradient in each updating step of the model.
arXiv Detail & Related papers (2024-07-30T16:30:09Z)
On Improving the Algorithm-, Model-, and Data- Efficiency of Self-Supervised Learning [18.318758111829386]
We propose an efficient single-branch SSL method based on non-parametric instance discrimination. We also propose a novel self-distillation loss that minimizes the KL divergence between the probability distribution and its square root version.
arXiv Detail & Related papers (2024-04-30T06:39:04Z)
Gradient Reweighting: Towards Imbalanced Class-Incremental Learning [8.438092346233054]
Class-Incremental Learning (CIL) trains a model to continually recognize new classes from non-stationary data. A major challenge of CIL arises when applying to real-world data characterized by non-uniform distribution. We show that this dual imbalance issue causes skewed gradient updates with biased weights in FC layers, thus inducing over/under-fitting and catastrophic forgetting in CIL.
arXiv Detail & Related papers (2024-02-28T18:08:03Z)
Simplifying Neural Network Training Under Class Imbalance [77.39968702907817]
Real-world datasets are often highly class-imbalanced, which can adversely impact the performance of deep learning models. The majority of research on training neural networks under class imbalance has focused on specialized loss functions, sampling techniques, or two-stage training procedures. We demonstrate that simply tuning existing components of standard deep learning pipelines, such as the batch size, data augmentation, and label smoothing, can achieve state-of-the-art performance without any such specialized class imbalance methods.
arXiv Detail & Related papers (2023-12-05T05:52:44Z)
An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning [103.65758569417702]
Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance. We consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data. We study a simple yet overlooked baseline -- SimiS -- which tackles data imbalance by simply supplementing labeled data with pseudo-labels.
arXiv Detail & Related papers (2022-11-20T21:18:41Z)
A Theoretical Analysis of the Learning Dynamics under Class Imbalance [0.10231119246773925]
We show that the learning curves for minority and majority classes follow sub-optimal trajectories when training with a gradient-based trajectory. This slowdown is related to the imbalance ratio and can be traced back to a competition between the optimization of different classes. We find that GD is not guaranteed to decrease the loss for each class but that this problem can be addressed by performing a per-class normalization of the gradient.
arXiv Detail & Related papers (2022-07-01T12:54:38Z)
CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance. Sample re-weighting methods are popularly used to alleviate this data bias issue. We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
Bilevel Online Deep Learning in Non-stationary Environment [4.565872584112864]
Bilevel Online Deep Learning (BODL) framework combines bilevel optimization strategy and online ensemble classifier. When the concept drift is detected, our BODL algorithm can adaptively update the model parameters via bilevel optimization and then circumvent the large drift and encourage positive transfer.
arXiv Detail & Related papers (2022-01-25T11:05:51Z)
Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning. Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch. ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.