Analytic Incremental Learning For Sound Source Localization With Imbalance Rectification
- URL: http://arxiv.org/abs/2601.18335v1
- Date: Mon, 26 Jan 2026 10:22:01 GMT
- Title: Analytic Incremental Learning For Sound Source Localization With Imbalance Rectification
- Authors: Zexia Fan, Yu Chen, Qiquan Zhang, Kainan Chen, Xinyuan Qian,
- Abstract summary: Sound source localization (SSL) demonstrates remarkable results in controlled settings but struggles in real-world deployment.<n>We propose a unified framework with two key innovations to mitigate these issues.<n>On the SSLR benchmark, our proposal achieves state-of-the-art (SoTA) results of 89.0% accuracy, 5.3 mean absolute error, and 1.6 backward transfer robustness.
- Score: 15.161051314343105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sound source localization (SSL) demonstrates remarkable results in controlled settings but struggles in real-world deployment due to dual imbalance challenges: intra-task imbalance arising from long-tailed direction-of-arrival (DoA) distributions, and inter-task imbalance induced by cross-task skews and overlaps. These often lead to catastrophic forgetting, significantly degrading the localization accuracy. To mitigate these issues, we propose a unified framework with two key innovations. Specifically, we design a GCC-PHAT-based data augmentation (GDA) method that leverages peak characteristics to alleviate intra-task distribution skews. We also propose an Analytic dynamic imbalance rectifier (ADIR) with task-adaption regularization, which enables analytic updates that adapt to inter-task dynamics. On the SSLR benchmark, our proposal achieves state-of-the-art (SoTA) results of 89.0% accuracy, 5.3° mean absolute error, and 1.6 backward transfer, demonstrating robustness to evolving imbalances without exemplar storage.
Related papers
- A Replicate-and-Quantize Strategy for Plug-and-Play Load Balancing of Sparse Mixture-of-Experts LLMs [64.8510381475827]
Sparse Mixture-of-Experts (SMoE) architectures are increasingly used to scale large language models efficiently.<n>SMoE models often suffer from severe load imbalance across experts, where a small subset of experts receives most tokens while others are underutilized.<n>We present a systematic analysis of expert routing during inference and identify three findings: (i) load imbalance persists and worsens with larger batch sizes, (ii) selection frequency does not reliably reflect expert importance, and (iii) overall expert workload and importance can be estimated using a small calibration set.
arXiv Detail & Related papers (2026-02-23T15:11:16Z) - Generalizing GNNs with Tokenized Mixture of Experts [75.8310720413187]
We show that improving stability requires reducing reliance on shift-sensitive features, leaving an irreducible worst-case generalization floor.<n>We propose STEM-GNN, a pretrain-then-finetune framework with a mixture-of-experts encoder for diverse computation paths.<n>Across nine node, link, and graph benchmarks, STEM-GNN achieves a stronger three-way balance, improving robustness to degree/homophily shifts and to feature/edge corruptions while remaining competitive on clean graphs.
arXiv Detail & Related papers (2026-02-09T22:48:30Z) - Deep Time-series Forecasting Needs Kernelized Moment Balancing [56.619037429652984]
Deep time-series forecasting can be formulated as a distribution balancing problem aimed at aligning the distribution of the forecasts and ground truths.<n>We propose direct forecasting with kernelized moment balancing (KMB-DF)<n>Experiments across multiple models and datasets show that KMB-DF consistently improves forecasting accuracy and achieves state-of-the-art performance.
arXiv Detail & Related papers (2026-01-31T13:20:18Z) - Quantifying Multimodal Imbalance: A GMM-Guided Adaptive Loss for Audio-Visual Learning [12.236332735708473]
Existing solutions mainly focus on optimization- or data-based strategies but rarely exploit the information inherent in multimodal imbalance.<n>We propose a novel quantitative analysis framework for Multimodal Imbalance and design a sample-level adaptive loss function.
arXiv Detail & Related papers (2025-10-20T15:42:43Z) - Dual-granularity Sinkhorn Distillation for Enhanced Learning from Long-tailed Noisy Data [67.25796812343454]
Real-world datasets for deep learning frequently suffer from the co-occurring challenges of class imbalance and label noise.<n>We propose Dual-granularity Sinkhorn Distillation (D-SINK), a novel framework that enhances dual robustness by distilling and integrating complementary insights.<n>Experiments on benchmark datasets demonstrate that D-SINK significantly improves robustness and achieves strong empirical performance in learning from long-tailed noisy data.
arXiv Detail & Related papers (2025-10-09T13:05:27Z) - Sycophancy Mitigation Through Reinforcement Learning with Uncertainty-Aware Adaptive Reasoning Trajectories [58.988535279557546]
We introduce textbf sycophancy Mitigation through Adaptive Reasoning Trajectories.<n>We show that SMART significantly reduces sycophantic behavior while preserving strong performance on out-of-distribution inputs.
arXiv Detail & Related papers (2025-09-20T17:09:14Z) - TOAST: Task-Oriented Adaptive Semantic Transmission over Dynamic Wireless Environments [3.3107717550009865]
TOAST (Task-Oriented Adaptive Semantic Transmission) is a unified framework designed to address the core challenge of multi-task optimization in wireless environments.<n>We formulate adaptive task balancing as a Markov decision process, employing deep reinforcement learning to dynamically adjust the trade-off between image reconstruction fidelity and semantic classification accuracy.<n>We integrate module-specific Low-Rank Adaptation (LoRA) mechanisms throughout our Swin Transformer-based joint source-channel coding architecture.
arXiv Detail & Related papers (2025-06-27T04:36:30Z) - Reliable Few-shot Learning under Dual Noises [166.53173694689693]
We propose DEnoised Task Adaptation (DETA++) for reliable few-shot learning.<n>DETA++ employs a memory bank to store and refine clean regions for each inner-task class, based on which a Local Nearestid (LocalNCC) is devised to yield noise-robust predictions on query samples.<n>Extensive experiments demonstrate the effectiveness and flexibility of DETA++.
arXiv Detail & Related papers (2025-06-19T14:05:57Z) - Controlled Data Rebalancing in Multi-Task Learning for Real-World Image Super-Resolution [51.79973519845773]
Real-world image super-resolution (Real-SR) is a challenging problem due to the complex degradation patterns in low-resolution images.<n>We propose an improved paradigm that frames Real-SR as a data-heterogeneous multi-task learning problem.
arXiv Detail & Related papers (2025-06-05T21:40:21Z) - Error Distribution Smoothing:Advancing Low-Dimensional Imbalanced Regression [2.435853975142516]
In real-world regression tasks, datasets frequently exhibit imbalanced distributions, characterized by a scarcity of data in high-complexity regions and an abundance in low-complexity areas.<n>We introduce a novel concept of Imbalanced Regression, which takes into account both the complexity of the problem and the density of data points, extending beyond traditional definitions that focus only on data density.<n>We propose Error Distribution Smoothing (EDS) as a solution to tackle imbalanced regression, effectively selecting a representative subset from the dataset to reduce redundancy while maintaining balance and representativeness.
arXiv Detail & Related papers (2025-02-04T12:40:07Z) - Energy Score-based Pseudo-Label Filtering and Adaptive Loss for Imbalanced Semi-supervised SAR target recognition [1.2035771704626825]
Existing semi-supervised SAR ATR algorithms show low recognition accuracy in the case of class imbalance.
This work offers a non-balanced semi-supervised SAR target recognition approach using dynamic energy scores and adaptive loss.
arXiv Detail & Related papers (2024-11-06T14:45:16Z) - Causal Balancing for Domain Generalization [95.97046583437145]
We propose a balanced mini-batch sampling strategy to reduce the domain-specific spurious correlations in observed training distributions.
We provide an identifiability guarantee of the source of spuriousness and show that our proposed approach provably samples from a balanced, spurious-free distribution.
arXiv Detail & Related papers (2022-06-10T17:59:11Z) - Towards Overcoming False Positives in Visual Relationship Detection [95.15011997876606]
We investigate the cause of the high false positive rate in Visual Relationship Detection (VRD)
This paper presents Spatially-Aware Balanced negative pRoposal sAmpling (SABRA) as a robust VRD framework that alleviates the influence of false positives.
arXiv Detail & Related papers (2020-12-23T06:28:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.