Related papers: DUSE: A Data Expansion Framework for Low-resource Automatic Modulation Recognition based on Active Learning

URL: http://arxiv.org/abs/2507.12011v1
Date: Wed, 16 Jul 2025 08:09:41 GMT
Title: DUSE: A Data Expansion Framework for Low-resource Automatic Modulation Recognition based on Active Learning
Authors: Yao Lu, Hongyu Gao, Zhuangzhi Chen, Dongwei Xu, Yun Lin, Qi Xuan, Guan Gui
Abstract summary: We introduce a data expansion framework called Dynamic Uncertainty-driven Sample Expansion (DUSE). DUSE uses an uncertainty scoring function to filter out useful samples from relevant AMR datasets. Experiments demonstrate that DUSE consistently outperforms 8 coreset selection baselines in both class-balance and class-imbalance settings.
Score: 17.651073556023167
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although deep neural networks have made remarkable achievements in the field of automatic modulation recognition (AMR), these models often require a large amount of labeled data for training. However, in many practical scenarios, the available target domain data is scarce and difficult to meet the needs of model training. The most direct way is to collect data manually and perform expert annotation, but the high time and labor costs are unbearable. Another common method is data augmentation. Although it can enrich training samples to a certain extent, it does not introduce new data and therefore cannot fundamentally solve the problem of data scarcity. To address these challenges, we introduce a data expansion framework called Dynamic Uncertainty-driven Sample Expansion (DUSE). Specifically, DUSE uses an uncertainty scoring function to filter out useful samples from relevant AMR datasets and employs an active learning strategy to continuously refine the scorer. Extensive experiments demonstrate that DUSE consistently outperforms 8 coreset selection baselines in both class-balance and class-imbalance settings. Besides, DUSE exhibits strong cross-architecture generalization for unseen models.
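
The abstract does not spell out DUSE's scoring function, so the following is only a minimal sketch of uncertainty-driven sample selection in general, not the authors' method. It assumes predictive entropy as the uncertainty score; the names `predictive_entropy`, `select_most_uncertain`, `budget`, and the toy probabilities are all hypothetical.

```python
import numpy as np


def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of each row of class probabilities (higher = more uncertain)."""
    eps = 1e-12  # guards against log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)


def select_most_uncertain(probs: np.ndarray, budget: int) -> np.ndarray:
    """Return the indices of the `budget` candidates with the highest entropy."""
    scores = predictive_entropy(probs)
    # argsort is ascending, so reverse it to rank from most to least uncertain
    return np.argsort(scores, kind="stable")[::-1][:budget]


# Toy candidate pool: 4 samples, 3 modulation classes (made-up probabilities,
# standing in for the softmax outputs of a scorer model on auxiliary data).
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])

selected = select_most_uncertain(pool_probs, budget=2)
print(selected)
```

In an active learning loop of the kind the abstract mentions, the selected samples would be added to the training set, the scorer retrained, and the selection repeated on the remaining pool.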

Related papers

A Differentiable Adversarial Framework for Task-Aware Data Subsampling [0.5371337604556311]
We introduce the antagonistic soft selection subsampling (ASSS) framework as a novel paradigm that reconstructs data reduction into a differentiable end-to-end learning problem. This work establishes task-aware data subsampling as a learnable component, providing a principled solution for effective large-scale data learning.
arXiv Detail & Related papers (2026-01-05T13:10:09Z)
Forgetting-MarI: LLM Unlearning via Marginal Information Regularization [6.979586479353831]
Existing unlearning methods often degrade model performance by removing more information than necessary when attempting to "forget" specific data. We introduce Forgetting-MarI, an LLM unlearning framework that provably removes only the additional (marginal) information contributed by the data to be unlearned. By penalizing marginal information, our method yields an explicit upper bound on the unlearn dataset's residual influence in the trained models, providing provable undetectability.
arXiv Detail & Related papers (2025-11-14T22:48:39Z)
Does This Look Familiar to You? Knowledge Analysis via Model Internal Representations [0.0]
There is no clearly established methodology for effective training data selection. Knowledge Analysis via Model Internal Representations (KAMIR) is a novel approach that overcomes these limitations. It can be applied to a wide range of tasks such as machine reading comprehension and summarization.
arXiv Detail & Related papers (2025-09-09T01:08:15Z)
SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models [51.74498855100541]
Large language models (LLMs) have shown strong reasoning capabilities when fine-tuned with reinforcement learning (RL). We propose SPaRFT, a self-paced learning framework that enables efficient learning based on the capability of the model being trained.
arXiv Detail & Related papers (2025-08-07T03:50:48Z)
PEAKS: Selecting Key Training Examples Incrementally via Prediction Error Anchored by Kernel Similarity [6.6157730528755065]
We study the Incremental Data Selection (IDS) problem, where examples arrive as a continuous stream and need to be selected without access to the full data source. We propose PEAKS, an efficient data selection method tailored for IDS. Our comprehensive evaluations demonstrate that PEAKS consistently outperforms existing selection strategies.
arXiv Detail & Related papers (2025-04-07T16:42:09Z)
Propensity-driven Uncertainty Learning for Sample Exploration in Source-Free Active Domain Adaptation [19.620523416385346]
Source-free active domain adaptation (SFADA) addresses the challenge of adapting a pre-trained model to new domains without access to source data. This scenario is particularly relevant in real-world applications where data privacy, storage limitations, or labeling costs are significant concerns. We propose the Propensity-driven Uncertainty Learning (ProULearn) framework to effectively select more informative samples without frequently requesting human annotations.
arXiv Detail & Related papers (2025-01-23T10:05:25Z)
Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data [54.934578742209716]
In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets. LLKD is an adaptive sample selection method that incorporates signals from both the teacher and student. Our comprehensive experiments show that LLKD achieves superior performance across various datasets with higher data efficiency.
arXiv Detail & Related papers (2024-11-12T18:57:59Z)
SUDS: A Strategy for Unsupervised Drift Sampling [0.5437605013181142]
Supervised machine learning encounters concept drift, where the data distribution changes over time, degrading performance. We present the Strategy for Drift Sampling (SUDS), a novel method that selects homogeneous samples for retraining using existing drift detection algorithms. Our results demonstrate the efficacy of SUDS in optimizing labeled data use in dynamic environments.
arXiv Detail & Related papers (2024-11-05T10:55:29Z)
Test-Time Adaptation for Combating Missing Modalities in Egocentric Videos [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues. We propose a novel approach to address this issue at test time without requiring retraining. MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box. This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
SRoUDA: Meta Self-training for Robust Unsupervised Domain Adaptation [25.939292305808934]
Unsupervised domain adaptation (UDA) can transfer knowledge learned from a label-rich dataset to an unlabeled target dataset. In this paper, we present a new meta self-training pipeline, named SRoUDA, for improving the adversarial robustness of UDA models.
arXiv Detail & Related papers (2022-12-12T14:25:40Z)
Frugal Reinforcement-based Active Learning [12.18340575383456]
We propose a novel active learning approach for label-efficient training. The proposed method is iterative and aims at minimizing a constrained objective function that mixes diversity, representativity and uncertainty criteria. We also introduce a novel weighting mechanism based on reinforcement learning, which adaptively balances these criteria at each training iteration.
arXiv Detail & Related papers (2022-12-09T14:17:45Z)
Hyperparameter-free Continuous Learning for Domain Classification in Natural Language Understanding [60.226644697970116]
Domain classification is the fundamental task in natural language understanding (NLU). Most existing continual learning approaches suffer from low accuracy and performance fluctuation. We propose a hyperparameter-free continual learning model for text data that can stably produce high performance under various environments.
arXiv Detail & Related papers (2022-01-05T02:46:16Z)
Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training. We experimentally verify that the new dataset can significantly improve the ability of the learned FER model. To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.