Related papers: Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning

URL: http://arxiv.org/abs/2507.12750v1
Date: Thu, 17 Jul 2025 03:08:26 GMT
Title: Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning
Authors: Suorong Yang, Peijia Li, Yujie Liu, Zhiming Xu, Peng Ye, Wanli Ouyang, Furao Shen, Dongzhan Zhou
Abstract summary: We introduce a dynamic dataset pruning framework that adaptively selects training samples based on task-driven difficulty and cross-modality semantic consistency. Our work highlights the potential of integrating cross-modality alignment for robust sample selection, advancing data-centric learning toward more efficient and robust practices across application domains.
Score: 49.10890099624699
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modern deep models are trained on large real-world datasets, where data quality varies and redundancy is common. Data-centric approaches such as dataset pruning have shown promise in improving training efficiency and model performance. However, most existing methods rely on static heuristics or task-specific metrics, limiting their robustness and generalizability across domains. In this work, we introduce a dynamic dataset pruning framework that adaptively selects training samples based on both task-driven difficulty and cross-modality semantic consistency. By incorporating supervision from pretrained multimodal foundation models, our approach captures training dynamics while effectively filtering out uninformative samples. Our work highlights the potential of integrating cross-modality alignment for robust sample selection, advancing data-centric learning toward more efficient and robust practices across application domains.

Related papers

IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment [29.703775936837012]
Large Language Models (LLMs) have achieved impressive performance through Supervised Fine-tuning (SFT) on diverse instructional datasets. When training on multiple capabilities simultaneously, the mixture training dataset, governed by volumes of data from different domains, is a critical factor that directly impacts the final model's performance. We introduce an innovative data equilibrium framework designed to effectively optimize volumes of data from different domains within mixture SFT datasets.
arXiv Detail & Related papers (2025-05-19T06:42:44Z)
A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, imposing a negative impact on training efficiency and model performance. Data selection has shown promise in identifying the most representative samples from the entire dataset. We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
arXiv Detail & Related papers (2024-10-15T03:00:58Z)
Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs [48.406728896785296]
We propose a novel approach to automatically construct a unified label space across multiple datasets using graph neural networks. Unlike existing methods, our approach facilitates seamless training without the need for additional manual reannotation or taxonomy reconciliation.
arXiv Detail & Related papers (2024-07-15T08:42:10Z)
Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data. For the first time, we reveal two major challenges hindering its practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP). ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective. We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem. Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z)
Learning Sequential Latent Variable Models from Multimodal Time Series Data [6.107812768939553]
We present a self-supervised generative modelling framework to jointly learn a probabilistic latent state representation of multimodal data. We demonstrate that our approach leads to significant improvements in prediction and representation quality.
arXiv Detail & Related papers (2022-04-21T21:59:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
