Finding the Muses: Identifying Coresets through Loss Trajectories
- URL: http://arxiv.org/abs/2503.09721v1
- Date: Wed, 12 Mar 2025 18:11:16 GMT
- Title: Finding the Muses: Identifying Coresets through Loss Trajectories
- Authors: Manish Nagaraj, Deepak Ravikumar, Efstathia Soufleri, Kaushik Roy
- Abstract summary: Loss Trajectory Correlation (LTC) is a novel metric for coreset selection that identifies critical training samples driving generalization. LTC consistently achieves accuracy on par with or surpassing state-of-the-art coreset selection methods. It also offers insights into training dynamics, such as identifying aligned and conflicting sample behaviors.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models achieve state-of-the-art performance across domains but face scalability challenges in real-time or resource-constrained scenarios. To address this, we propose Loss Trajectory Correlation (LTC), a novel metric for coreset selection that identifies critical training samples driving generalization. LTC quantifies the alignment between training sample loss trajectories and validation set loss trajectories, enabling the construction of compact, representative subsets. Unlike traditional methods, whose computational and storage overheads are infeasible to scale to large datasets, LTC achieves superior efficiency because it can be computed as a byproduct of training. Our results on CIFAR-100 and ImageNet-1k show that LTC consistently achieves accuracy on par with or surpassing state-of-the-art coreset selection methods, with any differences remaining under 1%. LTC also transfers effectively across architectures, including ResNet, VGG, DenseNet, and Swin Transformer, with minimal performance degradation (<2%). Additionally, LTC offers insights into training dynamics, such as identifying aligned and conflicting sample behaviors, at a fraction of the computational cost of traditional methods. This framework paves the way for scalable coreset selection and efficient dataset optimization.
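The core mechanism lends itself to a compact implementation. Below is a minimal sketch of correlation-based coreset selection in the spirit of LTC; the use of Pearson correlation against the mean validation trajectory, the top-k selection rule, and all names and shapes are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def ltc_scores(train_loss_traj, val_loss_traj):
    """Score each training sample by the Pearson correlation between its
    loss trajectory and the mean validation loss trajectory.

    train_loss_traj: (n_train, n_epochs) per-sample losses logged during training
    val_loss_traj:   (n_val, n_epochs) per-sample validation losses
    """
    v = val_loss_traj.mean(axis=0)                    # aggregate validation trajectory
    t_c = train_loss_traj - train_loss_traj.mean(axis=1, keepdims=True)
    v_c = v - v.mean()
    denom = np.linalg.norm(t_c, axis=1) * np.linalg.norm(v_c) + 1e-12
    return (t_c @ v_c) / denom                        # one score per training sample

def select_coreset(scores, fraction=0.1):
    """Keep the samples whose trajectories align best with validation."""
    k = max(1, int(len(scores) * fraction))
    return np.argsort(scores)[-k:]

# toy usage: trajectories would normally be a free byproduct of training
rng = np.random.default_rng(0)
coreset = select_coreset(ltc_scores(rng.random((1000, 50)), rng.random((200, 50))))
```

Because the per-sample losses are already logged during an ordinary training run, such scores add essentially no overhead, which is consistent with the abstract's byproduct-of-training claim.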
Related papers
- Model-agnostic Coreset Selection via LLM-based Concept Bottlenecks [6.857632954159568]
Coreset Selection (CS) identifies a subset of training data that achieves model performance comparable to using the entire dataset. Existing difficulty scores are inefficient to compute and hard to interpret, as they do not indicate whether a sample is hard to learn in general or only for a specific model. Our work proposes an interpretable score that gauges a sample's difficulty using human-understandable textual attributes (concepts), independent of any downstream model.
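As a rough illustration of a model-agnostic, concept-based difficulty score, the sketch below rates samples by the rarity of their annotated textual concepts; the rarity-based scoring rule and the binary concept matrix are assumptions for illustration, not the paper's method.

```python
import numpy as np

def concept_difficulty(concept_matrix):
    """Toy interpretable difficulty score: samples described by rare
    textual concepts are treated as harder to learn.

    concept_matrix: (n_samples, n_concepts) binary matrix; entry (i, j) is 1
    if concept j (e.g. "striped", "metallic") applies to sample i.
    """
    concept_freq = concept_matrix.mean(axis=0)        # how common each concept is
    rarity = -np.log(concept_freq + 1e-12)            # rare concept -> high weight
    per_sample = concept_matrix @ rarity              # total rarity of a sample
    counts = concept_matrix.sum(axis=1).clip(min=1)
    return per_sample / counts                        # mean rarity per concept

scores = concept_difficulty(np.random.default_rng(1).integers(0, 2, (500, 40)))
```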
arXiv Detail & Related papers (2025-02-23T22:14:42Z) - Adaptive Dataset Quantization [2.0105434963031463]
We introduce a versatile framework for dataset compression, namely Adaptive Dataset Quantization (ADQ).
We propose a novel adaptive sampling strategy based on each generated bin's representativeness, diversity, and importance scores.
Our method not only exhibits superior generalization across different architectures, but also attains state-of-the-art results, surpassing DQ by 3% on average on various datasets.
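A hedged sketch of what combining the three bin-level scores might look like; the score definitions, z-normalization, and softmax allocation below are illustrative guesses, not ADQ's actual formulas.

```python
import numpy as np

def adaptive_bin_sampling(bin_features, budget, weights=(1.0, 1.0, 1.0)):
    """Allocate a sampling budget over dataset bins using three toy scores.

    bin_features: list of (n_i, d) feature arrays, one per generated bin.
    """
    centers = np.stack([b.mean(axis=0) for b in bin_features])
    rep = -np.linalg.norm(centers - centers.mean(axis=0), axis=1)  # representativeness
    div = np.array([b.std() for b in bin_features])                # diversity (spread)
    imp = np.array([len(b) for b in bin_features], dtype=float)    # importance (size)
    z = lambda s: (s - s.mean()) / (s.std() + 1e-12)               # put scores on one scale
    total = weights[0] * z(rep) + weights[1] * z(div) + weights[2] * z(imp)
    probs = np.exp(total) / np.exp(total).sum()                    # softmax allocation
    return np.round(probs * budget).astype(int)                    # samples drawn per bin

bins = [np.random.default_rng(i).random((20 + 5 * i, 16)) for i in range(4)]
per_bin_counts = adaptive_bin_sampling(bins, budget=40)
```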
arXiv Detail & Related papers (2024-12-22T07:08:29Z) - Towards Sustainable Learning: Coresets for Data-efficient Deep Learning [9.51481812606879]
CREST is the first scalable subset selection framework for deep networks with rigorous theoretical guarantees, validated by experiments on multiple datasets.
CREST identifies the most valuable training examples of the non-convex loss function.
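One common way to realize such guarantees is to greedily pick samples whose gradients best approximate the full-dataset gradient; the toy routine below illustrates that general idea and is not CREST's actual algorithm.

```python
import numpy as np

def greedy_gradient_matching(per_sample_grads, k):
    """Toy coreset selection: greedily pick samples whose mean gradient
    best approximates the full-dataset gradient."""
    full = per_sample_grads.mean(axis=0)
    chosen, running_sum = [], np.zeros_like(full)
    for _ in range(k):
        approx = running_sum / len(chosen) if chosen else np.zeros_like(full)
        scores = per_sample_grads @ (full - approx)   # reduce the residual most
        scores[chosen] = -np.inf                      # never pick a sample twice
        best = int(np.argmax(scores))
        chosen.append(best)
        running_sum += per_sample_grads[best]
    return chosen

grads = np.random.default_rng(2).standard_normal((200, 64))
subset = greedy_gradient_matching(grads, k=20)
```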
arXiv Detail & Related papers (2023-06-02T02:51:08Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights by a small amount proportional to the magnitude scale on-the-fly.
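The soft-shrinkage idea can be sketched in a few lines of PyTorch: rather than hard-zeroing pruned weights, weights below a magnitude percentile are decayed slightly each step so they can still recover later. The percentile, decay factor, and function name below are illustrative assumptions.

```python
import torch

def soft_shrink_step(weight, prune_pct=0.5, shrink=0.01):
    """One on-the-fly soft-shrinkage step: weights below the magnitude
    percentile are decayed slightly instead of being hard-zeroed."""
    with torch.no_grad():
        magnitude = weight.abs()
        threshold = torch.quantile(magnitude.flatten(), prune_pct)
        unimportant = magnitude < threshold
        weight[unimportant] *= 1.0 - shrink   # decay proportional to magnitude scale
    return unimportant                        # the current (soft) sparsity mask

layer = torch.nn.Linear(64, 64)
mask = soft_shrink_step(layer.weight)         # call once per training iteration
```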
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
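A minimal sketch of a layer that can run dense or gated from the same weights; the first-k channel gating rule and the widths used are assumptions for illustration.

```python
import torch
import torch.nn as nn

class GatedLinear(nn.Module):
    """Dense layer that can also run as a narrower sub-network by gating
    a fraction of its output channels from the same weights."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.fc = nn.Linear(d_in, d_out)

    def forward(self, x, width=1.0):
        out = self.fc(x)
        keep = max(1, int(out.shape[-1] * width))
        mask = torch.zeros_like(out)
        mask[..., :keep] = 1.0                # gate: keep only the first k channels
        return out * mask

layer = GatedLinear(32, 64)
x = torch.randn(8, 32)
dense_out = layer(x, width=1.0)               # full encoder path
gated_out = layer(x, width=0.5)               # efficient sub-network, same weights
```

In an SSL setup, the same self-supervised loss would be applied to both the dense and gated forward passes so the two paths co-evolve during pre-training.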
arXiv Detail & Related papers (2023-01-22T17:12:58Z) - Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that weights trained on synthetic data are robust against accumulated-error perturbations when the optimization is regularized towards a flat trajectory.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
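Two toy helpers below illustrate the underlying quantities: the accumulated error between a trajectory trained on synthetic data and a matched expert trajectory, and a perturbation probe for trajectory flatness. The checkpoint matching, sigma, and trial counts are illustrative assumptions.

```python
import numpy as np

def accumulated_trajectory_error(student_ckpts, expert_ckpts):
    """Sum of distances between checkpoints of a trajectory trained on
    synthetic data and a matched expert trajectory trained on real data."""
    return sum(np.linalg.norm(s - e) for s, e in zip(student_ckpts, expert_ckpts))

def flatness_probe(loss_fn, weights, sigma=0.01, trials=8, seed=0):
    """A flat solution keeps its loss under small random weight
    perturbations; larger returned values indicate a sharper trajectory."""
    rng = np.random.default_rng(seed)
    base = loss_fn(weights)
    perturbed = [loss_fn(weights + sigma * rng.standard_normal(weights.shape))
                 for _ in range(trials)]
    return float(np.mean(perturbed) - base)

loss_fn = lambda w: float((w ** 2).sum())     # toy quadratic loss surface
sharpness = flatness_probe(loss_fn, np.ones(10))
```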
arXiv Detail & Related papers (2022-11-20T15:49:11Z) - Adaptive Second Order Coresets for Data-efficient Machine Learning [5.362258158646462]
Training machine learning models on massive datasets incurs substantial computational costs.
We propose AdaCore, which leverages second-order information to extract subsets of the training examples for efficient machine learning.
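A hedged sketch of curvature-aware scoring: per-sample gradients are preconditioned by a diagonal Hessian approximation before measuring alignment with the full gradient. The diagonal approximation and the alignment score are assumptions, not AdaCore's exact procedure.

```python
import numpy as np

def adacore_like_scores(per_sample_grads, hessian_diag):
    """Toy curvature-aware scoring: precondition per-sample gradients by a
    diagonal Hessian approximation, then measure alignment with the
    preconditioned full gradient."""
    precond = per_sample_grads / (hessian_diag + 1e-8)
    full = precond.mean(axis=0)
    return precond @ full / (np.linalg.norm(full) + 1e-12)

grads = np.random.default_rng(3).standard_normal((500, 32))
hess_diag = np.abs(np.random.default_rng(4).standard_normal(32))  # toy curvature
top50 = np.argsort(adacore_like_scores(grads, hess_diag))[-50:]
```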
arXiv Detail & Related papers (2022-07-28T05:43:09Z) - DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference [85.02494022662505]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference.
It integrates automated data slimming, which adaptively downsamples or drops input images and controls their contribution to the training loss, guided by each image's spatial complexity.
Experiments and ablation studies demonstrate that DANCE can achieve "all-win" towards efficient segmentation.
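The data-slimming side can be illustrated with a toy policy that maps a spatial-complexity proxy (mean image gradient magnitude) to a downsampling factor; the thresholds and factors below are made up for illustration.

```python
import numpy as np

def spatial_complexity(img):
    """Toy spatial-complexity proxy: mean image gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy).mean()

def slimming_policy(img):
    """Map complexity to a downsampling factor: flat images are
    downsampled aggressively, detailed ones kept near full resolution.
    The complexity score could also weight the sample's training loss."""
    c = spatial_complexity(img)
    factor = 4 if c < 10 else (2 if c < 30 else 1)   # illustrative thresholds
    return img[::factor, ::factor], c

img = np.random.default_rng(5).integers(0, 255, (64, 64))
slimmed, complexity = slimming_policy(img)
```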
arXiv Detail & Related papers (2021-07-16T04:58:58Z) - Layer Pruning on Demand with Intermediate CTC [50.509073206630994]
We present a training and pruning method for ASR based on the connectionist temporal classification (CTC) loss.
We show that a Transformer-CTC model can be pruned to various depths on demand, improving the real-time factor from 0.005 to 0.002 on GPU.
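A minimal sketch of an encoder with a CTC-style output head after every block, so any prefix of the layer stack can serve as the deployed model; the layer count, dimensions, and head placement are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DepthPrunableEncoder(nn.Module):
    """Encoder with a CTC-style output head after every block, so any
    prefix of the layer stack can serve as the deployed model."""
    def __init__(self, d_model=256, n_layers=6, vocab=32):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers))
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab) for _ in range(n_layers))

    def forward(self, x, depth=None):
        depth = depth or len(self.blocks)
        for block in self.blocks[:depth]:
            x = block(x)
        return self.heads[depth - 1](x).log_softmax(-1)  # CTC log-probabilities

model = DepthPrunableEncoder()
x = torch.randn(2, 50, 256)                   # (batch, frames, features)
full = model(x)                               # all six layers
fast = model(x, depth=3)                      # pruned on demand at inference
```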
arXiv Detail & Related papers (2021-06-17T02:40:18Z) - Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
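A toy GCN predictor in this spirit: node features encode candidate operations, the adjacency encodes the sub-network DAG, and the output regresses the expected accuracy. All dimensions and the two-hop architecture are assumptions.

```python
import torch
import torch.nn as nn

class TinyGCN(nn.Module):
    """Toy GCN accuracy predictor for weight-sharing NAS: node features
    encode operations, the adjacency encodes the sub-network DAG."""
    def __init__(self, d_feat=8, d_hid=32):
        super().__init__()
        self.w1 = nn.Linear(d_feat, d_hid)
        self.w2 = nn.Linear(d_hid, d_hid)
        self.out = nn.Linear(d_hid, 1)

    def forward(self, adj, feats):
        h = torch.relu(self.w1(adj @ feats))   # propagate over the graph once
        h = torch.relu(self.w2(adj @ h))       # and a second hop
        return self.out(h.mean(dim=0))         # graph-level accuracy estimate

adj = torch.eye(7) + 0.2 * torch.rand(7, 7).round()   # toy 7-node architecture graph
feats = torch.rand(7, 8)                               # toy per-node op encodings
predicted_accuracy = TinyGCN()(adj, feats)
```

Trained on (architecture, accuracy) pairs from sampled sub-networks, such a predictor can then rank unseen candidates, which is where the rank correlation claim applies.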
arXiv Detail & Related papers (2020-04-17T19:12:39Z)