Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching
- URL: http://arxiv.org/abs/2311.17950v3
- Date: Sun, 17 Mar 2024 03:56:13 GMT
- Title: Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching
- Authors: Shitong Shao, Zeyuan Yin, Muxin Zhou, Xindong Zhang, Zhiqiang Shen
- Abstract summary: Generalized Various Backbone and Statistical Matching (G-VBSM) is the first algorithm to obtain strong performance across both small-scale and large-scale datasets.
G-VBSM achieves a performance of 38.7% on CIFAR-100 with 128-width ConvNet, 47.6% on Tiny-ImageNet with ResNet18, and 31.4% on the full 224x224 ImageNet-1k with ResNet18.
- Score: 24.45182507244476
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The lightweight "local-match-global" matching introduced by SRe2L successfully creates a distilled dataset with comprehensive information on the full 224x224 ImageNet-1k. However, this one-sided approach is limited to a particular backbone, layer, and statistic, which limits how well the distilled dataset generalizes. We suggest that sufficient and diverse "local-match-global" matchings are more precise and effective than a single one, and can create a distilled dataset with richer information and better generalization. We call this perspective "generalized matching" and propose Generalized Various Backbone and Statistical Matching (G-VBSM) in this work, which aims to create a dense synthetic dataset that stays consistent with the complete dataset across various backbones, layers, and statistics. As experimentally demonstrated, G-VBSM is the first algorithm to obtain strong performance across both small-scale and large-scale datasets. Specifically, G-VBSM achieves 38.7% on CIFAR-100 with a 128-width ConvNet, 47.6% on Tiny-ImageNet with ResNet18, and 31.4% on the full 224x224 ImageNet-1k with ResNet18, under images per class (IPC) of 10, 50, and 10, respectively. These results surpass all SOTA methods by margins of 3.9%, 6.5%, and 10.1%, respectively.
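To make the "generalized matching" idea concrete, below is a minimal, hypothetical PyTorch sketch of the statistical-matching ingredient: a synthetic batch is optimized so that its per-layer feature statistics match the BatchNorm running statistics of several pretrained backbones at once. This illustrates the principle under stated assumptions and is not the authors' implementation; the function names, backbone choices, and hyperparameters are all assumptions.

```python
# Hypothetical sketch of "generalized statistical matching": optimize a
# synthetic batch so that its per-layer feature statistics match the
# BatchNorm running statistics stored in SEVERAL pretrained backbones,
# not just one. Illustrative only; not the G-VBSM reference code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

def bn_matching_loss(backbone: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Sum of mean/variance mismatches at every BatchNorm input."""
    losses, hooks = [], []

    def make_hook(bn: nn.BatchNorm2d):
        def hook(module, inputs, output):
            feat = inputs[0]
            mean = feat.mean(dim=(0, 2, 3))
            var = feat.var(dim=(0, 2, 3), unbiased=False)
            losses.append(F.mse_loss(mean, bn.running_mean)
                          + F.mse_loss(var, bn.running_var))
        return hook

    for m in backbone.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(make_hook(m)))
    backbone(x)
    for h in hooks:
        h.remove()
    return torch.stack(losses).sum()

# "Various backbones": the synthetic batch must satisfy all of them at once.
backbones = [models.resnet18(weights="IMAGENET1K_V1").eval(),
             models.mobilenet_v2(weights="IMAGENET1K_V1").eval()]
for b in backbones:
    for p in b.parameters():
        p.requires_grad_(False)  # only the synthetic images are optimized

synthetic = torch.randn(10, 3, 224, 224, requires_grad=True)  # e.g. IPC 10
opt = torch.optim.Adam([synthetic], lr=0.1)
for step in range(100):
    opt.zero_grad()
    loss = sum(bn_matching_loss(b, synthetic) for b in backbones)
    loss.backward()
    opt.step()
```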
Related papers
- A Novel Adaptive Fine-Tuning Algorithm for Multimodal Models: Self-Optimizing Classification and Selection of High-Quality Datasets in Remote Sensing [46.603157010223505]
We propose an adaptive fine-tuning algorithm for multimodal large models.
We train the model on two 3090 GPUs using one-third of the GeoChat multimodal remote sensing dataset.
The model achieves scores of 89.86 and 77.19 on the UCMerced and AID evaluation datasets.
arXiv Detail & Related papers (2024-09-20T09:19:46Z) - Decorrelating Structure via Adapters Makes Ensemble Learning Practical for Semi-supervised Learning [50.868594148443215]
In computer vision, traditional ensemble learning methods exhibit either low training efficiency or limited performance.
We propose Decorrelating Structure via Adapters (DSA), a lightweight, loss-function-free, and architecture-agnostic ensemble learning method for various visual tasks.
arXiv Detail & Related papers (2024-08-08T01:31:38Z) - Dataset Distillation in Large Data Era [31.758821805424393]
We show how to distill various large-scale datasets such as full ImageNet-1K/21K under a conventional input resolution of 224x224.
We show that the proposed model beats the current state-of-the-art by more than 4% Top-1 accuracy on ImageNet-1K/21K.
arXiv Detail & Related papers (2023-11-30T18:59:56Z) - Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale
From A New Perspective [27.650434284271363]
Under 50 IPC, our approach achieves the highest validation accuracy of 42.5% and 60.8% on the Tiny-ImageNet and ImageNet-1K datasets.
Our approach is also approximately 52x (ConvNet-4) and 16x (ResNet-18) faster than MTT, with 11.6x and 6.4x less memory consumption during data synthesis.
arXiv Detail & Related papers (2023-06-22T17:59:58Z) - The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation
via Point-Guided Mask Representation [61.027468209465354]
We introduce a novel learning scheme named weakly semi-supervised instance segmentation (WSSIS) with point labels.
We propose a method for WSSIS that can effectively leverage budget-friendly point labels as a powerful weak supervision source.
We conduct extensive experiments on COCO and BDD100K datasets, and the proposed method achieves promising results comparable to those of the fully-supervised model.
arXiv Detail & Related papers (2023-03-27T10:11:22Z) - Semi-Supervised Image Captioning by Adversarially Propagating Labeled
Data [95.0476489266988]
We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models.
Our proposed method trains a captioner to learn from paired data and to progressively associate unpaired data.
We present extensive empirical results on both (1) image-based and (2) dense region-based captioning datasets, followed by a comprehensive analysis of the scarcely-paired setting.
arXiv Detail & Related papers (2023-01-26T15:25:43Z) - Feature transforms for image data augmentation [74.12025519234153]
In image classification, many augmentation approaches utilize simple image manipulation algorithms.
In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches.
Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method.
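As a concrete illustration of this data-level ensembling, here is a hypothetical PyTorch/torchvision sketch: one pretrained ResNet50 is fine-tuned per augmentation method, and the members' softmax outputs are averaged at test time. The transforms shown are illustrative stand-ins rather than the paper's fourteen approaches, and all names are assumptions.

```python
# Hypothetical sketch of data-level ensembling with augmentation-specific
# models: one pretrained ResNet50 per augmentation method, predictions
# fused at test time. Transform choices are illustrative, not the
# paper's fourteen approaches.
import torch
import torch.nn as nn
from torchvision import models, transforms

augmentations = {
    "hflip": transforms.RandomHorizontalFlip(p=1.0),
    "jitter": transforms.ColorJitter(brightness=0.4, contrast=0.4),
    "blur": transforms.GaussianBlur(kernel_size=5),
}

def make_finetune_model(num_classes: int) -> nn.Module:
    model = models.resnet50(weights="IMAGENET1K_V2")
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new head
    return model

# One model per augmentation; each would be fine-tuned on the training set
# extended with images produced by its transform (training loop omitted).
ensemble = {name: make_finetune_model(num_classes=100) for name in augmentations}

@torch.no_grad()
def ensemble_predict(x: torch.Tensor) -> torch.Tensor:
    """Average the softmax outputs of all augmentation-specific members."""
    probs = []
    for m in ensemble.values():
        m.eval()
        probs.append(m(x).softmax(dim=1))
    return torch.stack(probs).mean(dim=0)
```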
arXiv Detail & Related papers (2022-01-24T14:12:29Z) - MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the datasets' taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z) - Revisiting Global Statistics Aggregation for Improving Image Restoration [8.803962179239385]
Test-time Local Statistics Converter (TLSC) significantly improves the performance of image restoration models.
By extending SE with TLSC in state-of-the-art models, MPRNet is boosted by 0.65 dB in PSNR on the GoPro dataset, reaching 33.31 dB and exceeding the previous best result by 0.6 dB.
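A minimal sketch of the local-statistics idea, assuming PyTorch: an SE-style attention module whose pooled statistics come from fixed-size local windows (matching the training patch size) rather than from the entire, much larger test image. The naive avg_pool2d realization and all names are assumptions, not the paper's implementation.

```python
# Hypothetical test-time local statistics: aggregate SE statistics over
# local windows whose size matches the training patches, instead of one
# global mean per image. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalSE(nn.Module):
    """SE-style channel attention with window-local average pooling."""
    def __init__(self, channels: int, window: int = 256, reduction: int = 16):
        super().__init__()
        self.window = window
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        k = min(self.window, x.shape[-2], x.shape[-1])
        # Local mean per window instead of one global mean per image.
        local_mean = F.avg_pool2d(x, kernel_size=k, stride=1,
                                  padding=k // 2, count_include_pad=False)
        attn = self.fc(local_mean)
        # Resize back in case the padding changed the spatial size.
        attn = F.interpolate(attn, size=x.shape[-2:], mode="nearest")
        return x * attn
```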
arXiv Detail & Related papers (2021-12-08T12:52:14Z) - CSC-Unet: A Novel Convolutional Sparse Coding Strategy Based Neural
Network for Semantic Segmentation [0.44289311505645573]
We propose a novel strategy that reformulates the widely used convolution operation as a multi-layer convolutional sparse coding block.
We show that the multi-layer convolutional sparse coding block enables semantic segmentation models to converge faster, extract finer semantic and appearance information from images, and better recover spatial detail.
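To illustrate what such a block might look like, the following is a hypothetical PyTorch sketch that replaces a convolution with a few unrolled ISTA iterations over a learned convolutional dictionary. It is a generic convolutional sparse coding sketch under stated assumptions, not the CSC-Unet reference code.

```python
# Hypothetical convolutional sparse coding block: a few unrolled ISTA
# iterations infer a sparse code z whose convolutional reconstruction
# D z approximates the input. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CSCBlock(nn.Module):
    def __init__(self, in_ch: int, code_ch: int, k: int = 3,
                 iters: int = 3, step: float = 0.1, thresh: float = 0.01):
        super().__init__()
        # Learned convolutional dictionary D (odd kernel size assumed).
        self.dictionary = nn.Conv2d(code_ch, in_ch, k, padding=k // 2,
                                    bias=False)
        self.iters, self.step, self.thresh = iters, step, thresh

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = torch.zeros(x.shape[0], self.dictionary.in_channels,
                        x.shape[2], x.shape[3], device=x.device)
        for _ in range(self.iters):
            residual = x - self.dictionary(z)                    # x - D z
            grad = F.conv_transpose2d(residual, self.dictionary.weight,
                                      padding=self.dictionary.padding[0])
            z = F.softshrink(z + self.step * grad, self.thresh)  # ISTA step
        return z  # sparse feature map handed to the next stage

# Usage: a CSCBlock(3, 64) could stand in for an encoder conv in a U-Net.
```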
arXiv Detail & Related papers (2021-08-01T09:16:31Z) - DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort [117.41383937100751]
Current deep networks are extremely data-hungry, benefiting from training on large-scale datasets.
We show how the GAN latent code can be decoded to produce a semantic segmentation of the image.
These generated datasets can then be used for training any computer vision architecture just as real datasets are.
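A hypothetical sketch of the decoding step, assuming PyTorch: intermediate generator feature maps are upsampled to a common resolution, and a small per-pixel MLP maps the stacked features to segmentation logits; in the paper's setting, a handful of manually labeled generated images would supervise it. All module and variable names here are illustrative.

```python
# Hypothetical DatasetGAN-style decoder: features from a pretrained
# generator are decoded per-pixel into segmentation labels by a tiny MLP,
# so the generator can mass-produce (image, label) pairs. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelLabelDecoder(nn.Module):
    """Per-pixel MLP over upsampled, concatenated generator feature maps."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(inplace=True),
            nn.Linear(128, num_classes),
        )

    def forward(self, feats: list, size: tuple) -> torch.Tensor:
        # Upsample every intermediate feature map to the output resolution
        # and stack them channel-wise into one per-pixel feature vector.
        up = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
              for f in feats]
        pix = torch.cat(up, dim=1).permute(0, 2, 3, 1)  # B x H x W x C
        return self.mlp(pix).permute(0, 3, 1, 2)        # B x K x H x W

# Usage: feats would come from hooks on a pretrained GAN generator; a few
# manually labeled generated images supply the cross-entropy training signal.
feats = [torch.randn(1, 64, 32, 32), torch.randn(1, 32, 64, 64)]
decoder = PixelLabelDecoder(feat_dim=96, num_classes=5)
logits = decoder(feats, size=(128, 128))  # -> 1 x 5 x 128 x 128
```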
arXiv Detail & Related papers (2021-04-13T20:08:29Z)