Dataset Color Quantization: A Training-Oriented Framework for Dataset-Level Compression
- URL: http://arxiv.org/abs/2602.20650v2
- Date: Sun, 01 Mar 2026 13:55:52 GMT
- Title: Dataset Color Quantization: A Training-Oriented Framework for Dataset-Level Compression
- Authors: Chenyue Yu, Lingao Xiao, Jinhong Deng, Ivor W. Tsang, Yang He
- Abstract summary: We propose a unified framework that compresses visual datasets by reducing color-space redundancy while preserving information crucial for model training. Experiments across CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet-1K show that DCQ significantly improves training performance under aggressive compression.
- Score: 56.02211851256951
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale image datasets are fundamental to deep learning, but their high storage demands pose challenges for deployment in resource-constrained environments. While existing approaches reduce dataset size by discarding samples, they often ignore the significant redundancy within each image -- particularly in the color space. To address this, we propose Dataset Color Quantization (DCQ), a unified framework that compresses visual datasets by reducing color-space redundancy while preserving information crucial for model training. DCQ achieves this by enforcing consistent palette representations across similar images, selectively retaining semantically important colors guided by model perception, and maintaining structural details necessary for effective feature learning. Extensive experiments across CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet-1K show that DCQ significantly improves training performance under aggressive compression, offering a scalable and robust solution for dataset-level storage reduction.
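To make the idea of a palette shared across similar images concrete, here is a minimal sketch. It is not the DCQ method: it swaps in plain k-means over the pooled pixels of a group of images and ignores the model-guided color selection and structure preservation that the abstract describes. The function names `shared_palette_quantize` and `reconstruct`, and all parameter values, are hypothetical.

```python
import numpy as np


def shared_palette_quantize(images, n_colors=4, n_iters=10, seed=0):
    """Quantize a group of RGB images to one shared palette with plain k-means.

    Illustrative only; this is not the DCQ algorithm.
    images: uint8 array of shape (N, H, W, 3).
    Returns (palette, indices): palette is (k, 3) uint8 with k <= n_colors,
    indices is (N, H, W) uint8 holding a palette index per pixel.
    """
    if not 1 <= n_colors <= 256:
        raise ValueError("n_colors must be in [1, 256] so indices fit in uint8")
    rng = np.random.default_rng(seed)
    pixels = images.reshape(-1, 3).astype(np.float64)

    # Initialise centroids from distinct colors so no two start identical.
    unique = np.unique(pixels, axis=0)
    k = min(n_colors, len(unique))
    centroids = unique[rng.choice(len(unique), size=k, replace=False)]

    def assign(points, centers):
        # Squared Euclidean distance from every pixel to every centroid.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(pixels, centroids)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)

    labels = assign(pixels, centroids)  # final assignment after last update
    palette = np.clip(np.rint(centroids), 0, 255).astype(np.uint8)
    indices = labels.reshape(images.shape[:-1]).astype(np.uint8)
    return palette, indices


def reconstruct(palette, indices):
    """Map palette indices back to RGB pixels."""
    return palette[indices]


if __name__ == "__main__":
    # Random stand-in data; a real dataset would be loaded here instead.
    demo_rng = np.random.default_rng(0)
    images = demo_rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
    palette, indices = shared_palette_quantize(images, n_colors=4)
    restored = reconstruct(palette, indices)
    print(palette.shape, indices.shape, restored.shape)
```

With a 4-color palette each pixel needs only a 2-bit index instead of 24 bits of RGB, and the palette itself is stored once per group rather than once per image. How images are grouped and which colors are kept is where a training-oriented method would differ from this generic sketch.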
Related papers
- RAW-Flow: Advancing RGB-to-RAW Image Reconstruction with Deterministic Latent Flow Matching [55.03149221192589]
We introduce a novel framework named RAW-Flow to bridge the gap between RGB and RAW representations. We also introduce a cross-scale context guidance module that injects hierarchical RGB features into the flow estimation process. RAW-Flow outperforms state-of-the-art approaches both quantitatively and visually.
arXiv Detail & Related papers (2026-01-28T08:27:38Z)
- Convolutional Deep Colorization for Image Compression: A Color Grid Based Approach [0.0]
This work focuses on optimizing a color grid based approach to fully-automated image color information retention. We want to minimize the amount of color information that is stored while still being able to faithfully re-color images. Our results yielded a promising image compression ratio, while still allowing for successful image recolorization reaching high CSIM values.
arXiv Detail & Related papers (2025-02-08T01:26:05Z)
- Leveraging Color Channel Independence for Improved Unsupervised Object Detection [7.030688465389997]
We challenge the common assumption that RGB images are the optimal color space for unsupervised learning in computer vision. We show that models improve when requiring them to predict additional color channels. The use of composite color spaces can be implemented with basically no computational overhead.
arXiv Detail & Related papers (2024-12-19T18:28:37Z)
- Color-Oriented Redundancy Reduction in Dataset Distillation [35.83163170289415]
We propose a framework that minimizes color redundancy at the individual image and overall dataset levels. At the image level, we employ a palette network, a specialized neural network, to dynamically allocate colors from a reduced color space to each pixel. A comprehensive performance study is conducted, demonstrating the superior performance of our proposed color-aware DD compared to existing DD methods.
arXiv Detail & Related papers (2024-11-18T06:48:11Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
They are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
- Raw Image Reconstruction with Learned Compact Metadata [61.62454853089346]
We propose a novel framework to learn a compact representation in the latent space serving as the metadata in an end-to-end manner.
We show how the proposed raw image compression scheme can adaptively allocate more bits to image regions that are important from a global perspective.
arXiv Detail & Related papers (2023-02-25T05:29:45Z)
- ParaColorizer: Realistic Image Colorization using Parallel Generative Networks [1.7778609937758327]
Grayscale image colorization is a fascinating application of AI for information restoration.
We present a parallel GAN-based colorization framework.
We show the shortcomings of the non-perceptual evaluation metrics commonly used to assess multi-modal problems.
arXiv Detail & Related papers (2022-08-17T13:49:44Z)
- Image Colorization: A Survey and Dataset [94.59768013860668]
This article presents a comprehensive survey of state-of-the-art deep learning-based image colorization techniques.
It categorizes the existing colorization techniques into seven classes and discusses important factors governing their performance.
We perform an extensive experimental evaluation of existing image colorization methods using both existing datasets and our proposed one.
arXiv Detail & Related papers (2020-08-25T01:22:52Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.