Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration
- URL: http://arxiv.org/abs/2405.17146v1
- Date: Mon, 27 May 2024 13:09:23 GMT
- Title: Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration
- Authors: Juan C. Pérez, Alejandro Pardo, Mattia Soldan, Hani Itani, Juan Leon-Alcazar, Bernard Ghanem
- Abstract summary: We focus on the JPEG format as a representative CFF, given its commonality and its representativeness of key concepts in compression.
We test if CLMs understand the JPEG format by probing their capabilities to perform along three axes: recognition of inherent file properties, handling of files with anomalies, and generation of new files.
Results suggest that CLMs can understand the semantics of compressed data when directly operating on the byte streams of files produced by CFFs.
- Score: 82.88166538896331
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study investigates whether Compressed-Language Models (CLMs), i.e. language models operating on raw byte streams from Compressed File Formats~(CFFs), can understand files compressed by CFFs. We focus on the JPEG format as a representative CFF, given its commonality and its representativeness of key concepts in compression, such as entropy coding and run-length encoding. We test if CLMs understand the JPEG format by probing their capabilities to perform along three axes: recognition of inherent file properties, handling of files with anomalies, and generation of new files. Our findings demonstrate that CLMs can effectively perform these tasks. These results suggest that CLMs can understand the semantics of compressed data when directly operating on the byte streams of files produced by CFFs. The possibility to directly operate on raw compressed files offers the promise to leverage some of their remarkable characteristics, such as their ubiquity, compactness, multi-modality and segment-nature.
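The core idea of operating directly on raw compressed byte streams can be illustrated with a minimal sketch. This is not the paper's pipeline, only an assumed toy setup: Pillow produces a JPEG byte stream from a synthetic image, and the file's bytes become the integer token sequence a byte-level language model would consume. Even at this level, the stream exposes learnable structure such as the JPEG start- and end-of-image markers.

```python
import io

import numpy as np
from PIL import Image

# Create a small synthetic image and compress it into a JPEG byte stream.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
buf = io.BytesIO()
Image.fromarray(pixels).save(buf, format="JPEG", quality=75)
jpeg_bytes = buf.getvalue()

# A byte-level model sees the file as a sequence of integer tokens in [0, 255].
tokens = list(jpeg_bytes)

# Every JPEG begins with the SOI marker (0xFF 0xD8) and ends with EOI (0xFF 0xD9),
# so even the raw token stream carries recoverable format structure.
assert tokens[:2] == [0xFF, 0xD8]
assert tokens[-2:] == [0xFF, 0xD9]
print(len(tokens))
```

The vocabulary is just the 256 byte values, which is what makes compressed-file modeling format-agnostic: the same tokenization applies to any CFF.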
Related papers
- JPEG-LM: LLMs as Image Generators with Canonical Codec Representations [51.097213824684665]
Discretization represents continuous data like images and videos as discrete tokens.
Common methods of discretizing images and videos include modeling raw pixel values.
We show that using canonical representations can help lower the barriers between language generation and visual generation.
arXiv Detail & Related papers (2024-08-15T23:57:02Z)
- CompTLL-UNet: Compressed Domain Text-Line Localization in Challenging Handwritten Documents using Deep Feature Learning from JPEG Coefficients [0.9405458160620535]
We propose an idea that employs deep feature learning directly from the JPEG compressed coefficients without full decompression to accomplish text-line localization in the JPEG compressed domain.
A modified U-Net architecture known as Compressed Text-Line localization Network (CompTLL-UNet) is designed to accomplish it.
The model is trained and tested on JPEG-compressed versions of benchmark datasets, including ICDAR 2017 (cBAD) and ICDAR 2019 (cBAD).
arXiv Detail & Related papers (2023-08-11T14:02:52Z)
- Learned Lossless Compression for JPEG via Frequency-Domain Prediction [50.20577108662153]
We propose a novel framework for learned lossless compression of JPEG images.
To enable learning in the frequency domain, DCT coefficients are partitioned into groups to utilize implicit local redundancy.
An autoencoder-like architecture is designed based on the weight-shared blocks to realize entropy modeling of grouped DCT coefficients.
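The grouping of DCT coefficients by frequency can be sketched with plain NumPy. This is an illustrative stand-in, not the paper's learned model: the DCT matrix is hand-rolled, and splitting AC coefficients at frequency sum `u + v <= 3` is an arbitrary choice made here for demonstration.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix, as used for JPEG's 8x8 blocks."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] /= np.sqrt(2)
    return m * np.sqrt(2 / n)

D = dct_matrix(8)
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float) - 128.0

coeffs = D @ block @ D.T  # 2-D DCT of one 8x8 block

# Group coefficients by frequency: DC term, low-frequency AC, high-frequency AC.
u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
dc = coeffs[0, 0]
low_ac = coeffs[(u + v > 0) & (u + v <= 3)]
high_ac = coeffs[u + v > 3]

# The transform is invertible, so grouping loses no information.
recon = D.T @ coeffs @ D
assert np.allclose(recon, block)
print(low_ac.size, high_ac.size)  # 9 low-frequency and 54 high-frequency AC coeffs
```

Grouping like this lets an entropy model condition each frequency band on the others, exploiting the local redundancy the summary mentions.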
arXiv Detail & Related papers (2023-03-05T13:15:28Z)
- Data Efficient Visual Place Recognition Using Extremely JPEG-Compressed Images [17.847661026367767]
This paper studies the effects of JPEG compression on the performance of Visual Place Recognition techniques.
We show that introducing compression drastically reduces VPR performance, especially at the higher end of the compression spectrum.
We present a fine-tuned CNN optimized for JPEG-compressed data and show that it performs more consistently under the image transformations introduced by extreme JPEG compression.
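The compression sweep such a study relies on can be sketched with Pillow's standard JPEG encoder. The image content and quality settings here are assumptions for illustration; the point is only that lower quality settings yield smaller files, i.e. heavier compression.

```python
import io

import numpy as np
from PIL import Image

# A smooth horizontal gradient: a simple synthetic stand-in for a query image.
base = np.linspace(0, 255, 64, dtype=np.uint8)
pixels = np.stack([np.tile(base, (64, 1))] * 3, axis=-1)
img = Image.fromarray(pixels)

def jpeg_size(image, quality):
    """Encode with Pillow's standard JPEG encoder and return the byte count."""
    buf = io.BytesIO()
    image.save(buf, format="JPEG", quality=quality)
    return buf.tell()

sizes = {q: jpeg_size(img, q) for q in (90, 50, 10)}
# Heavier compression (lower quality) yields smaller files.
assert sizes[90] >= sizes[50] >= sizes[10]
print(sizes)
```

Sweeping the `quality` parameter like this and measuring recognition accuracy at each point is the standard way to chart how performance degrades toward the extreme end of the compression spectrum.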
arXiv Detail & Related papers (2022-09-17T14:46:28Z) - Practical Learned Lossless JPEG Recompression with Multi-Level
Cross-Channel Entropy Model in the DCT Domain [10.655855413391324]
We propose a deep learning based JPEG recompression method that operates in the DCT domain.
Experiments show that our method achieves state-of-the-art performance compared with traditional JPEG recompression methods.
arXiv Detail & Related papers (2022-03-30T14:36:13Z) - Towards Robust Data Hiding Against (JPEG) Compression: A
Pseudo-Differentiable Deep Learning Approach [78.05383266222285]
It remains an open challenge to achieve data hiding that is robust against these compressions.
Deep learning has shown great success in data hiding, but the non-differentiability of JPEG makes it challenging to train a deep pipeline for robustness against lossy compression.
In this work, we propose a simple yet effective approach to address all the above limitations at once.
arXiv Detail & Related papers (2020-12-30T12:30:09Z) - Learning to Improve Image Compression without Changing the Standard
Decoder [100.32492297717056]
We propose to learn improved encoding while keeping the standard decoder unchanged.
Specifically, a frequency-domain pre-editing method is proposed to optimize the distribution of DCT coefficients.
We do not modify the JPEG decoder and therefore our approach is applicable when viewing images with the widely used standard JPEG decoder.
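The key property, that pre-editing happens on the encoder side while any standard decoder still reads the result, can be sketched with a hand-crafted stand-in for the learned pre-editing: attenuating high-frequency DCT coefficients before encoding. The low-pass mask below is an illustrative choice, not the paper's learned method.

```python
import io

import numpy as np
from PIL import Image

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix for 8x8 JPEG blocks."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] /= np.sqrt(2)
    return m * np.sqrt(2 / n)

def pre_edit(gray, keep=3):
    """Zero out high-frequency DCT coefficients of each 8x8 block (toy pre-edit)."""
    D = dct_matrix(8)
    u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
    mask = (u + v <= keep).astype(float)  # keep only low frequencies
    out = np.empty_like(gray, dtype=float)
    for i in range(0, gray.shape[0], 8):
        for j in range(0, gray.shape[1], 8):
            block = gray[i:i + 8, j:j + 8].astype(float) - 128.0
            coeffs = (D @ block @ D.T) * mask
            out[i:i + 8, j:j + 8] = D.T @ coeffs @ D + 128.0
    return np.clip(out, 0, 255).astype(np.uint8)

def jpeg_bytes(arr, quality=75):
    buf = io.BytesIO()
    Image.fromarray(arr, mode="L").save(buf, format="JPEG", quality=quality)
    return buf.getvalue()

rng = np.random.default_rng(0)
noisy = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
edited = pre_edit(noisy)

raw = jpeg_bytes(noisy)
pre = jpeg_bytes(edited)
# The pre-edited image encodes to fewer bytes, yet both files decode with
# any standard JPEG decoder (here, Pillow's).
assert len(pre) < len(raw)
assert Image.open(io.BytesIO(pre)).size == (64, 64)
print(len(raw), len(pre))
```

Because only the input to the standard encoder changes, decoder compatibility is preserved by construction; the paper's contribution is learning this editing step instead of hand-crafting it.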
arXiv Detail & Related papers (2020-09-27T19:24:42Z) - Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
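The idea of enforcing perceptual consistency via a frozen feature extractor can be sketched in NumPy. Everything here is an assumed toy setup: the "pre-trained CNN" is a single layer of fixed random 3x3 convolutions, the loss weight 0.5 is arbitrary, and real systems would use an actual pre-trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(img, kernels):
    """Toy feature extractor: valid 3x3 convolutions with fixed (frozen) kernels."""
    h, w = img.shape
    out = []
    for k in kernels:
        f = np.zeros((h - 2, w - 2))
        for di in range(3):
            for dj in range(3):
                f += k[di, dj] * img[di:di + h - 2, dj:dj + w - 2]
        out.append(np.maximum(f, 0.0))  # ReLU nonlinearity
    return np.stack(out)

kernels = rng.standard_normal((4, 3, 3))     # frozen "pre-trained" filters
original = rng.standard_normal((16, 16))
compressed = original + 0.1 * rng.standard_normal((16, 16))  # stand-in for codec output

# Appearance consistency (pixel space) plus perceptual consistency (feature space).
pixel_loss = np.mean((original - compressed) ** 2)
feat_loss = np.mean((features(original, kernels) - features(compressed, kernels)) ** 2)
total = pixel_loss + 0.5 * feat_loss  # weighting is an arbitrary choice here
assert total > pixel_loss
print(pixel_loss, feat_loss)
```

Training the codec against `total` rather than `pixel_loss` alone is what pushes compressed images to remain recognizable to downstream vision models.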
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.