Deep Learning Based Image Retrieval in the JPEG Compressed Domain
- URL: http://arxiv.org/abs/2107.03648v1
- Date: Thu, 8 Jul 2021 07:30:03 GMT
- Title: Deep Learning Based Image Retrieval in the JPEG Compressed Domain
- Authors: Shrikant Temburwar, Bulla Rajesh and Mohammed Javed
- Abstract summary: We propose a unified model that takes DCT coefficients as input and efficiently extracts global and local features directly in the JPEG compressed domain for accurate image retrieval.
In terms of mean average precision, the proposed model performs comparably to the current DELG model, which takes RGB images as input.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Content-based image retrieval (CBIR) systems in the pixel domain use low-level
features, such as colour, texture and shape, to retrieve images. In this context, two types
of image representations, i.e., local and global image features, have been studied in the
literature. Extracting these features from pixel images and comparing them with images from
the database is very time-consuming. Therefore, in recent years, there has been some effort
to perform image analysis directly in the compressed domain with fewer computations.
Furthermore, most of the images in our daily transactions are stored in the JPEG compressed
format, so it would be ideal if we could extract features directly from the partially decoded
or compressed data and use them for retrieval. Here, we propose a unified model for image
retrieval which takes DCT coefficients as input and efficiently extracts global and local
features directly in the JPEG compressed domain for accurate image retrieval. The experimental
findings indicate that the proposed model performs comparably, in terms of mean average
precision, to the current DELG model, which takes RGB images as input, while offering faster
training and retrieval.
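Illustrative sketch (not from the paper): the abstract describes feeding block-wise DCT coefficients, rather than decoded RGB pixels, into a network that outputs both a global descriptor and dense local features, similar in spirit to DELG's global/local split. The minimal PyTorch sketch below shows one way such a pipeline could be wired up; the architecture, dimensions, helper names, and the random stand-in DCT input are assumptions, and a real implementation would obtain the coefficients from a JPEG parser (e.g. a library such as jpeg2dct or torchjpeg) rather than from random tensors.

```python
# Minimal sketch (not the authors' code): retrieval features from JPEG DCT blocks.
# Assumption: the input tensor holds the 8x8-block DCT coefficients of the
# luminance channel, arranged as 64 frequency channels of shape (B, 64, H/8, W/8).
import torch
import torch.nn as nn
import torch.nn.functional as F


class DCTRetrievalNet(nn.Module):
    """Extracts a global descriptor and dense local features from DCT-coefficient input."""

    def __init__(self, dct_channels: int = 64, dim: int = 128):
        super().__init__()
        # Backbone operating on the 64 frequency channels of each 8x8 DCT block.
        self.backbone = nn.Sequential(
            nn.Conv2d(dct_channels, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.Conv2d(256, 512, kernel_size=3, padding=1),
            nn.BatchNorm2d(512), nn.ReLU(inplace=True),
        )
        self.local_head = nn.Conv2d(512, dim, kernel_size=1)   # dense local features
        self.global_head = nn.Linear(512, dim)                 # global descriptor
        self.gem_p = nn.Parameter(torch.tensor(3.0))           # GeM pooling exponent

    def forward(self, dct: torch.Tensor):
        feats = self.backbone(dct)                             # (B, 512, h, w)
        local = F.normalize(self.local_head(feats), dim=1)     # (B, dim, h, w)
        # Generalized-mean (GeM) pooling for the global descriptor.
        pooled = F.avg_pool2d(feats.clamp(min=1e-6).pow(self.gem_p),
                              feats.shape[-2:]).pow(1.0 / self.gem_p)
        global_desc = F.normalize(self.global_head(pooled.flatten(1)), dim=-1)
        return global_desc, local


if __name__ == "__main__":
    # Stand-in for DCT coefficients of a 256x256 luminance channel (32x32 blocks).
    dct = torch.randn(1, 64, 32, 32)
    model = DCTRetrievalNet()
    g, l = model(dct)
    print(g.shape, l.shape)  # torch.Size([1, 128]) torch.Size([1, 128, 32, 32])
```

The point of the sketch is only that the 64 frequency channels of each 8x8 luminance block can take the place normally held by RGB channels, so feature extraction can start from partially decoded data instead of fully decoded pixels.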
Related papers
- Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation.
Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack.
The general efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors.
We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
arXiv Detail & Related papers (2024-07-02T00:45:21Z) - Hyperspectral Image Compression Using Sampling and Implicit Neural
Representations [2.3931689873603603]
Hyperspectral images record the electromagnetic spectrum for a pixel in the image of a scene.
With the decreasing cost of capturing these images, there is a need to develop efficient techniques for storing, transmitting, and analyzing hyperspectral images.
This paper develops a method for hyperspectral image compression using implicit neural representations.
arXiv Detail & Related papers (2023-12-04T01:10:04Z) - Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
However, they are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z) - Raw Image Reconstruction with Learned Compact Metadata [61.62454853089346]
We propose a novel framework to learn a compact representation in the latent space serving as the metadata in an end-to-end manner.
We show how the proposed raw image compression scheme can adaptively allocate more bits to image regions that are important from a global perspective.
arXiv Detail & Related papers (2023-02-25T05:29:45Z) - T2CI-GAN: Text to Compressed Image generation using Generative
Adversarial Network [9.657133242509671]
In practice, most of the visual data are processed and transmitted in the compressed representation form.
The proposed work attempts to generate the visual data directly in the compressed representation form using Deep Convolutional GANs (DCGANs).
The first model is directly trained with JPEG compressed DCT images (compressed domain) to generate the compressed images from text descriptions.
The second model is trained with RGB images (pixel domain) to generate JPEG compressed DCT representation from text descriptions.
arXiv Detail & Related papers (2022-10-01T09:26:25Z) - Learning-based Compression for Material and Texture Recognition [23.668803886355683]
This paper is concerned with learning-based compression schemes whose compressed-domain representations can be utilized to perform visual processing and computer vision tasks directly in the compressed domain.
We adopt the learning-based JPEG-AI framework for performing material and texture recognition using the compressed-domain latent representation at varying bit-rates.
It is also shown that compressed-domain classification can yield competitive Top-1 and Top-5 accuracy while using a smaller, reduced-complexity classification model.
arXiv Detail & Related papers (2021-04-16T23:16:26Z) - CNNs for JPEGs: A Study in Computational Cost [49.97673761305336]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade.
CNNs are capable of learning robust representations of the data directly from the RGB pixels.
Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years.
arXiv Detail & Related papers (2020-12-26T15:00:10Z) - Using Text to Teach Image Retrieval [47.72498265721957]
We build on the concept of image manifold to represent the feature space of images, learned via neural networks, as a graph.
We augment the manifold samples with geometrically aligned text, thereby using a plethora of sentences to teach us about images.
The experimental results show that the joint embedding manifold is a robust representation, allowing it to be a better basis to perform image retrieval.
arXiv Detail & Related papers (2020-11-19T16:09:14Z) - Object Detection in the DCT Domain: is Luminance the Solution? [4.361526134899725]
This paper proposes to take advantage of the compressed representation of images to carry out object detection usable under constrained-resource conditions.
This leads to a $\times 1.7$ speed up in comparison with a standard RGB-based architecture, while only reducing the detection performance by 5.5%.
arXiv Detail & Related papers (2020-06-10T08:43:40Z) - Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z) - Image retrieval approach based on local texture information derived from
predefined patterns and spatial domain information [14.620086904601472]
The performance of the proposed method is evaluated in terms of precision and recall on the Simplicity database.
The comparative results show that the proposed approach offers a higher precision rate than many known methods.
arXiv Detail & Related papers (2019-12-30T16:11:04Z)