OCR for TIFF Compressed Document Images Directly in Compressed Domain
Using Text segmentation and Hidden Markov Model
- URL: http://arxiv.org/abs/2209.09118v1
- Date: Tue, 13 Sep 2022 06:34:26 GMT
- Title: OCR for TIFF Compressed Document Images Directly in Compressed Domain
Using Text segmentation and Hidden Markov Model
- Authors: Dikshit Sharma and Mohammed Javed
- Abstract summary: We propose a novel idea of developing an OCR for CCITT (The International Telegraph and Telephone Consultative Committee) compressed machine-printed TIFF document images directly in the compressed domain.
After segmenting text regions into lines and words, an HMM is applied for recognition using the three coding modes of CCITT: horizontal, vertical, and pass.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In today's technological era, document images play an important and integral
part in our day-to-day life, and specifically with the surge of Covid-19,
digitally scanned documents have become a key source of communication, thus
avoiding any sort of infection through physical contact. Storage and
transmission of scanned document images is a very memory-intensive task, hence
compression techniques are used to reduce the image size before archival
and transmission. There are two ways to extract information from, or operate
on, compressed images. The first way is to decompress the image,
operate on it, and subsequently compress it again for efficient
storage and transmission. The other way is to use the characteristics of the
underlying compression algorithm to directly process the images in their
compressed form without involving decompression and re-compression. In this
paper, we propose a novel idea of developing an OCR for CCITT (The
International Telegraph and Telephone Consultative Committee) compressed
machine-printed TIFF document images directly in the compressed domain. After
segmenting text regions into lines and words, an HMM is applied for recognition
using the three coding modes of CCITT: horizontal, vertical, and pass.
Experimental results show that OCR on pass modes gives promising results.
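As a rough illustration of the recognition step described in the abstract, the sketch below scores a sequence of CCITT coding-mode symbols against several discrete HMMs and picks the most likely label. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the HMM topology, the feature encoding, or any parameters, so the symbol alphabet ("P", "H", "V"), the function names `log_forward` and `classify`, and the toy model values are all hypothetical.

```python
import math

# Assumed symbol alphabet: one letter per CCITT coding mode
# (P = pass, H = horizontal, V = vertical). This encoding is hypothetical.
MODES = ("P", "H", "V")


def log_forward(seq, start, trans, emit):
    """Return log P(seq | model) using the scaled forward algorithm.

    start: list of initial state probabilities
    trans: trans[r][s] = probability of moving from state r to state s
    emit:  emit[s][symbol] = probability that state s emits symbol
    """
    if not seq:
        return 0.0  # the empty sequence has probability 1
    n = len(start)
    alpha = [start[s] * emit[s][seq[0]] for s in range(n)]
    log_prob = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][seq[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob


def classify(seq, models):
    """Return the label whose HMM assigns the highest likelihood to seq."""
    return max(models, key=lambda label: log_forward(seq, *models[label]))


# Toy two-state models with made-up parameters, for illustration only.
TOY_MODELS = {
    "vertical_heavy": (
        [0.5, 0.5],
        [[0.7, 0.3], [0.3, 0.7]],
        [{"P": 0.1, "H": 0.2, "V": 0.7}, {"P": 0.1, "H": 0.3, "V": 0.6}],
    ),
    "pass_heavy": (
        [0.5, 0.5],
        [[0.7, 0.3], [0.3, 0.7]],
        [{"P": 0.7, "H": 0.2, "V": 0.1}, {"P": 0.6, "H": 0.3, "V": 0.1}],
    ),
}

if __name__ == "__main__":
    print(classify("VVHVV", TOY_MODELS))
    print(classify("PPHPP", TOY_MODELS))
```

In a real system, one model per character or word class would be trained on mode sequences extracted from segmented text regions; the maximum-likelihood decision rule shown here is one common choice and is assumed rather than taken from the paper.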
Related papers
- UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation [59.3877309501938]
Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios.
We introduce a codebook containing frequency domain information as a prior input to the INR network.
This enhances the representational power of INR and provides distinctive conditioning for different image blocks.
arXiv Detail & Related papers (2024-05-27T05:52:13Z)
- MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model [78.4051835615796]
This paper proposes a method called Multimodal Image Semantic Compression.
It consists of an LMM encoder for extracting the semantic information of the image, a map encoder to locate the regions corresponding to the semantics, an image encoder that generates an extremely compressed bitstream, and a decoder that reconstructs the image based on the above information.
It can achieve optimal consistency and perception results while saving 50% bitrate, which has strong potential applications in the next generation of storage and communication.
arXiv Detail & Related papers (2024-02-26T17:11:11Z)
- Perceptual Image Compression with Cooperative Cross-Modal Side Information [53.356714177243745]
We propose a novel deep image compression method with text-guided side information to achieve a better rate-perception-distortion tradeoff.
Specifically, we employ the CLIP text encoder and an effective Semantic-Spatial Aware block to fuse the text and image features.
arXiv Detail & Related papers (2023-11-23T08:31:11Z)
- Deep Unrestricted Document Image Rectification [110.61517455253308]
We present DocTr++, a novel unified framework for document image rectification.
We upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing.
We contribute a real-world test set and metrics applicable for evaluating the rectification quality.
arXiv Detail & Related papers (2023-04-18T08:00:54Z)
- Document Image Binarization in JPEG Compressed Domain using Dual Discriminator Generative Adversarial Networks [0.0]
The proposed model has been thoroughly tested with different versions of the DIBCO dataset, which pose challenges such as holes, erased or smudged ink, dust, and misplaced fibres.
The model proved to be highly robust and efficient in terms of both time and space complexity, and achieved state-of-the-art performance in the JPEG compressed domain.
arXiv Detail & Related papers (2022-09-13T12:07:32Z)
- Towards Robust Data Hiding Against (JPEG) Compression: A Pseudo-Differentiable Deep Learning Approach [78.05383266222285]
Achieving data hiding that is robust against these compressions is still an open challenge.
Deep learning has shown great success in data hiding, but the non-differentiability of JPEG makes it challenging to train a deep pipeline for improving robustness against lossy compression.
In this work, we propose a simple yet effective approach to address all the above limitations at once.
arXiv Detail & Related papers (2020-12-30T12:30:09Z)
- Compressing Images by Encoding Their Latent Representations with Relative Entropy Coding [5.687243501594734]
Variational Autoencoders (VAEs) have seen widespread use in learned image compression.
We propose a novel method, Relative Entropy Coding (REC), that can directly encode the latent representation with codelength close to the relative entropy for single images.
arXiv Detail & Related papers (2020-10-02T20:23:22Z)
- What's in the Image? Explorable Decoding of Compressed Images [45.22726784749359]
We develop a novel decoder architecture for the ubiquitous JPEG standard, which allows traversing the set of decompressed images.
We exemplify our framework on graphical, medical and forensic use cases, demonstrating its wide range of potential applications.
arXiv Detail & Related papers (2020-06-16T17:15:44Z)
- Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.