Point Cloud Geometry Scalable Coding Using a Resolution and Quality-conditioned Latents Probability Estimator
- URL: http://arxiv.org/abs/2502.14099v1
- Date: Wed, 19 Feb 2025 20:58:53 GMT
- Title: Point Cloud Geometry Scalable Coding Using a Resolution and Quality-conditioned Latents Probability Estimator
- Authors: Daniele Mari, André F. R. Guarda, Nuno M. M. Rodrigues, Simone Milani, Fernando Pereira
- Abstract summary: This paper focuses on the development of scalable coding solutions for deep learning-based Point Cloud (PC) coding.
The peculiarities of this 3D representation make it hard to implement flexible solutions that do not compromise the other functionalities of the codec.
- Score: 47.792286013837945
- Abstract: In the current age, users consume multimedia content in very heterogeneous scenarios in terms of network, hardware, and display capabilities. A naive solution to this problem is to encode multiple independent streams, each covering a different possible requirement of the clients, with an obvious negative impact on both storage and computational requirements. These drawbacks can be avoided by using codecs that enable scalability, i.e., the ability to generate a progressive bitstream, containing a base layer followed by multiple enhancement layers, that allows the same bitstream to be decoded into multiple reconstructions satisfying different visualization specifications. While scalable coding is a well-known and addressed feature in conventional image and video codecs, this paper focuses on a new and very different problem, notably the development of scalable coding solutions for deep learning-based Point Cloud (PC) coding. The peculiarities of this 3D representation make it hard to implement flexible solutions that do not compromise the other functionalities of the codec. This paper proposes a joint quality and resolution scalability scheme, named Scalable Resolution and Quality Hyperprior (SRQH), that, contrary to previous solutions, can model the relationship between latents obtained with models trained for different RD tradeoffs and/or at different resolutions. Experimental results obtained by integrating SRQH in the emerging JPEG Pleno learning-based PC coding standard show that SRQH allows decoding the PC at different qualities and resolutions with a single bitstream while incurring only a limited RD penalty and increase in complexity w.r.t. non-scalable JPEG PCC, which would require one bitstream per coding configuration.
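The abstract describes SRQH as a probability estimator for latents, conditioned on the target quality and resolution, so that enhancement layers can be entropy-coded relative to what the decoder already has. As a rough illustration only, here is a minimal PyTorch sketch of such a conditioned entropy model; it is not the authors' implementation, and it uses dense 3D convolutions and Gaussian-parameterized latents as simplifying assumptions (a real PC geometry codec such as JPEG PCC would operate on sparse voxel latents). All class, parameter, and tensor names are hypothetical.

```python
# Minimal sketch of a quality/resolution-conditioned latents probability
# estimator in the spirit of SRQH as described above. NOT the authors'
# implementation: dense Conv3d stands in for the sparse convolutions a real
# point cloud codec would use, and all names/shapes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionedProbabilityEstimator(nn.Module):
    """Predict mean/scale of enhancement-layer latents given the base-layer
    latents and an embedding of the target (quality, resolution) pair."""

    def __init__(self, latent_ch=128, cond_dim=16, hidden=64):
        super().__init__()
        # Embed the discrete (quality index, resolution level) pair.
        self.cond_embed = nn.Sequential(
            nn.Linear(2, cond_dim), nn.ReLU(), nn.Linear(cond_dim, cond_dim)
        )
        self.net = nn.Sequential(
            nn.Conv3d(latent_ch + cond_dim, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv3d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv3d(hidden, 2 * latent_ch, 3, padding=1),  # -> mean, raw scale
        )

    def forward(self, base_latents, quality_idx, resolution_lvl):
        # base_latents: (B, C, D, H, W); quality_idx, resolution_lvl: (B,)
        cond = torch.stack([quality_idx, resolution_lvl], dim=-1).float()
        c = self.cond_embed(cond)                         # (B, cond_dim)
        c = c[:, :, None, None, None].expand(-1, -1, *base_latents.shape[2:])
        mean, raw_scale = self.net(
            torch.cat([base_latents, c], dim=1)).chunk(2, dim=1)
        return mean, F.softplus(raw_scale) + 1e-6         # strictly positive


# Toy usage: the predicted Gaussian parameters would drive the arithmetic
# coder for the enhancement-layer latents; a sharper prediction costs fewer
# bits, which is what keeps the scalable bitstream's RD penalty limited.
est = ConditionedProbabilityEstimator()
y_base = torch.randn(1, 128, 8, 8, 8)  # fabricated base-layer latents
mean, scale = est(y_base, torch.tensor([2]), torch.tensor([1]))
y_enh = torch.randn_like(mean)         # fabricated enhancement-layer latents
rate_bits = (-torch.distributions.Normal(mean, scale).log_prob(y_enh)
             / torch.log(torch.tensor(2.0))).sum()
```

The design point the abstract emphasizes is the conditioning: one estimator serves all (quality, resolution) operating points, so a single progressive bitstream can replace the one-bitstream-per-configuration approach of a non-scalable codec.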
Related papers
- DeepFGS: Fine-Grained Scalable Coding for Learned Image Compression [27.834491128701963]
This paper proposes a learned fine-grained scalable image compression framework, namely DeepFGS.
For entropy coding, we design a mutual entropy model to fully explore the correlation between the basic and scalable features.
Experiments demonstrate that our proposed DeepFGS outperforms previous learning-based scalable image compression models.
arXiv Detail & Related papers (2024-11-30T11:19:38Z)
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [118.72266141321647]
Cross-Modality Video Coding (CMVC) is a pioneering approach to explore multimodality representation and video generative models in video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that TT2V achieves effective semantic reconstruction, while IT2V exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z)
- HPC: Hierarchical Progressive Coding Framework for Volumetric Video [39.403294185116]
Volumetric video based on Neural Radiance Field (NeRF) holds vast potential for various 3D applications.
Current NeRF compression lacks the flexibility to adjust video quality and bitrate within a single model for various network and device capacities.
We propose HPC, a novel hierarchical progressive video coding framework achieving variable bitrate using a single model.
arXiv Detail & Related papers (2024-07-12T06:34:24Z)
- Standard compliant video coding using low complexity, switchable neural wrappers [8.149130379436759]
We propose a new framework featuring standard compatibility, high performance, and low decoding complexity.
We employ a set of jointly optimized neural pre- and post-processors, wrapping a standard video codec, to encode videos at different resolutions.
We design a low complexity neural post-processor architecture that can handle different upsampling ratios.
arXiv Detail & Related papers (2024-07-10T06:36:45Z)
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- Point Cloud Geometry Scalable Coding with a Quality-Conditioned Latents Probability Estimator [47.792286013837945]
Quality scalability is a major requirement in most learning-based PC coding solutions.
This paper proposes a quality scalability scheme, named Scalable Quality Hyperprior (SQH), adaptable to learning-based static point cloud geometry codecs.
SQH offers the quality scalability feature with very limited or no compression performance penalty at all when compared with the corresponding non-scalable solution.
arXiv Detail & Related papers (2024-04-11T12:44:15Z)
- Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
- CBANet: Towards Complexity and Bitrate Adaptive Deep Image Compression using a Single Network [24.418215098116335]
We propose a new deep image compression framework called Complexity and Bitrate Adaptive Network (CBANet).
Our CBANet considers the trade-off between the rate and distortion under dynamic computational complexity constraints.
As a result, our CBANet enables one single network to support multiple bitrate decoding under various computational complexity constraints.
arXiv Detail & Related papers (2021-05-26T08:13:56Z)
- Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images.
We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)