Related papers: SANR: Scene-Aware Neural Representation for Light Field Image Compression with Rate-Distortion Optimization

URL: http://arxiv.org/abs/2510.15775v1
Date: Fri, 17 Oct 2025 16:00:43 GMT
Title: SANR: Scene-Aware Neural Representation for Light Field Image Compression with Rate-Distortion Optimization
Authors: Gai Zhang, Xinfeng Zhang, Lv Tang, Hongyu An, Li Zhang, Qingming Huang
Abstract summary: We propose a Scene-Aware Neural Representation framework for light field image compression with end-to-end rate-distortion optimization. For scene awareness, SANR introduces a hierarchical scene modeling block that leverages multi-scale latent codes to capture intrinsic scene structures. Experimental results demonstrate that SANR significantly outperforms state-of-the-art techniques regarding rate-distortion performance with a 65.62% BD-rate saving against HEVC.
Score: 54.184486302645716
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Light field images capture multi-view scene information and play a crucial role in 3D scene reconstruction. However, their high-dimensional nature results in enormous data volumes, posing a significant challenge for efficient compression in practical storage and transmission scenarios. Although neural representation-based methods have shown promise in light field image compression, most approaches rely on direct coordinate-to-pixel mapping through implicit neural representation (INR), often neglecting the explicit modeling of scene structure. Moreover, they typically lack end-to-end rate-distortion optimization, limiting their compression efficiency. To address these limitations, we propose SANR, a Scene-Aware Neural Representation framework for light field image compression with end-to-end rate-distortion optimization. For scene awareness, SANR introduces a hierarchical scene modeling block that leverages multi-scale latent codes to capture intrinsic scene structures, thereby reducing the information gap between INR input coordinates and the target light field image. From a compression perspective, SANR is the first to incorporate entropy-constrained quantization-aware training (QAT) into neural representation-based light field image compression, enabling end-to-end rate-distortion optimization. Extensive experimental results demonstrate that SANR significantly outperforms state-of-the-art techniques regarding rate-distortion performance with a 65.62% BD-rate saving against HEVC.
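
The rate-distortion trade-off mentioned in the abstract can be illustrated with a toy sketch. This is not SANR's method: it assumes a plain uniform scalar quantizer, measures distortion directly on a list of hypothetical latent values instead of on a decoded light field, and estimates rate from the empirical entropy of the quantized indices. The function names, the step sizes, and the weight `lam` are all illustrative assumptions.

```python
import math
from collections import Counter


def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Reconstruct approximate values from the integer indices."""
    return [i * step for i in indices]


def empirical_rate_bits(indices):
    """Total bits implied by the empirical entropy of the index sequence."""
    counts = Counter(indices)
    n = len(indices)
    return -sum(c * math.log2(c / n) for c in counts.values())


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rd_cost(latents, step, lam):
    """Return (J, D, R) with J = D + lam * R and R in bits per latent value."""
    idx = quantize(latents, step)
    rec = dequantize(idx, step)
    distortion = mse(latents, rec)
    rate = empirical_rate_bits(idx) / len(latents)
    return distortion + lam * rate, distortion, rate


# Hypothetical latent values, chosen only for illustration.
latents = [0.11, 0.52, -0.37, 0.93, -0.81, 0.26, 0.48, -0.12]
for step in (0.5, 0.05):
    cost, dist, rate = rd_cost(latents, step, lam=0.01)
    print(f"step={step}: D={dist:.5f}, R={rate:.3f} bits/value, J={cost:.5f}")
```

A coarser step lowers the rate and raises the distortion, and `lam` sets how the two are weighed against each other. In a quantization-aware training setup, a cost of this shape would be minimized by gradient descent with a differentiable surrogate for the rounding step, which this sketch leaves out.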

Related papers

Deeply-Conditioned Image Compression via Self-Generated Priors [75.29511865838812]
We introduce a framework predicated on functional decomposition, which we term Deeply-Conditioned Image Compression via self-generated priors (DCIC-sgp). Our framework achieves significant BD-rate reductions of 14.4%, 15.7%, and 15.1% against the VVC test model VTM-12.1 on the Kodak, CLIC, and Tecnick datasets.
arXiv Detail & Related papers (2025-10-28T14:04:19Z)
COLI: A Hierarchical Efficient Compressor for Large Images [18.697445453003983]
Implicit Neural Representations (INRs) present a promising alternative by learning continuous mappings from spatial coordinates to pixel intensities for individual images. We introduce COLI (Compressor for Large Images), a novel framework leveraging Neural Representations for Videos (NeRV). We show that COLI consistently achieves competitive or superior PSNR and SSIM metrics at significantly reduced bits per pixel (bpp) while accelerating NeRV training by up to 4 times.
arXiv Detail & Related papers (2025-07-15T16:07:07Z)
Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion [28.61304513668606]
ResULIC is a residual-guided ultra lowrate image compression system. It incorporates residual signals into both semantic retrieval and the diffusion-based generation process. It achieves superior objective and subjective performance compared to state-of-the-art diffusion-based methods.
arXiv Detail & Related papers (2025-05-13T06:51:23Z)
Range Image-Based Implicit Neural Compression for LiDAR Point Clouds [10.143205531474907]
We focus on 2D range images (RIs) as a lightweight format for representing 3D LiDAR observations. We propose a novel implicit neural representation (INR)-based RI compression method that effectively handles floating-point valued pixels. Experiments on the KITTI dataset show that the proposed method outperforms existing image, point cloud, RI, and INR-based compression methods in terms of 3D reconstruction and detection quality.
arXiv Detail & Related papers (2025-04-24T03:41:57Z)
Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on invertible transform to overcome limitations. Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
A Rate-Distortion-Classification Approach for Lossy Image Compression [0.0]
In lossy image compression, the objective is to achieve minimal signal distortion while compressing images to a specified bit rate. To bridge the gap between image compression and visual analysis, we propose a Rate-Distortion-Classification (RDC) model for lossy image compression.
arXiv Detail & Related papers (2024-05-06T14:11:36Z)
Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity. We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss. Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
ConvNeXt-ChARM: ConvNeXt-based Transform for Efficient Neural Image Compression [18.05997169440533]
We propose ConvNeXt-ChARM, an efficient ConvNeXt-based transform coding framework, paired with a compute-efficient channel-wise auto-regressive prior. We show that ConvNeXt-ChARM brings consistent and significant BD-rate (PSNR) reductions, estimated on average at 5.24% and 1.22% over the versatile video coding (VVC) reference encoder (VTM-18.0) and the state-of-the-art learned image compression method SwinT-ChARM, respectively.
arXiv Detail & Related papers (2023-07-12T11:45:54Z)
Modality-Agnostic Variational Compression of Implicit Neural Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR). Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism. After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
arXiv Detail & Related papers (2023-01-23T15:22:42Z)
Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types. We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding. We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.