Related papers: GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

URL: http://arxiv.org/abs/2403.08551v5
Date: Tue, 9 Jul 2024 15:48:32 GMT
Title: GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting
Authors: Xinjie Zhang, Xingtong Ge, Tongda Xu, Dailan He, Yan Wang, Hongwei Qin, Guo Lu, Jing Geng, Jun Zhang,
Abstract summary: Implicit neural representations (INRs) recently achieved great success in image representation and compression. However, this requirement often hinders their use on low-end devices with limited memory. We propose a groundbreaking paradigm of image representation and compression by 2D Gaussian Splatting, named GaussianImage.
Score: 27.33121386538575
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Implicit neural representations (INRs) recently achieved great success in image representation and compression, offering high visual quality and fast rendering speeds with 10-1000 FPS, assuming sufficient GPU resources are available. However, this requirement often hinders their use on low-end devices with limited memory. In response, we propose a groundbreaking paradigm of image representation and compression by 2D Gaussian Splatting, named GaussianImage. We first introduce 2D Gaussian to represent the image, where each Gaussian has 8 parameters including position, covariance and color. Subsequently, we unveil a novel rendering algorithm based on accumulated summation. Remarkably, our method with a minimum of 3$\times$ lower GPU memory usage and 5$\times$ faster fitting time not only rivals INRs (e.g., WIRE, I-NGP) in representation performance, but also delivers a faster rendering speed of 1500-2000 FPS regardless of parameter size. Furthermore, we integrate existing vector quantization technique to build an image codec. Experimental results demonstrate that our codec attains rate-distortion performance comparable to compression-based INRs such as COIN and COIN++, while facilitating decoding speeds of approximately 2000 FPS. Additionally, preliminary proof of concept shows that our codec surpasses COIN and COIN++ in performance when using partial bits-back coding. Code is available at https://github.com/Xinjie-Q/GaussianImage.

Related papers

ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization [59.481950697968706]
We propose Progressive Generative Image Compression (ProGIC), a compact built on residual vector quantization (RVQ)<n>In RVQ, a sequence of vector quantizers encodes the residuals stage by stage, each with its own codebook.<n>We pair this with a lightweight backbone based on depthwise-separable convolutions and small attention blocks, enabling practical deployment on both GPU and CPU-only devices.
arXiv Detail & Related papers (2026-03-03T11:47:05Z)
GaussianImage++: Boosted Image Representation and Compression with 2D Gaussian Splatting [29.353296699942117]
Implicit neural representations (INRs) have achieved remarkable success in image representation and compression.<n>Recent 2D Gaussian Splatting (GS) methods offer promising alternatives through efficient primitive-based rendering.<n>We present GaussianImage++, which utilizes limited Gaussian primitives to achieve impressive representation and compression performance.
arXiv Detail & Related papers (2025-12-22T07:22:39Z)
ExGS: Extreme 3D Gaussian Compression with Diffusion Priors [60.7245825868903]
We introduce ExGS and GaussPainter for Extreme 3DGS compression.<n>GassPainter fills in missing regions and enhances visible pixels, yielding substantial improvements in degraded renderings.<n>Our framework can even achieve over 100X compression (reducing a typical 354.77 MB model to about 3.31 MB)
arXiv Detail & Related papers (2025-09-29T13:23:06Z)
Compressing 3D Gaussian Splatting by Noise-Substituted Vector Quantization [14.71160140310766]
3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness in 3D reconstruction, achieving high-quality results with real-time radiance field rendering. However, a key challenge is the substantial storage cost: reconstructing a single scene typically requires millions of Gaussian splats, each represented by 59 floating-point parameters, resulting in approximately 1 GB of memory. We propose a compression method by building separate attribute codebooks and storing only discrete code indices. Specifically, we employ noise-substituted vector quantization technique to jointly train the codebooks and model features, ensuring consistency between descent gradient optimization and parameter discretization
arXiv Detail & Related papers (2025-04-03T22:19:34Z)
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering [60.676919690136096]
We present textbf4DGS-1K, which runs at over 1000 FPS on modern scene GPUs. For Q1, we introduce the Spatial-Temporal Variation Score, a new pruning criterion that effectively removes short-lifespan Gaussians. For Q2, we store a mask for active Gaussians across consecutive frames, significantly reducing redundant computations in rendering.
arXiv Detail & Related papers (2025-03-20T17:59:44Z)
GaussianVideo: Efficient Video Representation and Compression by Gaussian Splatting [10.568851068989973]
Implicit Neural Representation for Videos (NeRV) has introduced a novel paradigm for video representation and compression. We propose a new video representation and method based on 2D Gaussian Splatting to efficiently handle data handle. Our method reduces memory usage by up to 78.4%, and significantly expedites video processing, achieving 5.5x faster training and 12.5x faster decoding.
arXiv Detail & Related papers (2025-03-06T11:31:08Z)
GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting [64.84383010238908]
We propose an effective image tokenizer with 2D Gaussian Splatting as a solution. In general, our framework integrates the local influence of 2D Gaussian distribution into the discrete space. Competitive reconstruction performances on CIFAR, Mini-Net, and ImageNet-1K demonstrate the effectiveness of our framework.
arXiv Detail & Related papers (2025-01-26T17:56:11Z)
REDUCIO! Generating 1024$\times$1024 Video within 16 Seconds using Extremely Compressed Motion Latents [110.41795676048835]
One crucial obstacle for large-scale applications is the expensive training and inference cost. In this paper, we argue that videos contain much more redundant information than images, thus can be encoded by very few motion latents. We train Reducio-DiT in around 3.2K training hours in total and generate a 16-frame 1024*1024 video clip within 15.5 seconds on a single A100 GPU.
arXiv Detail & Related papers (2024-11-20T18:59:52Z)
Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields [13.729716867839509]
We propose a learnable mask strategy that significantly reduces the number of Gaussians while preserving high performance. In addition, we propose a compact but effective representation of view-dependent color by employing a grid-based neural field. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
arXiv Detail & Related papers (2024-08-07T14:56:34Z)
Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation. Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack. General efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors. We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
arXiv Detail & Related papers (2024-07-02T00:45:21Z)
Splatter Image: Ultra-Fast Single-View 3D Reconstruction [67.96212093828179]
Splatter Image is based on Gaussian Splatting, which allows fast and high-quality reconstruction of 3D scenes from multiple images. We learn a neural network that, at test time, performs reconstruction in a feed-forward manner, at 38 FPS. On several synthetic, real, multi-category and large-scale benchmark datasets, we achieve better results in terms of PSNR, LPIPS, and other metrics while training and evaluating much faster than prior works.
arXiv Detail & Related papers (2023-12-20T16:14:58Z)
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS [55.85673901231235]
We introduce LightGaussian, a method for transforming 3D Gaussians into a more compact format. Inspired by Network Pruning, LightGaussian identifies Gaussians with minimal global significance on scene reconstruction. LightGaussian achieves an average 15x compression rate while boosting FPS from 144 to 237 within the 3D-GS framework.
arXiv Detail & Related papers (2023-11-28T21:39:20Z)
Compact 3D Gaussian Representation for Radiance Field [14.729871192785696]
We propose a learnable mask strategy to reduce the number of 3D Gaussian points without sacrificing performance. We also propose a compact but effective representation of view-dependent color by employing a grid-based neural field. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
arXiv Detail & Related papers (2023-11-22T20:31:16Z)
Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis [0.552480439325792]
High-fidelity scene reconstruction with an optimized 3D Gaussian splat representation has been introduced for novel view synthesis from sparse image sets. We propose a compressed 3D Gaussian splat representation that utilizes sensitivity-aware vector clustering with quantization-aware training to compress directional colors and Gaussian parameters.
arXiv Detail & Related papers (2023-11-17T14:40:43Z)
CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying [52.91778151771145]
In this paper, we try to break the limitations for the first time thanks to the recent development of continuous implicit representation. Experiments show that the proposed method achieves real-time performance on the 2048$times$2048 images using a single GTX 2080 Ti GPU.
arXiv Detail & Related papers (2023-03-15T11:13:51Z)
PILC: Practical Image Lossless Compression with an End-to-end GPU Oriented Neural Framework [88.18310777246735]
We propose an end-to-end image compression framework that achieves 200 MB/s for both compression and decompression with a single NVIDIA Tesla V100 GPU. Experiments show that our framework compresses better than PNG by a margin of 30% in multiple datasets.
arXiv Detail & Related papers (2022-06-10T03:00:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.