Related papers: FLaTEC: Frequency-Disentangled Latent Triplanes for Efficient Compression of LiDAR Point Clouds

URL: http://arxiv.org/abs/2511.20065v1
Date: Tue, 25 Nov 2025 08:37:49 GMT
Title: FLaTEC: Frequency-Disentangled Latent Triplanes for Efficient Compression of LiDAR Point Clouds
Authors: Xiaoge Zhang, Zijie Wu, Mingtao Feng, Zichen Geng, Mehwish Nasim, Saeed Anwar, Ajmal Mian
Abstract summary: FLaTEC is a frequency-aware compression model that enables the compression of a full scan with high compression ratios. We convert voxelized embeddings into triplane representations to reduce sparsity, computational cost, and storage requirements. Our method achieves state-of-the-art rate-distortion performance and outperforms the standard codecs by 78% and 94% in BD-rate on the SemanticKITTI and Ford datasets.
Score: 52.997038111673966
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Point cloud compression methods jointly optimize bitrates and reconstruction distortion. However, balancing compression ratio and reconstruction quality is difficult because low-frequency and high-frequency components contribute differently at the same resolution. To address this, we propose FLaTEC, a frequency-aware compression model that enables the compression of a full scan with high compression ratios. Our approach introduces a frequency-aware mechanism that decouples low-frequency structures and high-frequency textures, while hybridizing latent triplanes as a compact proxy for the point cloud. Specifically, we convert voxelized embeddings into triplane representations to reduce sparsity, computational cost, and storage requirements. We then devise a frequency-disentangling technique that extracts compact low-frequency content while collecting high-frequency details across scales. The decoupled low-frequency and high-frequency components are stored in binary format. During decoding, full-spectrum signals are progressively recovered via a modulation block. Additionally, to compensate for the loss of 3D correlation, we introduce an efficient frequency-based attention mechanism that fosters local connectivity and outputs arbitrary resolution points. Our method achieves state-of-the-art rate-distortion performance and outperforms the standard codecs by 78% and 94% in BD-rate on both SemanticKITTI and Ford datasets.
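The two ideas named in the abstract, collapsing a voxel grid onto three axis-aligned planes and separating each plane into low- and high-frequency parts, can be illustrated with a toy sketch. This is not the FLaTEC implementation: the paper works on learned embeddings, while the sketch below assumes a dense scalar grid, mean pooling as the projection, and a 3x3 box blur as the low-pass filter. The names triplane_project and split_frequencies are hypothetical.

```python
from typing import List, Tuple

Grid3D = List[List[List[float]]]
Plane = List[List[float]]


def triplane_project(vox: Grid3D) -> Tuple[Plane, Plane, Plane]:
    """Mean-pool a dense X*Y*Z grid along one axis at a time (assumed projection)."""
    X, Y, Z = len(vox), len(vox[0]), len(vox[0][0])
    # XY plane: average over z; XZ plane: average over y; YZ plane: average over x.
    xy = [[sum(vox[x][y][z] for z in range(Z)) / Z for y in range(Y)] for x in range(X)]
    xz = [[sum(vox[x][y][z] for y in range(Y)) / Y for z in range(Z)] for x in range(X)]
    yz = [[sum(vox[x][y][z] for x in range(X)) / X for z in range(Z)] for y in range(Y)]
    return xy, xz, yz


def split_frequencies(plane: Plane) -> Tuple[Plane, Plane]:
    """Return (low, high): low is a 3x3 box blur, high is the residual plane - low."""
    H, W = len(plane), len(plane[0])
    low: Plane = []
    for i in range(H):
        row = []
        for j in range(W):
            # The averaging window is truncated at the plane borders.
            vals = [plane[a][b]
                    for a in range(max(0, i - 1), min(H, i + 2))
                    for b in range(max(0, j - 1), min(W, j + 2))]
            row.append(sum(vals) / len(vals))
        low.append(row)
    high = [[plane[i][j] - low[i][j] for j in range(W)] for i in range(H)]
    return low, high


# Usage: a 4x4x4 grid with a single occupied voxel.
vox = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
vox[1][2][3] = 1.0
xy, xz, yz = triplane_project(vox)
low, high = split_frequencies(xy)
# Adding the two bands back together recovers the plane exactly.
recon = [[low[i][j] + high[i][j] for j in range(4)] for i in range(4)]
```

Three N x N planes hold 3N^2 values instead of the N^3 of the full grid, which is the storage argument behind triplanes. The price is that pooling discards information along the collapsed axis, which the abstract refers to as the loss of 3D correlation.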

Related papers

S-PRESSO: Ultra Low Bitrate Sound Effect Compression With Diffusion Autoencoders And Offline Quantization [24.710418261668888]
We present S-PRESSO, a 48kHz sound effect compression model that produces both continuous and discrete embeddings at ultra-low bitrates. Our model relies on a pretrained latent diffusion model to decode compressed audio embeddings learned by a latent encoder. We demonstrate that S-PRESSO outperforms both continuous and discrete baselines in audio quality, acoustic similarity and reconstruction metrics.
arXiv Detail & Related papers (2026-02-16T10:28:38Z)
KV-Latent: Dimensional-level KV Cache Reduction with Frequency-aware Rotary Positional Embedding [72.12756830560217]
Large language models (LLMs) based on Transformer Decoders have become the preferred choice for conversational generative AI. Despite the overall superiority of the Decoder architecture, the gradually increasing Key-Value cache during inference has emerged as a primary efficiency bottleneck. By down-sampling the Key-Value vector dimensions into a latent space, we can significantly reduce the KV Cache footprint and improve inference speed.
arXiv Detail & Related papers (2025-07-15T12:52:12Z)
Single-step Diffusion for Image Compression at Ultra-Low Bitrates [19.76457078979179]
We propose a single-step diffusion model for image compression that delivers high perceptual quality and fast decoding at ultra-low bitrates. Our approach incorporates two key innovations: (i) Vector-Quantized Residual (VQ-Residual) training, which factorizes a structural base code and a learned residual in latent space. Ours achieves comparable compression performance to state-of-the-art methods while improving decoding speed by about 50x.
arXiv Detail & Related papers (2025-06-19T19:53:27Z)
Freqformer: Image-Demoiréing Transformer via Efficient Frequency Decomposition [83.40450475728792]
We present Freqformer, a Transformer-based framework specifically designed for image demoiréing through targeted frequency separation. Our method performs an effective frequency decomposition that explicitly splits moiré patterns into high-frequency spatially-localized textures and low-frequency scale-robust color distortions. Experiments on various demoiréing benchmarks demonstrate that Freqformer achieves state-of-the-art performance with a compact model size.
arXiv Detail & Related papers (2025-05-25T12:23:10Z)
Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on invertible transform to overcome limitations. Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
High-Frequency Enhanced Hybrid Neural Representation for Video Compression [32.38933743785333]
This paper introduces a High-Frequency Enhanced Hybrid Neural Representation Network. Our method focuses on leveraging high-frequency information to improve the synthesis of fine details by the network. Experiments on the Bunny and UVG datasets demonstrate that our method outperforms other methods.
arXiv Detail & Related papers (2024-11-11T03:04:46Z)
Fast Feedforward 3D Gaussian Splatting Compression [55.149325473447384]
FCGS is an optimization-free model that can compress 3D Gaussian Splatting (3DGS) representations rapidly in a single feed-forward pass. FCGS achieves a compression ratio of over 20X while maintaining fidelity, surpassing most per-scene SOTA optimization-based methods.
arXiv Detail & Related papers (2024-10-10T15:13:08Z)
High Frequency Matters: Uncertainty Guided Image Compression with Wavelet Diffusion [4.76749587454871]
We propose an efficient Uncertainty-Guided image compression approach with wavelet Diffusion (UGDiff). Our approach focuses on high frequency compression via the wavelet transform, since high frequency components are crucial for reconstructing image details. Comprehensive experiments on two benchmark datasets validate the effectiveness of UGDiff.
arXiv Detail & Related papers (2024-07-17T13:21:31Z)
Lossy Compression with Gaussian Diffusion [28.930398810600504]
We describe a novel lossy compression approach called DiffC which is based on unconditional diffusion generative models. We implement a proof of concept and find that it works surprisingly well despite the lack of an encoder transform. We show that a flow-based reconstruction achieves a 3 dB gain over ancestral sampling at high bitrates.
arXiv Detail & Related papers (2022-06-17T16:46:31Z)
Inception Transformer [151.939077819196]
Inception Transformer, or iFormer, learns comprehensive features with both high- and low-frequency information in visual data. We benchmark the iFormer on a series of vision tasks, and showcase that it achieves impressive performance on image classification, COCO detection and ADE20K segmentation.
arXiv Detail & Related papers (2022-05-25T17:59:54Z)
LC-FDNet: Learned Lossless Image Compression with Frequency Decomposition Network [14.848279912686948]
Recent learning-based image compression methods do not consider the performance drop in the high-frequency region. We propose a new method that performs the encoding in a coarse-to-fine manner to separate and process low- and high-frequency regions differently. Experiments show that the proposed method achieves state-of-the-art performance for benchmark high-resolution datasets.
arXiv Detail & Related papers (2021-12-13T04:49:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.