Related papers: Variable-size Symmetry-based Graph Fourier Transforms for image compression

Variable-size Symmetry-based Graph Fourier Transforms for image compression

URL: http://arxiv.org/abs/2411.15824v1
Date: Sun, 24 Nov 2024 13:00:44 GMT
Title: Variable-size Symmetry-based Graph Fourier Transforms for image compression
Authors: Alessandro Gnutti, Fabrizio Guerrini, Riccardo Leonardi, Antonio Ortega,
Abstract summary: We propose a new family of Symmetry-based Graph Fourier Transforms of variable sizes into a coding framework. Our proposed algorithm generates symmetric graphs on the grid by adding specific symmetrical connections between nodes. Experiments show that SBGFTs outperform the primary transforms integrated in the explicit Multiple Transform Selection.
Score: 65.7352685872625
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modern compression systems use linear transformations in their encoding and decoding processes, with transforms providing compact signal representations. While multiple data-dependent transforms for image/video coding can adapt to diverse statistical characteristics, assembling large datasets to learn each transform is challenging. Also, the resulting transforms typically lack fast implementation, leading to significant computational costs. Thus, despite many papers proposing new transform families, the most recent compression standards predominantly use traditional separable sinusoidal transforms. This paper proposes integrating a new family of Symmetry-based Graph Fourier Transforms (SBGFTs) of variable sizes into a coding framework, focusing on the extension from our previously introduced 8x8 SBGFTs to the general case of NxN grids. SBGFTs are non-separable transforms that achieve sparse signal representation while maintaining low computational complexity thanks to their symmetry properties. Their design is based on our proposed algorithm, which generates symmetric graphs on the grid by adding specific symmetrical connections between nodes and does not require any data-dependent adaptation. Furthermore, for video intra-frame coding, we exploit the correlations between optimal graphs and prediction modes to reduce the cardinality of the transform sets, thus proposing a low-complexity framework. Experiments show that SBGFTs outperform the primary transforms integrated in the explicit Multiple Transform Selection (MTS) used in the latest VVC intra-coding, providing a bit rate saving percentage of 6.23%, with only a marginal increase in average complexity. A MATLAB implementation of the proposed algorithm is available online at [1].

Related papers

WUSH: Near-Optimal Adaptive Transforms for LLM Quantization [52.77441224845925]
Quantization to low bitwidth is a standard approach for deploying large language models.<n>A few extreme weights and activations stretch the dynamic range and reduce the effective resolution of the quantizer.<n>We derive, for the first time, closed-form optimal linear blockwise transforms for joint weight-activation quantization.
arXiv Detail & Related papers (2025-11-30T16:17:34Z)
Joint Multi-scale Gated Transformer and Prior-guided Convolutional Network for Learned Image Compression [10.916417411466846]
We propose a novel prior-guided convolution (PGConv) to improve the ability of the vanilla convolution to extract local features.<n>We also propose a novel multi-scale gated transformer (MGT) to improve the ability of the Swin-T block to extract non-local features.<n>Our results show that our MGTPCN surpasses state-of-the-art algorithms with a better trade-off between performance and complexity.
arXiv Detail & Related papers (2025-11-30T05:45:47Z)
3DGS Compression with Sparsity-guided Hierarchical Transform Coding [19.575833741231953]
Sparsity-guided Hierarchical Transform Coding (SHTC) is first end-to-end optimized transform coding framework for 3DGS compression.<n>SHTC jointly optimize the 3DGS, transforms and a lightweight context model.<n>This novel design significantly improves R-D performance with minimal additional parameters and computational overhead.
arXiv Detail & Related papers (2025-05-28T22:17:24Z)
Fast Data-independent KLT Approximations Based on Integer Functions [0.0]
The Karhunen-Loeve transform (KLT) stands as a well-established discrete transform, demonstrating optimal characteristics in data decorrelation and dimensionality reduction. This paper introduces a category of low-complexity, data-independent KLT approximations, employing a range of round-off functions. The proposed transforms perform well when compared to the exact KLT and approximations considering classical performance measures.
arXiv Detail & Related papers (2024-10-11T20:05:05Z)
Interpretable Lightweight Transformer via Unrolling of Learned Graph Smoothness Priors [16.04850782310842]
We build interpretable and lightweight transformer-like neural networks by unrolling iterative optimization algorithms. A normalized signal-dependent graph learning module amounts to a variant of the basic self-attention mechanism in conventional transformers.
arXiv Detail & Related papers (2024-06-06T14:01:28Z)
Entropy Transformer Networks: A Learning Approach via Tangent Bundle Data Manifold [8.893886200299228]
This paper focuses on an accurate and fast approach for image transformation employed in the design of CNN architectures. A novel Entropy STN (ESTN) is proposed that interpolates on the data manifold distributions. Experiments on challenging benchmarks show that the proposed ESTN can improve predictive accuracy over a range of computer vision tasks.
arXiv Detail & Related papers (2023-07-24T04:21:51Z)
B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers [97.75725574963197]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training. We show that a sequence of such transformations induces a single linear transformation that faithfully summarises the full model computations. We show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics.
arXiv Detail & Related papers (2023-06-19T12:54:28Z)
Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance and in particular the sample complexity of deep neural networks. Group-equivariant convolutions are a popular approach to obtain equivariant representations. We propose a multi-stream architecture, where each stream is invariant to a different transformation.
arXiv Detail & Related papers (2023-03-02T20:44:45Z)
B-cos Networks: Alignment is All We Need for Interpretability [136.27303006772294]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training. A B-cos transform induces a single linear transform that faithfully summarises the full model computations. We show that it can easily be integrated into common models such as VGGs, ResNets, InceptionNets, and DenseNets.
arXiv Detail & Related papers (2022-05-20T16:03:29Z)
Hybrid Model-based / Data-driven Graph Transform for Image Coding [54.31406300524195]
We present a hybrid model-based / data-driven approach to encode an intra-prediction residual block. The first $K$ eigenvectors of a transform matrix are derived from a statistical model, e.g., the asymmetric discrete sine transform (ADST) for stability. Using WebP as a baseline image, experimental results show that our hybrid graph transform achieved better energy compaction than default discrete cosine transform (DCT) and better stability than KLT.
arXiv Detail & Related papers (2022-03-02T15:36:44Z)
Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks. We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems. We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations. These transformations also use information from both within-class and across-class representations that we extract through clustering. We demonstrate that our method is comparable to current state of art for smaller datasets while being able to scale up to larger datasets.
arXiv Detail & Related papers (2020-07-16T17:55:31Z)
A Multiparametric Class of Low-complexity Transforms for Image and Video Coding [0.0]
We introduce a new class of low-complexity 8-point DCT approximations based on a series of works published by Bouguezel, Ahmed and Swamy. We show that the optimal DCT approximations present compelling results in terms of coding efficiency and image quality metrics.
arXiv Detail & Related papers (2020-06-19T21:56:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.