On Disentangled Training for Nonlinear Transform in Learned Image Compression
- URL: http://arxiv.org/abs/2501.13751v3
- Date: Sat, 15 Feb 2025 15:45:30 GMT
- Title: On Disentangled Training for Nonlinear Transform in Learned Image Compression
- Authors: Han Li, Shaohui Li, Wenrui Dai, Maida Cao, Nuowen Kan, Chenglin Li, Junni Zou, Hongkai Xiong
- Abstract summary: Learned image compression (LIC) has demonstrated superior rate-distortion (R-D) performance compared to traditional codecs.
Existing LIC methods overlook the slow convergence caused by compacting energy in learning nonlinear transforms.
We propose a linear auxiliary transform (AuxT) to disentangle energy compaction in training nonlinear transforms.
- Score: 59.66885464492666
- Abstract: Learned image compression (LIC) has demonstrated superior rate-distortion (R-D) performance compared to traditional codecs, but is challenged by training inefficiency that could incur more than two weeks to train a state-of-the-art model from scratch. Existing LIC methods overlook the slow convergence caused by compacting energy in learning nonlinear transforms. In this paper, we first reveal that such energy compaction consists of two components, i.e., feature decorrelation and uneven energy modulation. On such basis, we propose a linear auxiliary transform (AuxT) to disentangle energy compaction in training nonlinear transforms. The proposed AuxT obtains coarse approximation to achieve efficient energy compaction such that distribution fitting with the nonlinear transforms can be simplified to fine details. We then develop wavelet-based linear shortcuts (WLSs) for AuxT that leverages wavelet-based downsampling and orthogonal linear projection for feature decorrelation and subband-aware scaling for
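The abstract describes AuxT's wavelet-based linear shortcuts as a combination of wavelet-based downsampling, an orthogonal linear projection, and subband-aware scaling. As an illustration of those ingredients (not the paper's implementation; the `gains` values are hypothetical), a single orthonormal 2D Haar analysis step with per-subband scaling can be sketched as:

```python
import numpy as np

def haar_downsample(x):
    """One level of 2D Haar analysis: split an (H, W) image into four
    (H/2, W/2) subbands (LL, LH, HL, HH) with an orthonormal 2x2
    transform, so energy is preserved but decorrelated across subbands."""
    a = x[0::2, 0::2]
    b = x[0::2, 1::2]
    c = x[1::2, 0::2]
    d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def wavelet_shortcut(x, gains=(1.0, 0.5, 0.5, 0.25)):
    """Hypothetical subband-aware scaling: decorrelate with the Haar
    transform, then rescale each subband by a fixed gain so the shortcut
    output has an uneven, compaction-like energy profile."""
    return [g * s for g, s in zip(gains, haar_downsample(x))]

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
ll, lh, hl, hh = haar_downsample(img)
# Orthonormality: total energy is preserved by the analysis step.
assert np.isclose((img**2).sum(),
                  sum((s**2).sum() for s in (ll, lh, hl, hh)))
```

Because the transform is orthonormal, the decorrelation step is energy-preserving; the uneven energy modulation mentioned in the abstract then comes only from the subband scaling.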
Related papers
- LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation [96.54620463472526]
Linear Diffusion Transformer (LiT) is an efficient text-to-image Transformer that can be deployed offline on a laptop.
LiT achieves highly competitive FID while reducing training steps by 80% and 77% compared to DiT.
For text-to-image generation, LiT allows for the rapid synthesis of up to 1K resolution photorealistic images.
arXiv Detail & Related papers (2025-01-22T16:02:06Z)
- Unconventional Computing based on Four Wave Mixing in Highly Nonlinear Waveguides [0.0]
We numerically analyze a photonic unconventional accelerator based on the four-wave mixing effect in highly nonlinear waveguides.
By exploiting the rich Kerr-induced nonlinearities, multiple nonlinear transformations of an input signal can be generated and used for solving complex nonlinear tasks.
arXiv Detail & Related papers (2024-02-14T12:34:38Z)
- Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond [84.56978780892783]
We propose CoupledTPS, which iteratively couples multiple TPS with limited control points into a more flexible and powerful transformation.
In light of the laborious annotation cost, we develop a semi-supervised learning scheme to improve warping quality by exploiting unlabeled data.
Experiments demonstrate the superiority and universality of CoupledTPS over the existing state-of-the-art solutions for rotation correction.
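As background for the entry above: CoupledTPS iteratively couples multiple thin-plate splines, each of which maps a set of control points exactly while minimizing bending energy. A minimal numpy sketch of a single 2D TPS fit (standard formulation, not the paper's coupled scheme; the toy points are illustrative):

```python
import numpy as np

def tps_kernel(r):
    # TPS radial basis phi(r) = r^2 log r, with phi(0) = 0 by continuity.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0, r**2 * np.log(r), 0.0)

def fit_tps(src, dst):
    """Fit a 2D thin-plate spline mapping src control points to dst.
    Solves the standard block system [[K, P], [P^T, 0]] [w; a] = [dst; 0]."""
    n = len(src)
    r = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1)
    K = tps_kernel(r)
    P = np.hstack([np.ones((n, 1)), src])       # affine part: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n:]                      # radial weights w, affine a

def tps_map(pts, src, w, a):
    U = tps_kernel(np.linalg.norm(pts[:, None, :] - src[None, :, :], axis=-1))
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return U @ w + P @ a

src = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.], [0.5, 0.5]])
dst = src.copy()
dst[4] += [0.1, 0.0]                             # displace only the center
w, a = fit_tps(src, dst)
# The spline interpolates the control points exactly.
assert np.allclose(tps_map(src, src, w, a), dst, atol=1e-8)
```

With few control points a single TPS is globally smooth but not very expressive, which is the limitation the coupled, iterative formulation targets.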
arXiv Detail & Related papers (2024-01-24T13:03:28Z)
- Gradient Descent Provably Solves Nonlinear Tomographic Reconstruction [60.95625458395291]
In computed tomography (CT) the forward model consists of a linear transform followed by an exponential nonlinearity based on the attenuation of light according to the Beer-Lambert Law.
We show that this approach reduces metal artifacts compared to a commercial reconstruction of a human skull with metal crowns.
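The forward model described in this entry, a linear projection followed by the Beer-Lambert exponential, can be sketched in a few lines (toy system matrix and attenuation values are illustrative):

```python
import numpy as np

def ct_forward(x, A, I0=1.0):
    """Nonlinear CT forward model: linear line-integral projection A @ x
    followed by Beer-Lambert attenuation, y = I0 * exp(-A @ x)."""
    return I0 * np.exp(-A @ x)

# Toy example: 2 attenuation coefficients, 3 rays with known path lengths.
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])       # path length of each ray through each voxel
x = np.array([0.5, 0.2])         # attenuation coefficients
y = ct_forward(x, A)

# Taking -log of the measurements recovers the linear sinogram, the usual
# preprocessing step before applying a linear reconstruction method.
sinogram = -np.log(y)
assert np.allclose(sinogram, A @ x)
```

Working directly with the nonlinear model (rather than log-linearizing) is what makes the gradient-descent analysis in the paper nontrivial.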
arXiv Detail & Related papers (2023-10-06T00:47:57Z)
- Tangent Transformers for Composition, Privacy and Removal [58.280295030852194]
Tangent Attention Fine-Tuning (TAFT) is a method for fine-tuning linearized transformers.
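Linearized (tangent) fine-tuning replaces the network with its first-order Taylor expansion around the pretrained weights, which is linear in the weight updates. A minimal numpy sketch on a toy two-layer net (hypothetical illustration, not the paper's transformer-specific method):

```python
import numpy as np

# Tiny net f(x; W, V) = V @ tanh(W @ x), "pretrained" at (W0, V0).
rng = np.random.default_rng(0)
W0 = 0.5 * rng.standard_normal((4, 3))
V0 = 0.5 * rng.standard_normal((2, 4))

def f(x, W, V):
    return V @ np.tanh(W @ x)

def f_lin(x, dW, dV):
    """First-order (tangent) model around (W0, V0):
    f(x; W0+dW, V0+dV) ~ f(x; W0, V0) + J_W . dW + J_V . dV,
    which is linear in the updates (dW, dV)."""
    h = np.tanh(W0 @ x)
    jv = dV @ h                           # directional derivative w.r.t. V
    jw = V0 @ ((1 - h**2) * (dW @ x))     # chain rule through tanh for W
    return f(x, W0, V0) + jv + jw

x = rng.standard_normal(3)
dW = 1e-4 * rng.standard_normal((4, 3))
dV = 1e-4 * rng.standard_normal((2, 4))
# The tangent model matches a small perturbation of the net to first order.
assert np.allclose(f_lin(x, dW, dV), f(x, W0 + dW, V0 + dV), atol=1e-6)
```

Because the tangent model is linear in the updates, fine-tuning it is a convex problem, which is what enables the composition and removal properties the paper studies.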
arXiv Detail & Related papers (2023-07-16T18:31:25Z)
- Application of Transformers for Nonlinear Channel Compensation in Optical Systems [0.23499129784547654]
We introduce a new nonlinear optical channel equalizer based on Transformers.
By leveraging parallel computation and attending directly to the memory across a sequence of symbols, we show that Transformers can be used effectively for nonlinear compensation.
arXiv Detail & Related papers (2023-04-25T19:48:54Z)
- LLIC: Large Receptive Field Transform Coding with Adaptive Weights for Learned Image Compression [27.02281402358164]
We propose Large Receptive Field Transform Coding with Adaptive Weights for Learned Image Compression.
We introduce a few large-kernel depth-wise convolutions to remove more redundancy while keeping complexity modest.
Our LLIC models achieve state-of-the-art performance and a better trade-off between performance and complexity.
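The efficiency argument behind large-kernel depth-wise convolution is that each channel is filtered independently, so a k x k kernel costs O(C k^2) parameters instead of the O(C^2 k^2) of a dense convolution. A naive numpy sketch of the operation (illustrative, not the LLIC architecture):

```python
import numpy as np

def depthwise_conv2d(x, kernels):
    """Depthwise convolution with 'same' zero padding: channel c of the
    (C, H, W) input is filtered only by kernels[c] of shape (k, k)."""
    C, H, W = x.shape
    k = kernels.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i+k, j:j+k] * kernels[c])
    return out

rng = np.random.default_rng(0)
C, k = 2, 7                        # a "large" 7x7 kernel per channel
x = rng.standard_normal((C, 5, 5))
kernels = np.zeros((C, k, k))
kernels[:, k // 2, k // 2] = 1.0   # delta kernel acts as the identity
assert np.allclose(depthwise_conv2d(x, kernels), x)
```

The per-channel structure is what lets the receptive field grow (large k) without the quadratic-in-channels cost of a standard convolution.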
arXiv Detail & Related papers (2023-04-19T11:19:10Z)
- Accelerated MRI With Deep Linear Convolutional Transform Learning [7.927206441149002]
Recent studies show that deep learning based MRI reconstruction outperforms conventional methods in multiple applications.
In this work, we combine ideas from CS, TL and DL reconstructions to learn deep linear convolutional transforms.
Our results show that the proposed technique can reconstruct MR images to a level comparable to DL methods, while supporting uniform undersampling patterns.
arXiv Detail & Related papers (2022-04-17T04:47:32Z)
- Nonlinear Transform Induced Tensor Nuclear Norm for Tensor Completion [12.788874164701785]
We propose a low-rank tensor completion (LRTC) model built on the nonlinear transform induced tensor nuclear norm (NTTNN), solved by a proximal alternating minimization (PAM) algorithm with theoretical convergence guarantees.
Our method outperforms state-of-the-art linear transform-based tensor nuclear norm (TNN) methods both qualitatively and quantitatively.
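The baseline this work generalizes is the tensor nuclear norm under a linear transform along the third mode, typically the DFT: take the FFT of the tensor along mode 3, then sum the matrix nuclear norms of the frontal slices. A minimal numpy sketch (the 1/n3 scaling is one common t-SVD convention; others differ):

```python
import numpy as np

def tensor_nuclear_norm(T):
    """Linear-transform TNN: DFT along the third mode, then the sum of
    the nuclear norms of the frontal slices, scaled by 1/n3."""
    n3 = T.shape[2]
    Tf = np.fft.fft(T, axis=2)
    return sum(np.linalg.norm(Tf[:, :, k], ord="nuc") for k in range(n3)) / n3

rng = np.random.default_rng(0)
# A "tube-rank-1" tensor: one rank-1 slice replicated along mode 3.
u, v = rng.standard_normal(4), rng.standard_normal(5)
T = np.repeat(np.outer(u, v)[:, :, None], 3, axis=2)
# For this tensor the TNN reduces to ||u|| * ||v||, the matrix case.
assert np.isclose(tensor_nuclear_norm(T), np.linalg.norm(u) * np.linalg.norm(v))
```

NTTNN replaces the linear (FFT) transform in this construction with a nonlinear one, which is what the completion model above optimizes over.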
arXiv Detail & Related papers (2021-10-17T09:25:37Z)
- Hot-spots and gain enhancement in a doubly pumped parametric down-conversion process [62.997667081978825]
We experimentally investigate the parametric down-conversion process in a nonlinear bulk crystal, driven by two non-collinear pump modes.
The experiment shows the emergence of bright hot-spots in modes shared by the two pumps, in analogy with the phenomenology recently observed in 2D nonlinear photonic crystals.
arXiv Detail & Related papers (2020-07-24T09:39:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.