Low-complexity Multidimensional DCT Approximations
- URL: http://arxiv.org/abs/2306.11724v1
- Date: Tue, 20 Jun 2023 17:55:48 GMT
- Title: Low-complexity Multidimensional DCT Approximations
- Authors: V. A. Coutinho, R. J. Cintra, F. M. Bayer
- Abstract summary: Several multiplierless $8\times 8\times 8$ approximate methods are proposed and the computational complexity is discussed for the general multidimensional case.
The proposed approximations were embedded into a 3D DCT-based video coding scheme and a modified quantization step was introduced.
The simulation results showed that the approximate 3D DCT coding methods offer almost identical output visual quality when compared with the exact 3D DCT scheme.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce low-complexity multidimensional discrete cosine
transform (DCT) approximations. Three-dimensional DCT (3D DCT) approximations
are formalized in terms of high-order tensor theory. The formulation is
extended to higher dimensions with arbitrary lengths. Several multiplierless
$8\times 8\times 8$ approximate methods are proposed and the computational
complexity is discussed for the general multidimensional case. The complexity
cost of the proposed methods was assessed, showing considerably fewer arithmetic
operations than the exact 3D DCT. The proposed approximations
were embedded into a 3D DCT-based video coding scheme and a modified quantization
step was introduced. The simulation results showed that the approximate 3D DCT
coding methods offer almost identical output visual quality when compared with
the exact 3D DCT scheme. The proposed 3D approximations were also employed as a
tool for visual tracking. The proposed approximate 3D DCT-based system performs
similarly to the original exact 3D DCT-based method. In general, the suggested
methods showed competitive performance at a considerably lower computational
cost.
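As a rough illustration of the idea in the abstract, the sketch below applies a multiplierless 8-point matrix along each of the three axes of an $8\times 8\times 8$ block (a separable, mode-by-mode product). The matrix used here is an assumption: it is the rounded-DCT construction obtained by rounding twice the exact DCT entries, which may or may not be one of the approximations proposed in the paper. All function names are hypothetical. The diagonal scaling is kept as a separate final step; in coding applications such scaling is commonly absorbed into quantization, though the abstract does not say how the paper's modified quantization step is defined.

```python
import math

N = 8  # block length along each of the three axes


def exact_dct_matrix():
    """Orthonormal 8-point DCT-II matrix: C[k][i] = a_k cos((2i+1) k pi / 16)."""
    rows = []
    for k in range(N):
        a = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        rows.append([a * math.cos((2 * i + 1) * k * math.pi / (2 * N))
                     for i in range(N)])
    return rows


def rounded_dct_matrix():
    """Assumed low-complexity matrix T = round(2C); entries lie in {-1, 0, 1}."""
    return [[int(round(2.0 * c)) for c in row] for row in exact_dct_matrix()]


def row_scales(T):
    """Per-row factors d_k = 1/||row k||, so that diag(d) T has unit-norm rows."""
    return [1.0 / math.sqrt(sum(t * t for t in row)) for row in T]


def mode_product(X, M, mode):
    """Apply the 8x8 matrix M along axis `mode` (0, 1 or 2) of the 3D block X."""
    Y = [[[0.0] * N for _ in range(N)] for _ in range(N)]
    for a in range(N):
        for b in range(N):
            for c in range(N):
                idx = [a, b, c]
                k = idx[mode]          # output index along the transformed axis
                s = 0.0
                for i in range(N):
                    idx[mode] = i      # sweep the input index along that axis
                    s += M[k][i] * X[idx[0]][idx[1]][idx[2]]
                Y[a][b][c] = s
    return Y


def scale3(Y, d):
    """Multiply entry (a, b, c) by d[a] * d[b] * d[c]."""
    return [[[d[a] * d[b] * d[c] * Y[a][b][c] for c in range(N)]
             for b in range(N)] for a in range(N)]


def approx_dct3(X):
    """Forward approximate 3D DCT: T along each axis, then diagonal scaling."""
    T = rounded_dct_matrix()
    Y = X
    for mode in range(3):
        Y = mode_product(Y, T, mode)   # entries of T are 0 or +/-1
    return scale3(Y, row_scales(T))


def approx_idct3(Y):
    """Inverse: diagonal scaling, then the transpose of T along each axis."""
    T = rounded_dct_matrix()
    Tt = [list(col) for col in zip(*T)]
    Z = scale3(Y, row_scales(T))
    for mode in range(3):
        Z = mode_product(Z, Tt, mode)
    return Z
```

Because the rows of this particular matrix are mutually orthogonal, the scaled transform is orthogonal and `approx_idct3(approx_dct3(X))` recovers `X` up to floating-point error. The generic `mode_product` still performs multiplications by 0 and ±1 for clarity; an actual low-complexity implementation would replace them with a fixed network of additions and subtractions.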
Related papers
- Extensions on low-complexity DCT approximations for larger blocklengths based on minimal angle similarity [0.0]
The discrete cosine transform (DCT) is a central tool for image and video coding because it can be related to the Karhunen-Loeve transform (KLT).
We introduce 16-, 32-, and 64-point low-complexity DCT approximations by minimizing individually the angle between the rows of the exact DCT matrix and the matrix induced by the approximate transforms.
Fast algorithms were also developed for the low-complexity transforms, ensuring a good balance between performance and computational cost.
arXiv Detail & Related papers (2024-10-20T01:20:35Z)
- 3D Photon Counting CT Image Super-Resolution Using Conditional Diffusion Model [6.75361442343724]
This study aims to improve photon counting CT (PCCT) image resolution using denoising diffusion probabilistic models (DDPM).
We first leverage CatSim to simulate realistic lower resolution PCCT images from high-resolution CT scans.
Since maximizing DDPM performance is time-consuming for both inference and training, we explore both 2D and 3D networks for conditional DDPM.
arXiv Detail & Related papers (2024-08-22T02:25:21Z)
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach can yield more accurate synthesis compared to recent state-of-the-art, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- Integer Optimization of CT Trajectories using a Discrete Data Completeness Formulation [3.924235219960689]
X-ray computed tomography plays a key role in digitizing three-dimensional structures for a wide range of medical and industrial applications.
Traditional CT systems often rely on standard circular and helical scan trajectories, which may not be optimal for challenging scenarios involving large objects, complex structures, or resource constraints.
We are exploring the potential of twin robotic CT systems, which offer the flexibility to acquire projections from arbitrary views around the object of interest.
arXiv Detail & Related papers (2024-01-29T10:38:58Z)
- Low-Complexity Loeffler DCT Approximations for Image and Video Coding [0.0]
This paper introduces a matrix parametrization method based on the Loeffler discrete cosine transform (DCT) algorithm.
A new class of eight-point DCT approximations was proposed, capable of unifying the mathematical formalism of several eight-point DCT approximations.
arXiv Detail & Related papers (2022-07-29T03:56:18Z)
- Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels.
Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
arXiv Detail & Related papers (2021-11-08T05:29:35Z)
- Asymmetric 3D Context Fusion for Universal Lesion Detection [55.61873234187917]
3D networks are strong in 3D context yet lack supervised pretraining.
Existing 3D context fusion operators are designed to be spatially symmetric, performing identical operations on each 2D slice like convolutions.
We propose a novel asymmetric 3D context fusion operator (A3D), which uses different weights to fuse 3D context from different 2D slices.
arXiv Detail & Related papers (2021-09-17T16:25:10Z)
- Improving 3D Object Detection with Channel-wise Transformer [58.668922561622466]
We propose a two-stage 3D object detection framework (CT3D) with minimal hand-crafted design.
CT3D simultaneously performs proposal-aware embedding and channel-wise context aggregation.
It achieves the AP of 81.77% in the moderate car category on the KITTI test 3D detection benchmark.
arXiv Detail & Related papers (2021-08-23T02:03:40Z)
- Revisiting 3D Context Modeling with Supervised Pre-training for Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.