FFCA-Net: Stereo Image Compression via Fast Cascade Alignment of Side
Information
- URL: http://arxiv.org/abs/2312.16963v2
- Date: Fri, 29 Dec 2023 08:51:34 GMT
- Title: FFCA-Net: Stereo Image Compression via Fast Cascade Alignment of Side
Information
- Authors: Yichong Xia, Yujun Huang, Bin Chen, Haoqian Wang, Yaowei Wang
- Abstract summary: Multi-view compression technology, especially Stereo Image Compression (SIC), plays a crucial role in car-mounted cameras and 3D-related applications.
We propose a Feature-based Fast Cascade Alignment network (FFCA-Net) to fully leverage the side information on the decoder.
Our approach achieves a 3- to 10-fold faster decoding speed than other methods.
- Score: 44.88123177525665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-view compression technology, especially Stereo Image Compression (SIC),
plays a crucial role in car-mounted cameras and 3D-related applications.
Interestingly, the Distributed Source Coding (DSC) theory suggests that
efficient data compression of correlated sources can be achieved through
independent encoding and joint decoding. This motivates the rapidly developed
deep-distributed SIC methods in recent years. However, these approaches neglect
the unique characteristics of stereo-imaging tasks and incur high decoding
latency. To address this limitation, we propose a Feature-based Fast Cascade
Alignment network (FFCA-Net) to fully leverage the side information on the
decoder. FFCA adopts a coarse-to-fine cascaded alignment approach. In the
initial stage, FFCA utilizes a feature domain patch-matching module based on
stereo priors. This module reduces redundancy in the search space of trivial
matching methods and further mitigates the introduction of noise. In the
subsequent stage, we utilize an hourglass-based sparse stereo refinement
network to further align inter-image features with a reduced computational
cost. Furthermore, we have devised a lightweight yet high-performance feature
fusion network, called a Fast Feature Fusion network (FFF), to decode the
aligned features. Experimental results on InStereo2K, KITTI, and Cityscapes
datasets demonstrate the significant superiority of our approach over
traditional and learning-based SIC methods. In particular, our approach
achieves a 3- to 10-fold faster decoding speed than other methods.
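As an illustration of the coarse stage described above, the sketch below shows a feature-domain patch-matching step that uses the rectified-stereo prior to restrict the search to horizontal shifts along the same row, shrinking the search space of trivial matching. This is a hypothetical simplification, not the paper's actual module; the function name, patch size, and disparity range are assumptions.

```python
import numpy as np

def patch_match_stereo(feat_dec, feat_side, patch=4, max_disp=8):
    """For each patch of the decoded-image features (C, H, W), find the
    best-matching patch in the side-information features. The stereo prior
    (a rectified pair) restricts the search to horizontal shifts only."""
    C, H, W = feat_dec.shape
    aligned = np.zeros_like(feat_side)
    for y in range(0, H - patch + 1, patch):
        for x in range(0, W - patch + 1, patch):
            q = feat_dec[:, y:y + patch, x:x + patch]
            best, best_err = None, np.inf
            # search only along the same row (epipolar constraint)
            for d in range(-max_disp, max_disp + 1):
                xs = x + d
                if xs < 0 or xs + patch > W:
                    continue
                cand = feat_side[:, y:y + patch, xs:xs + patch]
                err = np.sum((q - cand) ** 2)
                if err < best_err:
                    best_err, best = err, cand
            aligned[:, y:y + patch, x:x + patch] = best
    return aligned
```

Restricting candidates to one row is what distinguishes this from generic patch matching: the search cost grows with the disparity range rather than with the full image area, and fewer spurious candidates means less noise is introduced into the aligned features.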
Related papers
- Att2CPC: Attention-Guided Lossy Attribute Compression of Point Clouds [18.244200436103156]
We propose an efficient attention-based method for lossy compression of point cloud attributes, leveraging an autoencoder architecture.
Experiments show that our method achieves an average improvement of 1.15 dB and 2.13 dB in BD-PSNR on the Y channel and YUV channels, respectively.
arXiv Detail & Related papers (2024-10-23T12:32:21Z) - Spatiotemporal Attention-based Semantic Compression for Real-time Video
Recognition [117.98023585449808]
We propose a spatiotemporal attention-based autoencoder (STAE) architecture to evaluate the importance of frames and of pixels in each frame.
We develop a lightweight decoder that leverages a combination of 3D and 2D CNNs to reconstruct missing information.
Experimental results show that ViT_STAE can compress the video dataset H51 by 104x with only 5% accuracy loss.
arXiv Detail & Related papers (2023-05-22T07:47:27Z) - Asymmetric Learned Image Compression with Multi-Scale Residual Block,
Importance Map, and Post-Quantization Filtering [15.056672221375104]
Deep learning-based image compression has achieved better rate-distortion (R-D) performance than the latest traditional method, H.266/VVC.
Many leading learned schemes cannot maintain a good trade-off between performance and complexity.
We propose an efficient and effective image coding framework, which achieves similar R-D performance with lower complexity than the state of the art.
arXiv Detail & Related papers (2022-06-21T09:34:29Z) - Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural networks, which can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z) - A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs).
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved.
We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z) - Efficient Data Compression for 3D Sparse TPC via Bicephalous
Convolutional Autoencoder [8.759778406741276]
This work introduces a dual-head autoencoder to resolve sparsity and regression simultaneously, called the Bicephalous Convolutional AutoEncoder (BCAE).
It shows advantages both in compression fidelity and ratio compared to traditional data compression methods, such as MGARD, SZ, and ZFP.
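The dual-head idea can be sketched as follows: one decoder head predicts which entries of the sparse input are occupied, the other regresses their values, and decoding combines the two so that zeros elsewhere reproduce the input's sparsity. This is an illustrative simplification under assumed names and a simplified objective, not the BCAE implementation.

```python
import numpy as np

def bcae_decode(seg_logits, reg_values, threshold=0.5):
    """Combine the two heads: the segmentation head predicts occupancy,
    the regression head predicts values; unoccupied entries stay zero."""
    occupancy = 1.0 / (1.0 + np.exp(-seg_logits)) > threshold
    return np.where(occupancy, reg_values, 0.0)

def bcae_loss(seg_logits, reg_values, target):
    """Binary cross-entropy on occupancy plus regression error restricted
    to truly occupied entries (a hypothetical, simplified objective)."""
    occ = (target != 0).astype(float)
    p = 1.0 / (1.0 + np.exp(-seg_logits))
    bce = -np.mean(occ * np.log(p + 1e-8) + (1 - occ) * np.log(1 - p + 1e-8))
    mse = np.sum(occ * (reg_values - target) ** 2) / max(occ.sum(), 1.0)
    return bce + mse
```

Splitting "is this entry occupied?" from "what is its value?" is what lets such a model handle very sparse data: the regression head is never penalized for values at positions the input never occupied.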
arXiv Detail & Related papers (2021-11-09T21:26:37Z) - Towards Deep and Efficient: A Deep Siamese Self-Attention Fully
Efficient Convolutional Network for Change Detection in VHR Images [28.36808011351123]
We present a very deep and efficient change detection (CD) network, named EffCDNet.
In EffCDNet, an efficient convolution consisting of depth-wise convolution and group convolution with a channel shuffle mechanism is introduced.
On two challenging CD datasets, our approach outperforms other SOTA FCN-based methods.
arXiv Detail & Related papers (2021-08-18T14:02:38Z) - Neural Distributed Source Coding [59.630059301226474]
We present a framework for lossy DSC that is agnostic to the correlation structure and can scale to high dimensions.
We evaluate our method on multiple datasets and show that it can handle complex correlations and achieves state-of-the-art PSNR.
arXiv Detail & Related papers (2021-06-05T04:50:43Z) - FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often occupy large number of parameters and require heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z) - Sequential vessel segmentation via deep channel attention network [5.941874421818899]
This paper develops a novel encoder-decoder deep network architecture.
It exploits several contextual frames of 2D+t sequential images in a sliding window centered at the current frame to segment 2D vessel masks from that frame.
The architecture is equipped with temporal-spatial feature extraction in encoder stage, feature fusion in skip connection layers and channel attention mechanism in decoder stage.
arXiv Detail & Related papers (2021-02-10T02:45:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.