CRA-PCN: Point Cloud Completion with Intra- and Inter-level
Cross-Resolution Transformers
- URL: http://arxiv.org/abs/2401.01552v2
- Date: Wed, 14 Feb 2024 12:28:09 GMT
- Title: CRA-PCN: Point Cloud Completion with Intra- and Inter-level
Cross-Resolution Transformers
- Authors: Yi Rong, Haoran Zhou, Lixin Yuan, Cheng Mei, Jiahao Wang, Tong Lu
- Abstract summary: We present Cross-Resolution Transformer that efficiently performs cross-resolution aggregation with local attention mechanisms.
We integrate two forms of Cross-Resolution Transformers into one up-sampling block for point generation, and following the coarse-to-fine manner, we construct CRA-PCN to incrementally predict complete shapes.
- Score: 29.417270066061864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Point cloud completion is an indispensable task for recovering complete point
clouds due to incompleteness caused by occlusion, limited sensor resolution,
etc. The family of coarse-to-fine generation architectures has recently
exhibited great success in point cloud completion and gradually became
mainstream. In this work, we unveil one of the key ingredients behind these
methods: meticulously devised feature extraction operations with explicit
cross-resolution aggregation. We present Cross-Resolution Transformer that
efficiently performs cross-resolution aggregation with local attention
mechanisms. With the help of our recursive designs, the proposed operation can
capture more scales of features than common aggregation operations, which is
beneficial for capturing fine geometric characteristics. While prior
methodologies have ventured into various manifestations of inter-level
cross-resolution aggregation, the effectiveness of intra-level one and their
combination has not been analyzed. With unified designs, Cross-Resolution
Transformer can perform intra- or inter-level cross-resolution aggregation by
switching inputs. We integrate two forms of Cross-Resolution Transformers into
one up-sampling block for point generation, and following the coarse-to-fine
manner, we construct CRA-PCN to incrementally predict complete shapes with
stacked up-sampling blocks. Extensive experiments demonstrate that our method
outperforms state-of-the-art methods by a large margin on several widely used
benchmarks. Codes are available at https://github.com/EasyRy/CRA-PCN.
Related papers
- Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based
Transformer Network for Remote Sensing Image Super-Resolution [13.894645293832044]
Transformer-based models have shown competitive performance in remote sensing image super-resolution (RSISR)
We propose a novel transformer architecture called Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network (SPIFFNet) for RSISR.
Our proposed model effectively enhances global cognition and understanding of the entire image, facilitating efficient integration of features cross-stages.
arXiv Detail & Related papers (2023-07-06T13:19:06Z) - BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling [60.257912103351394]
We develop a new point cloud upsampling pipeline called BIMS-PU.
We decompose the up/downsampling procedure into several up/downsampling sub-steps by breaking the target sampling factor into smaller factors.
We show that our method achieves superior results to state-of-the-art approaches.
arXiv Detail & Related papers (2022-06-25T13:13:37Z) - Multi-scale and Cross-scale Contrastive Learning for Semantic
Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
arXiv Detail & Related papers (2022-03-25T01:24:24Z) - TransCMD: Cross-Modal Decoder Equipped with Transformer for RGB-D
Salient Object Detection [86.94578023985677]
In this work, we rethink this task from the perspective of global information alignment and transformation.
Specifically, the proposed method (TransCMD) cascades several cross-modal integration units to construct a top-down transformer-based information propagation path.
Experimental results on seven RGB-D SOD benchmark datasets demonstrate that a simple two-stream encoder-decoder framework can surpass the state-of-the-art purely CNN-based methods.
arXiv Detail & Related papers (2021-12-04T15:45:34Z) - PCAM: Product of Cross-Attention Matrices for Rigid Registration of
Point Clouds [79.99653758293277]
PCAM is a neural network whose key element is a pointwise product of cross-attention matrices.
We show that PCAM achieves state-of-the-art results among methods which, like us, solve steps (a) and (b) jointly via deepnets.
arXiv Detail & Related papers (2021-10-04T09:23:27Z) - PU-Flow: a Point Cloud Upsampling Networkwith Normalizing Flows [58.96306192736593]
We present PU-Flow, which incorporates normalizing flows and feature techniques to produce dense points uniformly distributed on the underlying surface.
Specifically, we formulate the upsampling process as point in a latent space, where the weights are adaptively learned from local geometric context.
We show that our method outperforms state-of-the-art deep learning-based approaches in terms of reconstruction quality, proximity-to-surface accuracy, and computation efficiency.
arXiv Detail & Related papers (2021-07-13T07:45:48Z) - Cross-Level Cross-Scale Cross-Attention Network for Point Cloud
Representation [8.76786786874107]
Self-attention mechanism recently achieves impressive advancement in Natural Language Processing (NLP) and Image Processing domains.
We propose an end-to-end architecture, dubbed Cross-Level Cross-Scale Cross-Attention Network (CLCSCANet) for point cloud representation learning.
arXiv Detail & Related papers (2021-04-27T09:01:14Z) - InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z) - MuCAN: Multi-Correspondence Aggregation Network for Video
Super-Resolution [63.02785017714131]
Video super-resolution (VSR) aims to utilize multiple low-resolution frames to generate a high-resolution prediction for each frame.
Inter- and intra-frames are the key sources for exploiting temporal and spatial information.
We build an effective multi-correspondence aggregation network (MuCAN) for VSR.
arXiv Detail & Related papers (2020-07-23T05:41:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.