A Hierarchical Spatial Transformer for Massive Point Samples in
Continuous Space
- URL: http://arxiv.org/abs/2311.04434v1
- Date: Wed, 8 Nov 2023 02:54:19 GMT
- Title: A Hierarchical Spatial Transformer for Massive Point Samples in
Continuous Space
- Authors: Wenchong He, Zhe Jiang, Tingsong Xiao, Zelin Xu, Shigang Chen, Ronald
Fick, Miles Medina, Christine Angelini
- Abstract summary: Existing transformers are mostly designed for sequences (texts or time series), images or videos, and graphs.
This paper proposes a novel transformer model for massive (up to a million) point samples in continuous space.
- Score: 11.074768589778934
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Transformers are widely used deep learning architectures. Existing
transformers are mostly designed for sequences (texts or time series), images
or videos, and graphs. This paper proposes a novel transformer model for
massive (up to a million) point samples in continuous space. Such data are
ubiquitous in environmental sciences (e.g., sensor observations), numerical
simulations (e.g., particle-laden flow, astrophysics), and location-based
services (e.g., POIs and trajectories). However, designing a transformer for
massive spatial points is non-trivial due to several challenges, including
implicit long-range and multi-scale dependency on irregular points in
continuous space, a non-uniform point distribution, the potential high
computational costs of calculating all-pair attention across massive points,
and the risks of over-confident predictions due to varying point density. To
address these challenges, we propose a new hierarchical spatial transformer
model, which includes multi-resolution representation learning within a
quad-tree hierarchy and efficient spatial attention via coarse approximation.
We also design an uncertainty quantification branch to estimate prediction
confidence related to input feature noise and point sparsity. We provide a
theoretical analysis of computational time complexity and memory costs.
Extensive experiments on both real-world and synthetic datasets show that our
method outperforms multiple baselines in prediction accuracy and our model can
scale up to one million points on one NVIDIA A100 GPU. The code is available at
\url{https://github.com/spatialdatasciencegroup/HST}.
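As a rough, hypothetical sketch of the quad-tree hierarchy described in the abstract (not the authors' implementation, which is available at the GitHub link above), the following partitions 2D points into capacity-limited cells and mean-pools a coarse summary per cell; such per-cell summaries are the ingredient that lets distant regions be attended to coarsely instead of via all-pair attention:

```python
# Minimal quad-tree sketch: split irregular 2D points into a hierarchy of
# half-open cells and keep a coarse (mean-pooled) summary at every node.
# Class names, the capacity parameter, and mean pooling are illustrative
# assumptions; the paper learns multi-resolution representations instead.
import numpy as np

class QuadTreeNode:
    def __init__(self, points, feats, bounds, capacity=32):
        self.bounds = bounds                  # (xmin, ymin, xmax, ymax)
        self.children = []
        self.points = self.feats = None
        if len(points) <= capacity:           # leaf: keep the raw samples
            self.points, self.feats = points, feats
        else:                                 # split into four quadrants
            xmin, ymin, xmax, ymax = bounds
            cx, cy = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
            for qx, qy in [(0, 0), (0, 1), (1, 0), (1, 1)]:
                lo_x, hi_x = (xmin, cx) if qx == 0 else (cx, xmax)
                lo_y, hi_y = (ymin, cy) if qy == 0 else (cy, ymax)
                mask = ((points[:, 0] >= lo_x) & (points[:, 0] < hi_x) &
                        (points[:, 1] >= lo_y) & (points[:, 1] < hi_y))
                if mask.any():
                    self.children.append(QuadTreeNode(
                        points[mask], feats[mask],
                        (lo_x, lo_y, hi_x, hi_y), capacity))
        # coarse summary of everything under this node
        self.summary = feats.mean(axis=0)

rng = np.random.default_rng(0)
pts = rng.random((10_000, 2))                 # irregular points in [0,1)^2
fts = rng.normal(size=(10_000, 16))           # per-point features
root = QuadTreeNode(pts, fts, (0.0, 0.0, 1.0, 1.0))
```

In the paper, the per-cell representations are learned at multiple resolutions and calibrated by the uncertainty quantification branch; mean pooling here only stands in for that machinery.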
Related papers
- Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics [11.182510067821745]
This study introduces a novel transformer model optimized for large-scale point cloud processing.
Our model integrates local inductive bias and achieves near-linear complexity with hardware-friendly regular operations.
Our findings highlight the superiority of using locality-sensitive hashing (LSH), especially OR & AND-construction LSH, in kernel approximation for large-scale point cloud data.
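As a hedged illustration of the LSH idea (not the paper's code), the sketch below hashes points with random hyperplanes, concatenating several hash bits per point (an AND-construction; maintaining multiple such tables would give the OR-construction), and restricts attention to each bucket for near-linear total cost:

```python
# Illustrative LSH-bucketed attention: points that hash to the same
# bucket attend only to each other. All names and parameters here are
# assumptions for the sketch, not the paper's implementation.
import numpy as np

def lsh_buckets(coords, n_bits=8, seed=0):
    """AND-construction: concatenate n_bits hyperplane hashes into one id."""
    rng = np.random.default_rng(seed)
    centered = coords - coords.mean(axis=0)   # hash directions around centroid
    planes = rng.normal(size=(coords.shape[1], n_bits))
    bits = (centered @ planes > 0).astype(np.uint64)
    return bits @ (1 << np.arange(n_bits, dtype=np.uint64))

def bucketed_attention(coords, feats, n_bits=8):
    out = np.empty_like(feats)
    ids = lsh_buckets(coords, n_bits)
    for b in np.unique(ids):
        idx = np.where(ids == b)[0]
        x = feats[idx]
        s = x @ x.T / np.sqrt(x.shape[1])     # attention within the bucket only
        w = np.exp(s - s.max(axis=1, keepdims=True))
        out[idx] = (w / w.sum(axis=1, keepdims=True)) @ x
    return out

rng = np.random.default_rng(1)
pts = rng.random((4096, 3))
out = bucketed_attention(pts, rng.normal(size=(4096, 32)))
```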
arXiv Detail & Related papers (2024-02-19T20:48:09Z)
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that one-layer attention can deliver surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone opens up a new technical path of independent interest for building Transformers on large graphs.
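A minimal sketch of the single-layer global attention scheme follows; SGFormer's actual attention is a linear-complexity variant combined with a graph network branch, so the dense softmax version below is only meant to show that one pass over all node pairs, plus a residual, is the entire attention stack:

```python
# One-layer global attention over node features: a single all-pair pass
# with a residual connection, no stacking. Dense softmax attention here
# is a simplifying assumption; SGFormer uses a linear-cost variant.
import numpy as np

def one_layer_global_attention(x, wq, wk, wv):
    q, k, v = x @ wq, x @ wk, x @ wv
    s = q @ k.T / np.sqrt(q.shape[1])          # all-pair node scores, once
    a = np.exp(s - s.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)
    return a @ v + x                           # residual keeps local signal

rng = np.random.default_rng(0)
n, d = 1024, 64                                # 1024 nodes, 64-d features
x = rng.normal(size=(n, d))
wq, wk, wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
h = one_layer_global_attention(x, wq, wk, wv)  # single pass, no stacking
```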
arXiv Detail & Related papers (2023-06-19T08:03:25Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- Eagle: Large-Scale Learning of Turbulent Fluid Dynamics with Mesh Transformers [23.589419066824306]
Estimating fluid dynamics is a notoriously hard problem to solve.
We introduce a new model, method and benchmark for the problem.
We show that our transformer outperforms the state of the art on both existing synthetic and real datasets.
arXiv Detail & Related papers (2023-02-16T12:59:08Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We adopt transformers in this work and incorporate them into a hierarchical framework for shape classification and for part and scene segmentation.
We also compute efficient, dynamic global cross attention by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with previous segmentation methods.
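A rough sketch of the sampling-and-grouping step, with made-up parameter choices: farthest point sampling picks representative centers, and each center pools the features of its k nearest neighbors, producing the group tokens over which cross attention would then run:

```python
# Farthest point sampling + kNN grouping, as an illustrative stand-in for
# the hierarchical sampling-and-grouping step; parameters are assumptions.
import numpy as np

def farthest_point_sampling(pts, m, seed=0):
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(pts)))]
    d = np.linalg.norm(pts - pts[chosen[0]], axis=1)
    for _ in range(m - 1):
        nxt = int(d.argmax())                  # farthest from current set
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(pts - pts[nxt], axis=1))
    return np.array(chosen)

def group_features(pts, feats, centers, k=16):
    d = np.linalg.norm(pts[None, :, :] - pts[centers][:, None, :], axis=2)
    knn = np.argsort(d, axis=1)[:, :k]         # k nearest points per center
    return feats[knn].mean(axis=1)             # pooled feature per group

pts = np.random.default_rng(0).random((2048, 3))
fts = np.random.default_rng(1).normal(size=(2048, 32))
centers = farthest_point_sampling(pts, 64)
groups = group_features(pts, fts, centers)     # 64 group tokens
```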
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- Fast and realistic large-scale structure from machine-learning-augmented random field simulations [0.0]
We train a machine learning model to transform projected lognormal dark matter density fields to more realistic dark matter maps.
We demonstrate the performance of our model comparing various statistical tests with different field resolutions, redshifts and cosmological parameters.
arXiv Detail & Related papers (2022-05-16T18:00:01Z)
- Stratified Transformer for 3D Point Cloud Segmentation [89.9698499437732]
Stratified Transformer is able to capture long-range contexts and demonstrates strong generalization ability and high performance.
To combat the challenges posed by irregular point arrangements, we propose first-layer point embedding to aggregate local information.
Experiments demonstrate the effectiveness and superiority of our method on S3DIS, ScanNetv2 and ShapeNetPart datasets.
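As a hedged sketch of the stratified idea of mixing fine-grained and coarse keys (the paper samples dense nearby keys and sparse distant keys per query window; here a fine and a coarse grid pooling stand in for that scheme):

```python
# Grid pooling at two resolutions as an illustrative proxy for stratified
# key sampling: fine cells preserve nearby detail, coarse cells provide
# cheap long-range context. Cell sizes are assumptions for the sketch.
import numpy as np

def grid_keys(pts, feats, cell):
    """Pool one key per occupied grid cell of the given size."""
    ids = np.floor(pts / cell).astype(np.int64)
    _, inv = np.unique(ids, axis=0, return_inverse=True)
    inv = inv.ravel()
    pooled = np.zeros((inv.max() + 1, feats.shape[1]))
    np.add.at(pooled, inv, feats)              # scatter-add features
    return pooled / np.bincount(inv)[:, None]  # mean per cell

pts = np.random.default_rng(0).random((5000, 3))
fts = np.random.default_rng(1).normal(size=(5000, 32))
dense_keys = grid_keys(pts, fts, cell=0.05)    # fine grid: nearby detail
sparse_keys = grid_keys(pts, fts, cell=0.20)   # coarse grid: far context
keys = np.concatenate([dense_keys, sparse_keys])  # stratified key set
```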
arXiv Detail & Related papers (2022-03-28T05:35:16Z)
- PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.
The approach also shows consistent efficiency gains on the recent transformer-based image recognition model ViT.
arXiv Detail & Related papers (2021-09-15T01:10:30Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper to apply transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
- FFD: Fast Feature Detector [22.51804239092462]
We show that robust and accurate keypoints exist in the specific scale-space domain.
It is proved that setting the scale-space pyramid's blurring ratio and smoothness to 2 and 0.627, respectively, facilitates the detection of reliable keypoints.
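As a small illustration of those constants, the sketch below builds a Gaussian scale-space pyramid assuming the ratio of 2 governs per-octave downsampling and 0.627 is the Gaussian sigma; the actual FFD detector pipeline is more involved than this:

```python
# Gaussian scale-space pyramid using the quoted constants; the mapping of
# "ratio" to downsampling and "0.627" to the blur sigma is an assumption.
# Requires SciPy for the Gaussian blur.
import numpy as np
from scipy.ndimage import gaussian_filter

def scale_space_pyramid(img, octaves=4, ratio=2, sigma=0.627):
    levels = []
    cur = img.astype(float)
    for _ in range(octaves):
        cur = gaussian_filter(cur, sigma)      # blur at this scale
        levels.append(cur)
        cur = cur[::ratio, ::ratio]            # downsample by the ratio
    return levels

img = np.random.default_rng(0).random((256, 256))
pyr = scale_space_pyramid(img)                 # 4 levels: 256 down to 32 px
```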
arXiv Detail & Related papers (2020-12-01T21:56:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.