Related papers: Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis

URL: http://arxiv.org/abs/2503.12150v3
Date: Mon, 28 Apr 2025 02:58:27 GMT
Title: Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis
Authors: Hongyu Sun, Qiuhong Ke, Ming Cheng, Yongcai Wang, Deying Li, Chenhui Gou, Jianfei Cai
Abstract summary: This paper proposes a general solution to enable point cloud recognition models to handle distribution shifts at test time. We adapt the model solely based on online test data to recognize both previously seen classes and novel, unseen classes at test time. Point-Cache demonstrates substantial gains across 8 challenging benchmarks and 4 representative large 3D models, highlighting its effectiveness.
Score: 36.9393931544028
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This paper proposes a general solution to enable point cloud recognition models to handle distribution shifts at test time. Unlike prior methods, which rely heavily on training data (often inaccessible during online inference) and are limited to recognizing a fixed set of point cloud classes predefined during training, we explore a more practical and challenging scenario: adapting the model solely based on online test data to recognize both previously seen classes and novel, unseen classes at test time. To this end, we develop Point-Cache, a hierarchical cache model that captures essential clues of online test samples, particularly focusing on the global structure of point clouds and their local-part details. Point-Cache, which serves as a rich 3D knowledge base, is dynamically managed to prioritize the inclusion of high-quality samples. Designed as a plug-and-play module, our method can be flexibly integrated into large multimodal 3D models to support open-vocabulary point cloud recognition. Notably, our solution operates with efficiency comparable to zero-shot inference, as it is entirely training-free. Point-Cache demonstrates substantial gains across 8 challenging benchmarks and 4 representative large 3D models, highlighting its effectiveness. Code is available at https://github.com/auniquesun/Point-Cache.
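Illustrative sketch: the abstract describes a training-free cache that is filled from online test samples and managed so that high-quality samples are preferred. The Python below is a minimal, generic sketch of that idea and is not the authors' implementation. It assumes unit-normalized feature vectors, uses prediction entropy as the quality score, and keeps only a single (global) feature per sample, so the local-part level of the hierarchy is omitted. All names (EntropyCache, predict, alpha, beta, scale) are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class EntropyCache:
    """Per-class cache keeping the `capacity` lowest-entropy test features.

    Assumption: low prediction entropy stands in for "high quality".
    """

    def __init__(self, num_classes, capacity=3):
        self.capacity = capacity
        # One slot per class; each entry is (entropy, feature).
        self.slots = [[] for _ in range(num_classes)]

    def update(self, feature, probs):
        # Pseudo-label: the class with the highest predicted probability.
        label = max(range(len(probs)), key=probs.__getitem__)
        h = entropy(probs)
        slot = self.slots[label]
        if len(slot) < self.capacity:
            slot.append((h, feature))
        else:
            # Slot is full: evict the most uncertain entry if the new one is better.
            worst = max(range(len(slot)), key=lambda i: slot[i][0])
            if h < slot[worst][0]:
                slot[worst] = (h, feature)
        return label

    def logits(self, feature, beta=5.0):
        # Similarity-weighted vote of cached features for each class.
        return [
            sum(math.exp(-beta * (1.0 - dot(feature, f))) for _, f in slot)
            for slot in self.slots
        ]

def predict(feature, text_feats, cache, alpha=1.0, scale=10.0):
    """Fuse zero-shot logits with cache logits, then update the cache."""
    zero_shot = [scale * dot(feature, t) for t in text_feats]
    fused = [z + alpha * c for z, c in zip(zero_shot, cache.logits(feature))]
    cache.update(feature, softmax(zero_shot))
    return fused
```

No parameters are trained in this sketch: each test sample costs one zero-shot forward pass plus a lookup over a small, bounded cache, which is the sense in which such a scheme can stay close to zero-shot inference cost.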

Related papers

UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting [64.31900521467362]
No existing pre-training method is equally effective for both object- and scene-level point clouds. We introduce UniPre3D, the first unified pre-training method that can be seamlessly applied to point clouds of any scale and 3D models of any architecture.
arXiv Detail & Related papers (2025-06-11T17:23:21Z)
Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment [55.11291053011696]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited. To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy. In the limited reconstruction case, our proposed approach, termed WS3D++, ranks 1st on the large-scale ScanNet benchmark.
arXiv Detail & Related papers (2023-12-01T15:47:04Z)
Test-Time Adaptation for Point Cloud Upsampling Using Meta-Learning [17.980649681325406]
We propose a test-time adaption approach to enhance model generality of point cloud upsampling. The proposed approach leverages meta-learning to explicitly learn network parameters for test-time adaption. Our framework is generic and can be applied in a plug-and-play manner with existing backbone networks in point cloud upsampling.
arXiv Detail & Related papers (2023-08-31T06:44:59Z)
Clustering based Point Cloud Representation Learning for 3D Analysis [80.88995099442374]
We propose a clustering-based supervised learning scheme for point cloud analysis. Unlike the current de facto scene-wise training paradigm, our algorithm conducts within-class clustering on the point embedding space. Our algorithm shows notable improvements on well-known point cloud segmentation datasets.
arXiv Detail & Related papers (2023-07-27T03:42:12Z)
Explore In-Context Learning for 3D Point Cloud Understanding [71.20912026561484]
We introduce a novel framework, named Point-In-Context, designed especially for in-context learning in 3D point clouds. We propose the Joint Sampling module, carefully designed to work in tandem with the general point sampling operator. We conduct extensive experiments to validate the versatility and adaptability of our proposed methods in handling a wide range of tasks.
arXiv Detail & Related papers (2023-06-14T17:53:21Z)
Variational Relational Point Completion Network for Robust 3D Classification [59.80993960827833]
Point cloud completion methods tend to generate global shape skeletons and hence lack fine local details. This paper proposes a variational framework, Variational Relational point Completion Network (VRCNet), with two appealing properties. VRCNet shows great generalizability and robustness on real-world point cloud scans.
arXiv Detail & Related papers (2023-04-18T17:03:20Z)
Point2Vec for Self-Supervised Representation Learning on Point Clouds [66.53955515020053]
We extend data2vec to the point cloud domain and report encouraging results on several downstream tasks. We propose point2vec, which unleashes the full potential of data2vec-like pre-training on point clouds.
arXiv Detail & Related papers (2023-03-29T10:08:29Z)
EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder [60.52613206271329]
This paper introduces Efficient Point Cloud Learning (EPCL) for training high-quality point cloud models with a frozen CLIP transformer. Our EPCL connects the 2D and 3D modalities by semantically aligning the image features and point cloud features without paired 2D-3D data.
arXiv Detail & Related papers (2022-12-08T06:27:11Z)
What Stops Learning-based 3D Registration from Working in the Real World? [53.68326201131434]
This work identifies the sources of 3D point cloud registration failures, analyzes the reasons behind them, and proposes solutions. Ultimately, this translates to a best-practice 3D registration network (BPNet), constituting the first learning-based method able to handle previously-unseen objects in real-world data. Our model generalizes to real data without any fine-tuning, reaching an accuracy of up to 67% on point clouds of unseen objects obtained with a commercial sensor.
arXiv Detail & Related papers (2021-11-19T19:24:27Z)
PnP-3D: A Plug-and-Play for 3D Point Clouds [38.05362492645094]
We propose a plug-and-play module, PnP-3D, to improve the effectiveness of existing networks in analyzing point cloud data. To thoroughly evaluate our approach, we conduct experiments on three standard point cloud analysis tasks. In addition to achieving state-of-the-art results, we present comprehensive studies to demonstrate our approach's advantages.
arXiv Detail & Related papers (2021-08-16T23:59:43Z)
Point Discriminative Learning for Unsupervised Representation Learning on 3D Point Clouds [54.31515001741987]
We propose a point discriminative learning method for unsupervised representation learning on 3D point clouds. We achieve this by imposing a novel point discrimination loss on the middle level and global level point features. Our method learns powerful representations and achieves new state-of-the-art performance.
arXiv Detail & Related papers (2021-08-04T15:11:48Z)
Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z)
Point Transformer for Shape Classification and Retrieval of 3D and ALS Roof PointClouds [3.3744638598036123]
This paper proposes a fully attentional model, Point Transformer, for deriving a rich point cloud representation. The model's shape classification and retrieval performance are evaluated on a large-scale urban dataset, RoofN3D, and a standard benchmark dataset, ModelNet40. The proposed method outperforms other state-of-the-art models on the RoofN3D dataset, gives competitive results on the ModelNet40 benchmark, and showcases high robustness to various unseen point corruptions.
arXiv Detail & Related papers (2020-11-08T08:11:02Z)
Multi-Frame to Single-Frame: Knowledge Distillation for 3D Object Detection [36.238956089801825]
We use knowledge distillation to bridge the gap between a model trained on high-quality inputs at training time and another tested on low-quality inputs at inference time. First, we train an object detection model on dense point clouds, which are generated from multiple frames using extra information only available at training time. Then, we train the model's identical counterpart on sparse single-frame point clouds with consistency regularization on features from both models.
arXiv Detail & Related papers (2020-09-24T17:59:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.