The Bare Necessities: Designing Simple, Effective Open-Vocabulary Scene Graphs
- URL: http://arxiv.org/abs/2412.01539v1
- Date: Mon, 02 Dec 2024 14:29:31 GMT
- Title: The Bare Necessities: Designing Simple, Effective Open-Vocabulary Scene Graphs
- Authors: Christina Kassab, Matías Mattamala, Sacha Morin, Martin Büchner, Abhinav Valada, Liam Paull, Maurice Fallon,
- Abstract summary: 3D open-vocabulary scene graph methods are a promising map representation for embodied agents.
We propose a general scene graph framework and conduct three studies that focus on image pre-processing, feature fusion, and feature selection.
- Score: 16.600165443997433
- License:
- Abstract: 3D open-vocabulary scene graph methods are a promising map representation for embodied agents, however many current approaches are computationally expensive. In this paper, we reexamine the critical design choices established in previous works to optimize both efficiency and performance. We propose a general scene graph framework and conduct three studies that focus on image pre-processing, feature fusion, and feature selection. Our findings reveal that commonly used image pre-processing techniques provide minimal performance improvement while tripling computation (on a per object view basis). We also show that averaging feature labels across different views significantly degrades performance. We study alternative feature selection strategies that enhance performance without adding unnecessary computational costs. Based on our findings, we introduce a computationally balanced approach for 3D point cloud segmentation with per-object features. The approach matches state-of-the-art classification accuracy while achieving a threefold reduction in computation.
Related papers
- Visualizing Extensions of Argumentation Frameworks as Layered Graphs [15.793271603711014]
We introduce a new visualization technique that draws an AF, together with an extension, as a 3-layer graph layout.
Our technique supports the user to more easily explore the visualized AF, better understand extensions, and verify algorithms for computing semantics.
arXiv Detail & Related papers (2024-09-09T09:29:53Z) - Amplify Graph Learning for Recommendation via Sparsity Completion [16.32861024767423]
Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems.
Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges.
We propose an Amplify Graph Learning framework based on Sparsity Completion (called AGL-SC)
arXiv Detail & Related papers (2024-06-27T08:26:20Z) - VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness [56.87603097348203]
VeCAF uses labels and natural language annotations to perform parametric data selection for PVM finetuning.
VeCAF incorporates the finetuning objective to select significant data points that effectively guide the PVM towards faster convergence.
On ImageNet, VeCAF uses up to 3.3x less training batches to reach the target performance compared to full finetuning.
arXiv Detail & Related papers (2024-01-15T17:28:37Z) - Large-scale Point Cloud Registration Based on Graph Matching
Optimization [30.92028761652611]
We propose a underlineGraph underlineMatching underlineOptimization based underlineNetwork.
The proposed method has been evaluated on the 3DMatch/3DLoMatch benchmarks and the KITTI benchmark.
arXiv Detail & Related papers (2023-02-12T03:29:35Z) - Learning Heuristics for the Maximum Clique Enumeration Problem Using Low
Dimensional Representations [0.0]
We use a learning framework for a pruning process of the input graph towards reducing the clique of the maximum enumeration problem.
We study the role of using different vertex representations on the performance of this runtime method.
We observe that using local graph features in the classification process produce more accurate results when combined with a feature elimination process.
arXiv Detail & Related papers (2022-10-30T22:04:32Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point
Cloud Learning [81.85951026033787]
We set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z) - GraphCoCo: Graph Complementary Contrastive Learning [65.89743197355722]
Graph Contrastive Learning (GCL) has shown promising performance in graph representation learning (GRL) without the supervision of manual annotations.
This paper proposes an effective graph complementary contrastive learning approach named GraphCoCo to tackle the above issue.
arXiv Detail & Related papers (2022-03-24T02:58:36Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective of graph contrastive learning methods showing random augmentations leads to encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - Revisiting Point Cloud Simplification: A Learnable Feature Preserving
Approach [57.67932970472768]
Mesh and Point Cloud simplification methods aim to reduce the complexity of 3D models while retaining visual quality and relevant salient features.
We propose a fast point cloud simplification method by learning to sample salient points.
The proposed method relies on a graph neural network architecture trained to select an arbitrary, user-defined, number of points from the input space and to re-arrange their positions so as to minimize the visual perception error.
arXiv Detail & Related papers (2021-09-30T10:23:55Z) - Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.