AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation
- URL: http://arxiv.org/abs/2203.06558v1
- Date: Sun, 13 Mar 2022 03:45:58 GMT
- Title: AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation
- Authors: Xueyi Liu, Xiaomeng Xu, Anyi Rao, Chuang Gan, Li Yi
- Abstract summary: AutoGPart builds a supervision space with geometric prior knowledge encoded, and lets the machine search for the optimal supervisions for a specific segmentation task automatically.
We demonstrate that segmentation networks using simple backbones perform significantly better when trained with supervisions searched by our method.
- Score: 58.78094823473567
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training a generalizable 3D part segmentation network is quite challenging but of great importance in real-world applications. To tackle this problem, some works design task-specific solutions by translating human understanding of the task into the machine's learning process; these face the risk of missing the optimal strategy, since machines do not necessarily understand a task in exactly the way humans do. Others apply conventional task-agnostic approaches designed for domain generalization problems, with no task prior knowledge considered. To address both issues, we propose AutoGPart, a generic method for training generalizable 3D part segmentation networks with the task prior taken into account. AutoGPart builds a supervision space with geometric prior knowledge encoded, and lets the machine automatically search this space for the optimal supervisions for a specific segmentation task. Extensive experiments on three generalizable 3D part segmentation tasks demonstrate the effectiveness and versatility of AutoGPart. We show that segmentation networks using simple backbones perform significantly better when trained with supervisions searched by our method.
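To make the search idea concrete, here is a minimal, hypothetical Python sketch: candidate intermediate supervisions are assembled from simple geometric priors, and a search loop keeps the candidate scoring best on held-out shapes. All names, the feature/operator space, and the stub evaluator are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of intermediate-supervision search; all names are
# illustrative and the evaluator is a stub, not the AutoGPart implementation.
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Geometric priors a candidate supervision can be built from.
def height(points):                 # coordinate along the up axis
    return points[:, 2]

def radial_dist(points):            # distance to the shape centroid
    return np.linalg.norm(points - points.mean(axis=0), axis=1)

def local_density(points, k=8):     # crude k-NN density proxy
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    return np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)

FEATURES = {"height": height, "radial": radial_dist, "density": local_density}
OPS = {"identity": lambda x: x, "square": np.square,
       "centered_abs": lambda x: np.abs(x - x.mean())}

def candidate_supervisions():
    """Enumerate (feature, op) pairs -- a tiny stand-in for the search space."""
    return list(itertools.product(FEATURES, OPS))

def evaluate(candidate, val_shapes):
    """Stub: would train a segmentation net with `candidate` as an auxiliary
    target and return cross-domain validation mIoU; here, a dummy score."""
    feat, op = candidate
    signal = np.concatenate([OPS[op](FEATURES[feat](s)) for s in val_shapes])
    return float(signal.var())

shapes = [rng.normal(size=(64, 3)) for _ in range(4)]
best = max(candidate_supervisions(), key=lambda c: evaluate(c, shapes[2:]))
print("selected intermediate supervision:", best)
```

In the real method the evaluator would train the segmentation network with the candidate as an auxiliary loss and return cross-domain validation performance; the dummy variance score above only keeps the sketch self-contained.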
Related papers
- A Unified Framework for 3D Scene Understanding [50.6762892022386]
UniSeg3D is a unified 3D segmentation framework that achieves panoptic, semantic, instance, interactive, referring, and open-vocabulary semantic segmentation tasks within a single model.
It facilitates inter-task knowledge sharing and promotes comprehensive 3D scene understanding.
Experiments on three benchmarks, ScanNet20, ScanRefer, and ScanNet200, demonstrate that UniSeg3D consistently outperforms current SOTA methods.
arXiv Detail & Related papers (2024-07-03T16:50:07Z)
- Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models [20.277479473218513]
We introduce a new task: Zero-Shot 3D Reasoning, for searching and localizing parts of objects.
We design a simple baseline method, Reasoning3D, with the capability to understand and execute complex commands.
We show that Reasoning3D can effectively localize and highlight parts of 3D objects based on implicit textual queries.
arXiv Detail & Related papers (2024-05-29T17:56:07Z)
- Multi-task Learning with 3D-Aware Regularization [55.97507478913053]
We propose a structured 3D-aware regularizer which interfaces multiple tasks through the projection of features extracted from an image encoder to a shared 3D feature space.
We show that the proposed method is architecture agnostic and can be plugged into various prior multi-task backbones to improve their performance.
arXiv Detail & Related papers (2023-10-02T08:49:56Z)
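A toy sketch of what such a regularizer could look like, assuming per-task 2D feature maps are lifted into a shared voxel grid via a depth map and tied together with an L2 consistency loss; the lifting scheme and loss form are assumptions, not the paper's exact formulation.

```python
# Toy 3D-aware consistency regularizer (assumed formulation): lift each
# task's 2D features into a shared voxel volume via depth, then penalize
# disagreement between tasks in that shared 3D feature space.
import torch
import torch.nn.functional as F

def lift_to_voxels(feat2d, depth, n_vox=8):
    """feat2d: (C, H, W) task features; depth: (H, W) in [0, 1].
    Returns a (C, n_vox**3) mean-pooled feature volume."""
    C, H, W = feat2d.shape
    ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                            torch.linspace(0, 1, W), indexing="ij")
    ix = (xs * (n_vox - 1)).long().flatten()
    iy = (ys * (n_vox - 1)).long().flatten()
    iz = (depth.clamp(0, 1) * (n_vox - 1)).long().flatten()
    flat = (ix * n_vox + iy) * n_vox + iz                 # voxel index per pixel
    vol = torch.zeros(C, n_vox ** 3)
    vol.index_add_(1, flat, feat2d.reshape(C, -1))        # sum features per voxel
    count = torch.zeros(n_vox ** 3).index_add_(0, flat, torch.ones(H * W))
    return vol / count.clamp(min=1)                       # normalize to a mean

def cross_task_3d_consistency(task_feats, depth):
    """L2 disagreement between per-task feature volumes in the shared space."""
    vols = [lift_to_voxels(f, depth) for f in task_feats]
    mean_vol = torch.stack(vols).mean(dim=0)
    return sum(F.mse_loss(v, mean_vol) for v in vols)

seg_feat, normal_feat = torch.randn(16, 32, 32), torch.randn(16, 32, 32)
depth = torch.rand(32, 32)
print(cross_task_3d_consistency([seg_feat, normal_feat], depth))
```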
- Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the mapping between hierarchical part-level segmentation and mobile-part motion parameters.
The network predictions yield a large set of 3D objects with pseudo-labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
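The entry above describes a graph network over parts; a minimal sketch of that idea follows, with an assumed part-adjacency input and an assumed output head (a mobility logit plus a normalized motion axis). It is not the authors' architecture.

```python
# Toy sketch (assumed formulation): message passing over a part-adjacency
# graph, regressing per-part mobility logits and a 3D motion axis.
import torch
import torch.nn as nn

class PartMotionGNN(nn.Module):
    def __init__(self, in_dim=16, hid=32, rounds=2):
        super().__init__()
        self.encode = nn.Linear(in_dim, hid)
        self.message = nn.Linear(2 * hid, hid)
        self.head = nn.Linear(hid, 1 + 3)   # mobility logit + motion axis
        self.rounds = rounds

    def forward(self, part_feats, adj):
        """part_feats: (P, in_dim) per-part descriptors; adj: (P, P) 0/1."""
        h = torch.relu(self.encode(part_feats))
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        for _ in range(self.rounds):
            nbr = adj @ h / deg                       # mean over neighbors
            h = torch.relu(self.message(torch.cat([h, nbr], dim=-1)))
        out = self.head(h)
        mobility_logit, axis = out[:, :1], out[:, 1:]
        return mobility_logit, nn.functional.normalize(axis, dim=-1)

parts = torch.randn(5, 16)                  # 5 parts from a segmentation
adj = (torch.rand(5, 5) > 0.5).float()      # toy part adjacency
adj = ((adj + adj.T) > 0).float()           # symmetrize
logit, axis = PartMotionGNN()(parts, adj)
print(logit.shape, axis.shape)              # (5, 1), (5, 3)
```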
- A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in deep semantic segmentation in the context of vision for autonomous vehicles.
Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
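As a toy illustration of subtask-graph inference in the spirit of the entry above: precedence constraints are mined from subtask-completion orders in training trajectories, then used to order subtasks for a new task. The trace format and the inference rule are assumptions, not MTSGI itself.

```python
# Toy subtask-graph inference (illustrative, not MTSGI): keep "A before B"
# only if A precedes B in every training trajectory that contains both,
# then topologically sort the subtasks of a new task.
from collections import defaultdict
from graphlib import TopologicalSorter

def infer_precedence(trajectories):
    before, violated = defaultdict(set), set()
    for traj in trajectories:
        pos = {s: i for i, s in enumerate(traj)}
        for a in pos:
            for b in pos:
                if pos[a] < pos[b]:
                    before[a].add(b)
                elif a != b:
                    violated.add((a, b))
    return {a: {b for b in bs if (a, b) not in violated}
            for a, bs in before.items()}

trajs = [["get_wood", "make_plank", "build_door"],
         ["get_wood", "get_nail", "make_plank", "build_door"]]
prec = infer_precedence(trajs)
preds = defaultdict(set)          # graphlib expects node -> predecessors
for a, bs in prec.items():
    for b in bs:
        preds[b].add(a)
print(list(TopologicalSorter(preds).static_order()))
# -> a valid execution order: ['get_wood', 'get_nail', 'make_plank', 'build_door']
```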
- Multitask Network for Joint Object Detection, Semantic Segmentation and Human Pose Estimation in Vehicle Occupancy Monitoring [0.0]
We propose a multitask neural network for joint detection, semantic segmentation and pose estimation (DSPM).
Our architecture allows a flexible combination of the three mentioned tasks during a simple end-to-end training.
We perform comprehensive evaluations on the public SVIRO and TiCaM datasets to demonstrate its superior performance.
arXiv Detail & Related papers (2022-05-03T14:11:18Z)
- 3D Meta-Segmentation Neural Network [12.048487830494107]
We present a novel meta-learning strategy that regards the 3D shape segmentation function as a task.
By training over a number of 3D part segmentation tasks, our method is able to learn a prior over the corresponding 3D segmentation function space.
We demonstrate that our model achieves superior part segmentation performance in the few-shot setting on the widely used ShapeNet dataset.
arXiv Detail & Related papers (2021-10-08T01:47:54Z)
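A minimal MAML-style sketch of the meta-learning setup the entry above describes, where each toy segmentation-like task is one episode (requires PyTorch 2.x for torch.func); the episode generator and the single inner step are assumptions, not the paper's actual strategy.

```python
# MAML-style meta-learning sketch: inner loop adapts to a task's support
# set, outer loop meta-updates from the query-set loss (assumed setup).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 4))  # xyz -> 4 part logits
meta_opt = torch.optim.Adam(net.parameters(), lr=1e-3)
inner_lr = 0.1

def episode():
    """Toy task: label each point by its coordinate-sign quadrant in xy."""
    pts = torch.randn(128, 3)
    labels = (pts[:, 0] > 0).long() * 2 + (pts[:, 1] > 0).long()
    return pts[:64], labels[:64], pts[64:], labels[64:]  # support / query

for step in range(200):
    sx, sy, qx, qy = episode()
    params = dict(net.named_parameters())
    # Inner loop: one gradient step on the support set.
    inner_loss = F.cross_entropy(functional_call(net, params, (sx,)), sy)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    fast = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}
    # Outer loop: meta-update from the query loss under adapted weights.
    outer_loss = F.cross_entropy(functional_call(net, fast, (qx,)), qy)
    meta_opt.zero_grad()
    outer_loss.backward()
    meta_opt.step()
print(f"final query loss: {outer_loss.item():.3f}")
```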
- Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning [5.699350798684963]
3D instance segmentation recognizes individual object instances in a large-scale scene, a prerequisite for high-level scene-understanding tasks.
We propose a simple yet efficient algorithm for 3D instance segmentation using deep metric learning.
We demonstrate state-of-the-art performance of our algorithm on the ScanNet 3D instance segmentation benchmark in terms of AP score.
arXiv Detail & Related papers (2020-07-07T02:17:44Z)
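The entry above follows the generic metric-learning recipe for instance segmentation: learn per-point embeddings with a pull/push (discriminative) loss so that instances can be recovered by clustering at inference. The hinged loss below is a common form of that recipe, not necessarily the paper's exact objective; the margins are assumptions.

```python
# Generic discriminative loss for metric-learning instance segmentation:
# pull points toward their instance centroid, push centroids apart.
import torch

def discriminative_loss(emb, inst, pull_margin=0.1, push_margin=1.0):
    """emb: (N, D) per-point embeddings; inst: (N,) instance ids."""
    ids = inst.unique()
    centroids = torch.stack([emb[inst == i].mean(dim=0) for i in ids])
    # Pull term: hinged distance of each point to its instance centroid.
    pull = 0.0
    for c, i in zip(centroids, ids):
        d = (emb[inst == i] - c).norm(dim=1)
        pull = pull + (d - pull_margin).clamp(min=0).pow(2).mean()
    pull = pull / len(ids)
    # Push term: hinged distance between distinct instance centroids.
    if len(ids) > 1:
        cd = torch.cdist(centroids, centroids)
        off_diag = ~torch.eye(len(ids), dtype=torch.bool)
        push = (push_margin - cd[off_diag]).clamp(min=0).pow(2).mean()
    else:
        push = torch.tensor(0.0)
    return pull + push

emb = torch.randn(200, 8, requires_grad=True)   # toy point embeddings
inst = torch.randint(0, 4, (200,))              # toy instance labels
loss = discriminative_loss(emb, inst)
loss.backward()
print(f"discriminative loss: {loss.item():.3f}")
```

At inference, instances would be recovered by clustering the learned embeddings (e.g., mean-shift or a fixed-radius grouping), which is the standard companion step to this loss family.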
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the content and is not responsible for any consequences of its use.