iLabel: Interactive Neural Scene Labelling
- URL: http://arxiv.org/abs/2111.14637v1
- Date: Mon, 29 Nov 2021 15:49:20 GMT
- Title: iLabel: Interactive Neural Scene Labelling
- Authors: Shuaifeng Zhi and Edgar Sucar and Andre Mouton and Iain Haughton and
Tristan Laidlow and Andrew J. Davison
- Abstract summary: Joint representation of geometry, colour and semantics using a 3D neural field enables accurate dense labelling from ultra-sparse interactions.
Our iLabel system requires no training data, yet can densely label scenes more accurately than standard methods.
It works in an 'open set' manner, with semantic classes defined on the fly by the user.
- Score: 20.63756683450811
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Joint representation of geometry, colour and semantics using a 3D neural
field enables accurate dense labelling from ultra-sparse interactions as a user
reconstructs a scene in real-time using a handheld RGB-D sensor. Our iLabel
system requires no training data, yet can densely label scenes more accurately
than standard methods trained on large, expensively labelled image datasets.
Furthermore, it works in an 'open set' manner, with semantic classes defined on
the fly by the user.
iLabel's underlying model is a multilayer perceptron (MLP) trained from
scratch in real-time to learn a joint neural scene representation. The scene
model is updated and visualised in real-time, allowing the user to focus
interactions to achieve efficient labelling. A room or similar scene can be
accurately labelled into 10+ semantic categories with only a few tens of
clicks. Quantitative labelling accuracy scales powerfully with the number of
clicks, and rapidly surpasses standard pre-trained semantic segmentation
methods. We also demonstrate a hierarchical labelling variant.
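To make the joint representation concrete, here is a minimal PyTorch sketch of an MLP with a shared trunk and separate geometry, colour, and semantic heads, in the spirit of the abstract; all layer sizes, names, and the SDF-style geometry output are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a joint scene MLP (hypothetical sizes/names, not the
# authors' code): one shared trunk with separate heads for geometry,
# colour, and open-set semantic logits.
import torch
import torch.nn as nn

class JointSceneMLP(nn.Module):
    def __init__(self, pos_dim=3, hidden=256, num_classes=16):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(pos_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.geometry_head = nn.Linear(hidden, 1)            # e.g. SDF / occupancy value
        self.colour_head = nn.Sequential(nn.Linear(hidden, 3), nn.Sigmoid())
        self.semantic_head = nn.Linear(hidden, num_classes)  # logits for user-defined classes

    def forward(self, xyz):
        h = self.trunk(xyz)
        return self.geometry_head(h), self.colour_head(h), self.semantic_head(h)

# Sparse user clicks supervise only the semantic head:
model = JointSceneMLP()
clicked_points = torch.randn(5, 3)            # 3D points under clicked pixels
clicked_labels = torch.tensor([0, 0, 1, 2, 1])
_, _, logits = model(clicked_points)
semantic_loss = nn.functional.cross_entropy(logits, clicked_labels)
semantic_loss.backward()                      # gradients also shape the shared trunk
```

Because all heads share the trunk, a cross-entropy loss on a handful of clicked points also shapes the shared features, which is the mechanism by which ultra-sparse interactions can yield dense labels.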
Related papers
- LABELMAKER: Automatic Semantic Label Generation from RGB-D Trajectories [59.14011485494713]
This work introduces a fully automated 2D/3D labeling framework that can generate labels for RGB-D scans at an equal (or better) level of accuracy.
We demonstrate the effectiveness of our LabelMaker pipeline by generating significantly better labels for the ScanNet datasets and automatically labelling the previously unlabeled ARKitScenes dataset.
arXiv Detail & Related papers (2023-11-20T20:40:24Z)
- Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation [58.03255076119459]
We address the task of weakly-supervised few-shot image classification and segmentation by leveraging a Vision Transformer (ViT).
Our proposed method takes token representations from the self-supervised ViT and leverages their correlations, via self-attention, to produce classification and segmentation predictions.
Experiments on Pascal-5i and COCO-20i demonstrate significant performance gains in a variety of supervision settings.
arXiv Detail & Related papers (2023-07-07T06:16:43Z)
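As a loose sketch of the token-correlation idea in the entry above (not the paper's architecture; the shapes and the nearest-neighbour transfer rule are assumptions), correlations between self-supervised ViT patch tokens can carry a mask from a labelled support image to a query image:

```python
# Loose sketch: transfer a coarse mask between images via similarity of
# self-supervised ViT patch tokens (illustrative, not the paper's method).
import numpy as np

def transfer_mask(support_tokens, support_mask, query_tokens):
    """support_tokens: (N, D), support_mask: (N,), query_tokens: (M, D)."""
    s = support_tokens / np.linalg.norm(support_tokens, axis=1, keepdims=True)
    q = query_tokens / np.linalg.norm(query_tokens, axis=1, keepdims=True)
    corr = q @ s.T                              # (M, N) token-to-token similarity
    # each query patch takes the mask value of its most similar support patch
    return support_mask[corr.argmax(axis=1)]

support_tokens = np.random.randn(196, 384)      # e.g. 14x14 patches, ViT-S width
query_tokens = np.random.randn(196, 384)
support_mask = (np.random.rand(196) > 0.5).astype(int)
query_mask = transfer_mask(support_tokens, support_mask, query_tokens)
```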
- You Only Need One Thing One Click: Self-Training for Weakly Supervised 3D Scene Understanding [107.06117227661204]
We propose "One Thing One Click", meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our model is also compatible with 3D instance segmentation when equipped with a point-clustering strategy.
arXiv Detail & Related papers (2023-03-26T13:57:00Z)
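The following is a rough sketch of label propagation from one clicked point per object over a similarity graph, in the spirit of the entry above; the affinity matrix, normalisation, and update rule are generic choices, not the paper's graph propagation module.

```python
# Rough sketch of click-based label propagation over a point/super-voxel
# similarity graph (generic label propagation; not the paper's module).
import numpy as np

def propagate_labels(affinity, seed_labels, num_classes, alpha=0.9, iters=50):
    """affinity: (N, N) symmetric similarity; seed_labels: (N,), -1 = unlabelled."""
    d = affinity.sum(axis=1)
    a_norm = affinity / np.sqrt(np.outer(d, d) + 1e-12)     # symmetric normalisation
    y0 = np.zeros((len(seed_labels), num_classes))
    labelled = seed_labels >= 0
    y0[labelled, seed_labels[labelled]] = 1.0                # one click per object
    y = y0.copy()
    for _ in range(iters):
        y = alpha * a_norm @ y + (1 - alpha) * y0            # spread, keep seeds anchored
    return y.argmax(axis=1)

affinity = np.random.rand(100, 100); affinity = (affinity + affinity.T) / 2
seeds = np.full(100, -1); seeds[0], seeds[50] = 0, 1         # two clicked points
dense_labels = propagate_labels(affinity, seeds, num_classes=2)
```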
- One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation [78.36781565047656]
We propose "One Thing One Click," meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our results are also comparable to those of the fully supervised counterparts.
arXiv Detail & Related papers (2021-04-06T02:27:25Z)
- In-Place Scene Labelling and Understanding with Implicit Scene Representation [39.73806072862176]
We extend neural radiance fields (NeRF) to jointly encode semantics with appearance and geometry.
We show the benefit of this approach when labels are either sparse or very noisy in room-scale scenes.
arXiv Detail & Related papers (2021-03-29T18:30:55Z)
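A rough sketch of jointly rendering semantics with appearance, as in the entry above: per-sample semantic logits are accumulated along a ray with the same volume-rendering weights as colour. The sample values here are dummies and the code is illustrative, not the paper's implementation.

```python
# Rough sketch: render per-pixel semantic logits with the same volume-rendering
# weights as colour (dummy inputs; not the paper's implementation).
import numpy as np

def render_ray(densities, deltas, colours, sem_logits):
    """densities: (S,), deltas: (S,) sample spacings, colours: (S, 3), sem_logits: (S, C)."""
    alpha = 1.0 - np.exp(-densities * deltas)                       # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha]))[:-1]   # transmittance
    weights = trans * alpha
    colour = (weights[:, None] * colours).sum(axis=0)
    semantics = (weights[:, None] * sem_logits).sum(axis=0)         # expected logits along ray
    return colour, semantics

colour, sem = render_ray(
    densities=np.random.rand(64), deltas=np.full(64, 0.05),
    colours=np.random.rand(64, 3), sem_logits=np.random.randn(64, 10))
```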
- Label Confusion Learning to Enhance Text Classification Models [3.0251266104313643]
Label Confusion Model (LCM) learns label confusion to capture semantic overlap among labels.
LCM can generate a better label distribution to replace the original one-hot label vector.
Experiments on five text classification benchmark datasets reveal the effectiveness of LCM for several widely used deep learning classification models.
arXiv Detail & Related papers (2020-12-09T11:34:35Z)
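A toy sketch of the label-distribution idea from the entry above: similarity between an instance representation and label embeddings gives a "confusion" distribution that softens the one-hot target. The shapes, names, and mixing rule are invented for illustration, not LCM's architecture.

```python
# Toy sketch: replace a one-hot target with a softened label distribution
# derived from instance-label similarity (illustrative; not LCM itself).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_target(instance_repr, label_embeddings, true_label, mix=0.7):
    """instance_repr: (D,), label_embeddings: (C, D); returns a (C,) distribution."""
    sim = label_embeddings @ instance_repr            # similarity to each label
    confusion = softmax(sim)                          # "simulated" label distribution
    one_hot = np.eye(len(label_embeddings))[true_label]
    return mix * one_hot + (1 - mix) * confusion      # softened training target

target = soft_target(np.random.randn(64), np.random.randn(5, 64), true_label=2)
```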
- Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition [75.44233392355711]
The KGGR framework exploits prior knowledge of statistical label correlations together with deep neural networks.
It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence.
Then, it introduces the label semantics to guide learning semantic-specific features.
It exploits a graph propagation network to explore graph node interactions.
arXiv Detail & Related papers (2020-09-20T15:05:29Z)
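A small sketch of the statistical label graph mentioned in the entry above: co-occurrence counts over multi-label annotations become a normalised adjacency matrix that a graph propagation network could consume. Purely illustrative; not KGGR's code.

```python
# Small sketch: turn multi-label annotations into a normalised label
# co-occurrence adjacency matrix (illustrative only; not KGGR's code).
import numpy as np

def cooccurrence_adjacency(label_matrix):
    """label_matrix: (num_images, num_labels) binary multi-label annotations."""
    counts = label_matrix.T @ label_matrix              # (C, C) co-occurrence counts
    occurrences = np.diag(counts).copy()                 # images containing each label
    np.fill_diagonal(counts, 0)                          # drop self co-occurrence
    return counts / np.maximum(occurrences[:, None], 1)  # P(label_j | label_i)

labels = (np.random.rand(1000, 20) > 0.8).astype(float)
adjacency = cooccurrence_adjacency(labels)               # input to a graph propagation net
```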
- RGB-based Semantic Segmentation Using Self-Supervised Depth Pre-Training [77.62171090230986]
We propose an easily scalable and self-supervised technique that can be used to pre-train any semantic RGB segmentation method.
In particular, our pre-training approach makes use of automatically generated labels that can be obtained using depth sensors.
We show how our proposed self-supervised pre-training with HN-labels can be used to replace ImageNet pre-training.
arXiv Detail & Related papers (2020-02-06T11:16:24Z)
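As an illustration of depth-derived supervision in the spirit of the entry above, one plausible recipe is to estimate surface normals from the depth map and quantise them into discrete pseudo-classes; this recipe is an assumption made for the sketch, not necessarily what the paper's HN-labels are.

```python
# Illustrative pseudo-label generation from depth: estimate surface normals by
# finite differences and quantise them into discrete classes. One plausible
# depth-derived signal; not necessarily the paper's HN-labels.
import numpy as np

def normal_pseudo_labels(depth, bins=8):
    """depth: (H, W) metric depth map; returns (H, W) integer pseudo-labels."""
    dzdx = np.gradient(depth, axis=1)
    dzdy = np.gradient(depth, axis=0)
    normals = np.stack([-dzdx, -dzdy, np.ones_like(depth)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    # quantise the in-plane normal direction into a few pseudo-classes
    angle = np.arctan2(normals[..., 1], normals[..., 0])
    return ((angle + np.pi) / (2 * np.pi) * bins).astype(int) % bins

depth = np.random.rand(480, 640) * 5.0
pseudo = normal_pseudo_labels(depth)      # free supervision for pre-training
```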
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.