Baking in the Feature: Accelerating Volumetric Segmentation by Rendering
Feature Maps
- URL: http://arxiv.org/abs/2209.12744v1
- Date: Mon, 26 Sep 2022 14:52:10 GMT
- Authors: Kenneth Blomqvist, Lionel Ott, Jen Jen Chung, Roland Siegwart
- Abstract summary: We propose to use features extracted with models trained on large existing datasets to improve segmentation performance.
We bake this feature representation into a Neural Radiance Field (NeRF) by volumetrically rendering feature maps and supervising on features extracted from each input image.
Our experiments show that our method achieves higher segmentation accuracy with fewer semantic annotations than existing methods over a wide range of scenes.
- Score: 42.34064154798376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Methods have recently been proposed that densely segment 3D volumes into
classes using only color images and expert supervision in the form of sparse
semantically annotated pixels. While impressive, these methods still require a
relatively large amount of supervision and segmenting an object can take
several minutes in practice. Such systems typically only optimize their
representation on the particular scene they are fitting, without leveraging any
prior information from previously seen images. In this paper, we propose to use
features extracted with models trained on large existing datasets to improve
segmentation performance. We bake this feature representation into a Neural
Radiance Field (NeRF) by volumetrically rendering feature maps and supervising
on features extracted from each input image. We show that by baking this
representation into the NeRF, we make the subsequent classification task much
easier. Our experiments show that our method achieves higher segmentation
accuracy with fewer semantic annotations than existing methods over a wide
range of scenes.
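The abstract's idea of volumetrically rendering feature maps can be sketched for a single ray as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard NeRF quadrature weights and a mean squared error between the rendered feature and the feature extracted from the image, neither of which the abstract specifies. The names `render_features` and `feature_loss` are hypothetical.

```python
import numpy as np


def render_features(sigmas, deltas, features):
    """Composite per-sample features along one ray with NeRF-style weights.

    sigmas:   (N,) non-negative densities at the N ray samples
    deltas:   (N,) distances between consecutive samples
    features: (N, D) feature vector predicted at each sample
    Returns the rendered (D,) feature and the (N,) compositing weights.
    """
    # Opacity contributed by each sample.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: fraction of the ray that reaches sample i unoccluded.
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = transmittance * alphas
    return weights @ features, weights


def feature_loss(rendered, target):
    """Assumed supervision: mean squared error against the extracted feature."""
    return float(np.mean((rendered - target) ** 2))


# Toy ray: empty space for two samples, then an opaque surface.
rng = np.random.default_rng(0)
sigmas = np.array([0.0, 0.0, 50.0, 50.0])
deltas = np.full(4, 0.5)
features = rng.normal(size=(4, 8))

rendered, weights = render_features(sigmas, deltas, features)
loss = feature_loss(rendered, features[2])
```

In this toy ray almost all of the weight falls on the first opaque sample, so the rendered feature is close to that sample's feature and the loss against it is near zero. In a real pipeline the target would come from a pretrained feature extractor applied to the input image.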
Related papers
- View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields [52.08335264414515]
We learn a novel feature field within a Neural Radiance Field (NeRF) representing a 3D scene.
Our method takes view-inconsistent multi-granularity 2D segmentations as input and produces a hierarchy of 3D-consistent segmentations as output.
We evaluate our method and several baselines on synthetic datasets with multi-view images and multi-granular segmentation, showcasing improved accuracy and viewpoint-consistency.
(2024-05-30T04:14:58Z)
- Few-shot Multispectral Segmentation with Representations Generated by Reinforcement Learning [0.0]
We propose a novel approach for improving few-shot segmentation performance on multispectral images using reinforcement learning.
Our methodology involves training an agent to identify the most informative expressions using a small dataset.
Due to the limited length of the expressions, the model receives useful representations without any added risk of overfitting.
(2023-11-20T15:04:16Z)
- Learning Semantic Segmentation with Query Points Supervision on Aerial Images [57.09251327650334]
We present a weakly supervised learning algorithm to train semantic segmentation algorithms.
Our proposed approach performs accurate semantic segmentation and improves efficiency by significantly reducing the cost and time required for manual annotation.
(2023-09-11T14:32:04Z)
- SLiMe: Segment Like Me [24.254744102347413]
We propose SLiMe to segment images at any desired granularity using as few as one annotated sample.
We carried out a knowledge-rich set of experiments examining various design factors and showed that SLiMe outperforms other existing one-shot and few-shot segmentation methods.
(2023-09-06T17:39:05Z)
- Disambiguation of One-Shot Visual Classification Tasks: A Simplex-Based Approach [8.436437583394998]
We present a strategy which aims at detecting the presence of multiple objects in a given shot.
This strategy is based on identifying the corners of a simplex in a high dimensional space.
We show the ability of the proposed method to slightly, yet statistically significantly, improve accuracy in extreme settings.
(2023-01-16T11:37:05Z)
- Self-attention on Multi-Shifted Windows for Scene Segmentation [14.47974086177051]
We explore the effective use of self-attention within multi-scale image windows to learn descriptive visual features.
We propose three different strategies to aggregate these feature maps to decode the feature representation for dense prediction.
Our models achieve very promising performance on four public scene segmentation datasets.
(2022-07-10T07:36:36Z)
- Rethinking Interactive Image Segmentation: Feature Space Annotation [68.8204255655161]
We propose interactive and simultaneous segment annotation from multiple images guided by feature space projection.
We show that our approach can surpass the accuracy of state-of-the-art methods in foreground segmentation datasets.
(2021-01-12T10:13:35Z)
- Saliency-driven Class Impressions for Feature Visualization of Deep Neural Networks [55.11806035788036]
It is advantageous to visualize the features considered to be essential for classification.
Existing visualization methods develop high confidence images consisting of both background and foreground features.
In this work, we propose a saliency-driven approach to visualize discriminative features that are considered most important for a given task.
(2020-07-31T06:11:06Z)
- Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object detail along boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the high and low frequency of the image.
We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
(2020-07-20T12:11:22Z)