Fusing RGBD Tracking and Segmentation Tree Sampling for Multi-Hypothesis
Volumetric Segmentation
- URL: http://arxiv.org/abs/2104.00205v1
- Date: Thu, 1 Apr 2021 02:17:18 GMT
- Title: Fusing RGBD Tracking and Segmentation Tree Sampling for Multi-Hypothesis
Volumetric Segmentation
- Authors: Andrew Price, Kun Huang, Dmitry Berenson
- Abstract summary: Multihypothesis Segmentation Tracking (MST) is a novel method for volumetric segmentation in changing scenes.
Two main innovations allow us to tackle this difficult problem.
We evaluate our method on several cluttered tabletop environments in simulation and reality.
- Score: 6.853379171946806
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite rapid progress in scene segmentation in recent years, 3D segmentation
methods are still limited when there is severe occlusion. The key challenge is
estimating the segment boundaries of (partially) occluded objects, which are
inherently ambiguous when considering only a single frame. In this work, we
propose Multihypothesis Segmentation Tracking (MST), a novel method for
volumetric segmentation in changing scenes, which allows scene ambiguity to be
tracked and our estimates to be adjusted over time as we interact with the
scene. Two main innovations allow us to tackle this difficult problem: 1) A
novel way to sample possible segmentations from a segmentation tree; and 2) A
novel approach to fusing tracking results with multiple segmentation estimates.
These methods allow MST to track the segmentation state over time and
incorporate new information, such as new objects being revealed. We evaluate
our method on several cluttered tabletop environments in simulation and
reality. Our results show that MST outperforms baselines in all tested scenes.
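To make the first innovation more concrete, the sketch below illustrates how segmentation hypotheses can be drawn as random cuts through a segmentation tree, where each cut partitions the scene into segments at different granularities. The tree layout, node names, and the `p_split` sampling rule are illustrative assumptions for exposition only, not the sampling procedure defined in the paper.

```python
import random
from dataclasses import dataclass, field
from typing import List

@dataclass
class SegNode:
    """A node in a hierarchical segmentation tree: the root covers the whole
    scene and each node's children partition its region into finer segments."""
    region: str
    children: List["SegNode"] = field(default_factory=list)

def sample_segmentation(node: SegNode, p_split: float = 0.5) -> List[SegNode]:
    """Draw one segmentation hypothesis as a random 'cut' through the tree.

    At every internal node we either keep the node as a single segment or
    descend into its children, so the returned nodes always form a partition
    of the scene. `p_split` is a hypothetical knob for this sketch, not a
    parameter from the paper.
    """
    if not node.children or random.random() > p_split:
        return [node]  # keep this region as one coarse segment
    segments: List[SegNode] = []
    for child in node.children:
        segments.extend(sample_segmentation(child, p_split))
    return segments

# Toy example: a tabletop scene where two stacked objects may or may not be
# split apart, mirroring the ambiguity the abstract describes under occlusion.
tree = SegNode("scene", [
    SegNode("table"),
    SegNode("stack", [SegNode("box"), SegNode("mug")]),
])
for hypothesis in (sample_segmentation(tree) for _ in range(3)):
    print([seg.region for seg in hypothesis])
```

Tracking results could then, in principle, be used to score or re-weight such hypotheses over time, which is the role of the paper's second innovation (fusing tracking with multiple segmentation estimates).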
Related papers
- Image Segmentation in Foundation Model Era: A Survey [99.19456390358211]
Current research in image segmentation lacks a detailed analysis of the distinct characteristics, challenges, and solutions associated with foundation models (FMs).
This survey seeks to fill this gap by providing a thorough review of cutting-edge research centered around FM-driven image segmentation.
An exhaustive overview of over 300 segmentation approaches is provided to encapsulate the breadth of current research efforts.
arXiv Detail & Related papers (2024-08-23T10:07:59Z)
- Temporal Segment Transformer for Action Segmentation [54.25103250496069]
We propose an attention-based approach, which we call the temporal segment transformer, for joint segment relation modeling and denoising.
The main idea is to denoise segment representations using attention between segment and frame representations, and also use inter-segment attention to capture temporal correlations between segments.
We show that this novel architecture achieves state-of-the-art accuracy on the popular 50Salads, GTEA and Breakfast benchmarks.
arXiv Detail & Related papers (2023-02-25T13:05:57Z)
- A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos [17.712793578388126]
We take a closer look at Procedure Segmentation and Summarization (PSS) and propose three fundamental improvements over current methods.
We propose a new segmentation metric based on dynamic programming that takes into account the order of segments.
We propose a matching algorithm that constrains the temporal order of segment mapping, and is also differentiable.
arXiv Detail & Related papers (2022-09-30T14:44:19Z)
- SOLO: A Simple Framework for Instance Segmentation [84.00519148562606]
"instance categories" assigns categories to each pixel within an instance according to the instance's location.
"SOLO" is a simple, direct, and fast framework for instance segmentation with strong performance.
Our approach achieves state-of-the-art results for instance segmentation in terms of both speed and accuracy.
arXiv Detail & Related papers (2021-06-30T09:56:54Z)
- Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation [95.74244714914052]
Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes.
We propose the Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information online.
PCAN outperforms current video instance tracking and segmentation competition winners on the YouTube-VIS and BDD100K datasets.
arXiv Detail & Related papers (2021-06-22T17:57:24Z)
- Contextual Guided Segmentation Framework for Semi-supervised Video Instance Segmentation [20.174393465900156]
We propose the Contextual Guided Segmentation (CGS) framework for video instance segmentation in three passes.
In the first pass, i.e., preview segmentation, we propose Instance Re-Identification Flow to estimate main properties of each instance.
In the second pass, i.e., contextual segmentation, we introduce multiple contextual segmentation schemes.
Experiments conducted on the DAVIS Test-Challenge dataset demonstrate the effectiveness of our proposed framework.
arXiv Detail & Related papers (2021-06-07T04:16:50Z)
- Exposing Semantic Segmentation Failures via Maximum Discrepancy Competition [102.75463782627791]
We take steps toward answering the question by exposing failures of existing semantic segmentation methods in the open visual world.
Inspired by previous research on model falsification, we start from an arbitrarily large image set, and automatically sample a small image set by MAximizing the Discrepancy (MAD) between two segmentation methods.
The selected images have the greatest potential in falsifying either (or both) of the two methods.
A segmentation method whose failures are more difficult to expose in the MAD competition is considered better.
arXiv Detail & Related papers (2021-02-27T16:06:25Z)
- A Three-Stage Self-Training Framework for Semi-Supervised Semantic Segmentation [0.9786690381850356]
We propose a holistic solution framed as a three-stage self-training framework for semantic segmentation.
The key idea of our technique is the extraction of statistical information from the pseudo-masks.
We then decrease the uncertainty of the pseudo-masks using a multi-task model that enforces consistency.
arXiv Detail & Related papers (2020-12-01T21:00:27Z)
- MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation [87.16030562892537]
We propose a multi-stage architecture for the temporal action segmentation task.
The first stage generates an initial prediction that is refined by the next ones.
Our models achieve state-of-the-art results on three datasets.
arXiv Detail & Related papers (2020-06-16T14:50:47Z)