Lidar Panoptic Segmentation and Tracking without Bells and Whistles
- URL: http://arxiv.org/abs/2310.12464v1
- Date: Thu, 19 Oct 2023 04:44:43 GMT
- Title: Lidar Panoptic Segmentation and Tracking without Bells and Whistles
- Authors: Abhinav Agarwalla, Xuhua Huang, Jason Ziglar, Francesco Ferroni, Laura
Leal-Taixé, James Hays, Aljoša Ošep, Deva Ramanan
- Abstract summary: We propose a detection-centric network for lidar segmentation and tracking.
One of the core components of our network is the object instance detection branch.
We evaluate our method on several 3D/4D LPS benchmarks and observe that our model establishes a new state-of-the-art among open-sourced models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art lidar panoptic segmentation (LPS) methods follow a
bottom-up, segmentation-centric approach: they build on semantic segmentation
networks and obtain object instances via clustering. In this paper, we
re-think this approach and propose a surprisingly simple yet effective
detection-centric network for both LPS and tracking. Our network is modular by
design and optimized for all aspects of both the panoptic segmentation and
tracking task. One of the core components of our network is the object instance
detection branch, which we train using point-level (modal) annotations, as
available in segmentation-centric datasets. In the absence of amodal (cuboid)
annotations, we regress modal centroids and object extent using
trajectory-level supervision that provides information about object size, which
cannot be inferred from single scans due to occlusions and the sparse nature of
the lidar data. We obtain fine-grained instance segments by learning to
associate lidar points with detected centroids. We evaluate our method on
several 3D/4D LPS benchmarks and observe that our model establishes a new
state-of-the-art among open-sourced models, outperforming recent query-based
models.
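The abstract's final step, learning to associate lidar points with detected centroids to obtain instance segments, can be illustrated with a simple nearest-centroid assignment. This is a minimal sketch of the general idea, not the paper's actual learned association; the function name, the distance threshold, and the hard nearest-neighbor rule are all illustrative assumptions.

```python
import numpy as np

def associate_points_to_centroids(points, centroids, max_dist=2.0):
    """Assign each lidar point to its nearest detected centroid.

    points:    (N, 3) array of lidar point coordinates
    centroids: (M, 3) array of detected object centroids
    max_dist:  points farther than this from every centroid get id -1
               (treated as background / "stuff")

    Returns an (N,) array of instance ids in {-1, 0, ..., M-1}.
    """
    if len(centroids) == 0:
        return np.full(len(points), -1, dtype=int)
    # Pairwise Euclidean distances between every point and centroid: (N, M)
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=-1)
    ids = dists.argmin(axis=1)
    # Points far from all centroids are left unassigned
    ids[dists.min(axis=1) > max_dist] = -1
    return ids
```

In the paper's learned setting, the hard geometric distance above would be replaced by a trained affinity between points and centroids, but the output format, a per-point instance id, is the same.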
Related papers
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both, object detection, as well as motion-inspired pseudo-labeling, can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z) - Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - Segment Anything Meets Point Tracking [116.44931239508578]
This paper presents a novel method for point-centric interactive video segmentation, empowered by SAM and long-term point tracking.
We highlight the merits of point-based tracking through direct evaluation on the zero-shot open-world Unidentified Video Objects (UVO) benchmark.
Our experiments on popular video object segmentation and multi-object segmentation tracking benchmarks, including DAVIS, YouTube-VOS, and BDD100K, suggest that a point-based segmentation tracker yields better zero-shot performance and efficient interactions.
arXiv Detail & Related papers (2023-07-03T17:58:01Z) - 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D
Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information, employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z) - CPSeg: Cluster-free Panoptic Segmentation of 3D LiDAR Point Clouds [2.891413712995641]
We propose a novel real-time end-to-end panoptic segmentation network for LiDAR point clouds, called CPSeg.
CPSeg comprises a shared encoder, a dual decoder, a task-aware attention module (TAM) and a cluster-free instance segmentation head.
arXiv Detail & Related papers (2021-11-02T16:44:06Z) - Learn to Learn Metric Space for Few-Shot Segmentation of 3D Shapes [17.217954254022573]
We introduce a meta-learning-based method for few-shot 3D shape segmentation where only a few labeled samples are provided for the unseen classes.
We demonstrate the superior performance of our proposed method on the ShapeNet part dataset under the few-shot scenario, compared with well-established baseline and state-of-the-art semi-supervised methods.
arXiv Detail & Related papers (2021-07-07T01:47:00Z) - Learning from Counting: Leveraging Temporal Classification for Weakly
Supervised Object Localization and Detection [4.971083368517706]
We introduce scan-order techniques to serialize 2D images into 1D sequence data.
We then leverage a combined LSTM (Long Short-Term Memory) and CTC network to achieve object localization.
arXiv Detail & Related papers (2021-03-06T02:18:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.