MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory
- URL: http://arxiv.org/abs/2311.01556v1
- Date: Thu, 2 Nov 2023 19:18:34 GMT
- Title: MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory
- Authors: Enxu Li, Sergio Casas, Raquel Urtasun
- Abstract summary: We propose a novel framework for semantic segmentation of a temporal sequence of LiDAR point clouds.
We use a memory network to store, update and retrieve past information.
Our framework also includes a regularizer that penalizes prediction variations in the neighborhood of the point cloud.
- Score: 43.47217183838879
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation of LiDAR point clouds has been widely studied in recent
years, with most existing methods focusing on tackling this task using a single
scan of the environment. However, leveraging the temporal stream of
observations can provide very rich contextual information on regions of the
scene with poor visibility (e.g., occlusions) or sparse observations (e.g., at
long range), and can help reduce redundant computation frame after frame. In
this paper, we tackle the challenge of exploiting the information from the past
frames to improve the predictions of the current frame in an online fashion. To
address this challenge, we propose a novel framework for semantic segmentation
of a temporal sequence of LiDAR point clouds that utilizes a memory network to
store, update and retrieve past information. Our framework also includes a
regularizer that penalizes prediction variations in the neighborhood of the
point cloud. Prior works have attempted to incorporate memory in range view
representations for semantic segmentation, but these methods fail to handle
occlusions and the range view representation of the scene changes drastically
as agents nearby move. Our proposed framework overcomes these limitations by
building a sparse 3D latent representation of the surroundings. We evaluate our
method on SemanticKITTI, nuScenes, and PandaSet. Our experiments demonstrate
the effectiveness of the proposed framework compared to the state-of-the-art.
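The abstract describes two mechanisms: a sparse 3D latent memory that is stored, updated, and retrieved across frames, and a regularizer that penalizes prediction variation within a point's neighborhood. As a rough illustration only, the PyTorch-style sketch below shows one way a voxel-keyed latent memory and a neighborhood-consistency penalty could be wired together; the names (VoxelLatentMemory, neighborhood_consistency_loss), the gated blending rule, and the squared-difference penalty are assumptions made for this sketch and are not taken from the paper.
```python
# Hypothetical sketch of an online latent memory over a sparse voxel grid plus a
# neighborhood-consistency regularizer, loosely following the ideas in the abstract.
# All names and update rules are illustrative assumptions, not the authors' code.
import torch


class VoxelLatentMemory:
    """Stores one latent vector per occupied voxel, keyed by integer voxel coordinates."""

    def __init__(self, voxel_size: float, feat_dim: int):
        self.voxel_size = voxel_size
        self.feat_dim = feat_dim
        self.memory = {}  # (i, j, k) -> latent tensor of shape (feat_dim,)

    def _keys(self, points: torch.Tensor):
        # points: (N, 3) in a fixed world frame (ego motion already compensated).
        return [tuple(v.tolist()) for v in torch.floor(points / self.voxel_size).long()]

    def retrieve(self, points: torch.Tensor) -> torch.Tensor:
        """Fetch the stored latent for each point's voxel (zeros if unseen so far)."""
        out = torch.zeros(points.shape[0], self.feat_dim)
        for n, key in enumerate(self._keys(points)):
            if key in self.memory:
                out[n] = self.memory[key]
        return out

    def update(self, points: torch.Tensor, feats: torch.Tensor, gate: torch.Tensor):
        """Blend current per-point features into the memory with a gate in [0, 1]."""
        for n, key in enumerate(self._keys(points)):
            prev = self.memory.get(key, torch.zeros(self.feat_dim))
            self.memory[key] = gate[n] * feats[n] + (1.0 - gate[n]) * prev


def neighborhood_consistency_loss(points: torch.Tensor,
                                  logits: torch.Tensor,
                                  k: int = 8) -> torch.Tensor:
    """Penalize prediction variation among each point's k nearest neighbors."""
    dists = torch.cdist(points, points)                    # (N, N) pairwise distances
    knn = dists.topk(k + 1, largest=False).indices[:, 1:]  # drop the point itself
    probs = logits.softmax(dim=-1)                         # (N, C) class probabilities
    neighbor_probs = probs[knn]                            # (N, k, C)
    # Simple choice: mean squared difference between each point and its neighbors.
    return ((probs.unsqueeze(1) - neighbor_probs) ** 2).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    mem = VoxelLatentMemory(voxel_size=0.5, feat_dim=16)
    pts = torch.rand(128, 3) * 10.0            # toy point cloud for one frame
    feats = torch.randn(128, 16)               # per-point features from a backbone
    gate = torch.sigmoid(torch.randn(128, 1))  # per-point update gate

    past = mem.retrieve(pts)                   # latent context from earlier frames
    fused = feats + past                       # naive fusion, for illustration only
    mem.update(pts, fused, gate)

    logits = torch.randn(128, 20)              # e.g. 20 semantic classes
    print(neighborhood_consistency_loss(pts, logits).item())
```
In a real system the memory would live in sparse GPU tensors and the fusion/update would be learned (e.g., a recurrent cell); the Python dictionary and brute-force k-NN above are only there to keep the sketch self-contained.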
Related papers
- Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization [60.899082019130766]
We introduce a frame-level detection network (FDN) and a proposal refinement network (PRN) for audio temporal forgery detection and localization.
FDN aims to mine informative inconsistency cues between real and fake frames to obtain discriminative features that are beneficial for roughly indicating forgery regions.
PRN is responsible for predicting confidence scores and regression offsets to refine the coarse-grained proposals derived from the FDN.
arXiv Detail & Related papers (2024-07-23T15:07:52Z)
- Multi-modality Affinity Inference for Weakly Supervised 3D Semantic Segmentation [47.81638388980828]
We propose a simple yet effective scene-level weakly supervised point cloud segmentation method with a newly introduced multi-modality point affinity inference module.
Our method outperforms the state-of-the-art by 4% to 6% mIoU on the ScanNet and S3DIS benchmarks.
arXiv Detail & Related papers (2023-12-27T14:01:35Z)
- ContrastMotion: Self-supervised Scene Motion Learning for Large-Scale LiDAR Point Clouds [21.6511040107249]
We propose a novel self-supervised motion estimator for LiDAR-based autonomous driving via BEV representation.
We predict scene motion via feature-level consistency between pillars in consecutive frames, which can eliminate the effect caused by noise points and view-changing point clouds in dynamic scenes.
arXiv Detail & Related papers (2023-04-25T05:46:24Z)
- Rethinking Range View Representation for LiDAR Segmentation [66.73116059734788]
"Many-to-one" mapping, semantic incoherence, and shape deformation are possible impediments against effective learning from range view projections.
We present RangeFormer, a full-cycle framework comprising novel designs across network architecture, data augmentation, and post-processing.
We show that, for the first time, a range view method is able to surpass the point, voxel, and multi-view fusion counterparts in the competing LiDAR semantic and panoptic segmentation benchmarks.
arXiv Detail & Related papers (2023-03-09T16:13:27Z)
- Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking [11.950994311766898]
We introduce the large-scale Panoptic nuScenes benchmark dataset that extends our popular nuScenes dataset.
We analyze the drawbacks of the existing metrics for panoptic tracking and propose the novel instance-centric PAT metric.
We believe that this extension will accelerate the research of novel methods for scene understanding of dynamic urban environments.
arXiv Detail & Related papers (2021-09-08T17:45:37Z)
- LiDAR-based Recurrent 3D Semantic Segmentation with Temporal Memory Alignment [0.0]
We propose a recurrent segmentation architecture (RNN), which takes a single range image frame as input.
An alignment strategy, which we call Temporal Memory Alignment, uses ego motion to temporally align the memory between consecutive frames in feature space.
We demonstrate the benefits of the presented approach on two large-scale datasets and compare it to several state-of-the-art methods.
arXiv Detail & Related papers (2021-03-03T09:01:45Z) - Spatiotemporal Graph Neural Network based Mask Reconstruction for Video
Object Segmentation [70.97625552643493]
This paper addresses the task of segmenting class-agnostic objects in semi-supervised setting.
We propose a novel spatiotemporal graph neural network (STG-Net) which captures the local contexts by utilizing all proposals.
arXiv Detail & Related papers (2020-12-10T07:57:44Z) - Panoster: End-to-end Panoptic Segmentation of LiDAR Point Clouds [81.12016263972298]
We present Panoster, a novel proposal-free panoptic segmentation method for LiDAR point clouds.
Unlike previous approaches, Panoster proposes a simplified framework incorporating a learning-based clustering solution to identify instances.
At inference time, this acts as a class-agnostic segmentation, allowing Panoster to be fast, while outperforming prior methods in terms of accuracy.
arXiv Detail & Related papers (2020-10-28T18:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.