HiMo: High-Speed Objects Motion Compensation in Point Clouds
- URL: http://arxiv.org/abs/2503.00803v2
- Date: Sun, 27 Apr 2025 22:43:14 GMT
- Title: HiMo: High-Speed Objects Motion Compensation in Point Clouds
- Authors: Qingwen Zhang, Ajinkya Khoche, Yi Yang, Li Ling, Sina Sharif Mansouri, Olov Andersson, Patric Jensfelt
- Abstract summary: HiMo is a pipeline that repurposes scene flow estimation for non-ego motion compensation. SeFlow++ is a real-time scene flow estimator that achieves state-of-the-art performance on both scene flow and motion compensation. Our findings show that HiMo improves the geometric consistency and visual fidelity of dynamic objects in LiDAR point clouds.
- Score: 18.617901304679812
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: LiDAR point clouds are essential for autonomous vehicles, but motion distortions from dynamic objects degrade data quality. While previous work has considered distortions caused by ego motion, distortions caused by other moving objects remain largely overlooked, leading to errors in object shape and position. This distortion is particularly pronounced in high-speed environments such as highways and in multi-LiDAR configurations, a common setup for heavy vehicles. To address this challenge, we introduce HiMo, a pipeline that repurposes scene flow estimation for non-ego motion compensation, correcting the representation of dynamic objects in point clouds. During the development of HiMo, we observed that existing self-supervised scene flow estimators often produce degenerate or inconsistent estimates under high-speed distortion. We further propose SeFlow++, a real-time scene flow estimator that achieves state-of-the-art performance on both scene flow and motion compensation. Since well-established motion distortion metrics are absent in the literature, we introduce two evaluation metrics: compensation accuracy at the point level and shape similarity of objects. We validate HiMo through extensive experiments on Argoverse 2, ZOD, and a newly collected real-world dataset featuring highway driving and multi-LiDAR-equipped heavy vehicles. Our findings show that HiMo improves the geometric consistency and visual fidelity of dynamic objects in LiDAR point clouds, benefiting downstream tasks such as semantic segmentation and 3D detection. See https://kin-zhang.github.io/HiMo for more details.
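To make the idea of repurposing scene flow for motion compensation concrete, here is a minimal sketch under a constant-velocity assumption. This is an illustration, not HiMo's actual implementation: the function name, the `sweep_period` default, and the per-point timestamp convention are all assumptions.

```python
import numpy as np

def compensate_non_ego_motion(points, timestamps, flow, t_ref, sweep_period=0.1):
    """Shift each point to where it would have been at time t_ref.

    points:       (N, 3) LiDAR points from one sweep, already ego-motion compensated
    timestamps:   (N,) per-point capture times in seconds
    flow:         (N, 3) estimated scene flow over one sweep period, in meters
    t_ref:        reference time to which all points are aligned
    sweep_period: sweep duration in seconds (assumed 0.1 s for a 10 Hz LiDAR)
    """
    velocity = flow / sweep_period        # (N, 3) m/s under a constant-velocity model
    dt = (t_ref - timestamps)[:, None]    # (N, 1) per-point time offset to t_ref
    return points + velocity * dt         # (N, 3) corrected points
```

Points captured early in the sweep are pushed forward along their estimated motion and late points are pulled back, which is what restores a moving object's shape; the paper's two metrics would then score how close each corrected point lands to its true position and how well the corrected object shape matches the undistorted one.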
Related papers
- C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation [81.4106601222722]
Trajectory-based motion control has emerged as an intuitive and efficient approach for controllable video generation.
We propose a Chain-of-Thought-based motion controller for controllable video generation, named C-Drag.
Our method includes an object perception module and a Chain-of-Thought-based motion reasoning module.
arXiv Detail & Related papers (2025-02-27T08:21:03Z)
- Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories [28.701879490459675]
We aim to learn an implicit motion field parameterized by a neural network to predict the movement of novel points within the same domain.
We exploit the intrinsic regularization provided by SIREN, and modify the input layer to produce a temporally smooth motion field.
Our experiments assess the model's performance in predicting unseen point trajectories and its application in temporal mesh alignment with deformation.
arXiv Detail & Related papers (2024-06-05T21:02:10Z)
- Instantaneous Perception of Moving Objects in 3D [86.38144604783207]
The perception of 3D motion of surrounding traffic participants is crucial for driving safety.
We propose to leverage local occupancy completion of object point clouds to densify the shape cue, and mitigate the impact of swimming artifacts.
Extensive experiments demonstrate superior performance compared to standard 3D motion estimation approaches.
arXiv Detail & Related papers (2024-05-05T01:07:24Z)
- DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z)
- The Drunkard's Odometry: Estimating Camera Motion in Deforming Scenes [79.00228778543553]
This dataset is the first large set of exploratory camera trajectories with ground truth inside 3D scenes.
Simulations in realistic 3D buildings let us obtain a vast amount of data and ground-truth labels.
We present a novel deformable odometry method, dubbed the Drunkard's Odometry, which decomposes optical flow estimates into rigid-body camera motion.
arXiv Detail & Related papers (2023-06-29T13:09:31Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- Aligning Bird-Eye View Representation of Point Cloud Sequences using Scene Flow [0.0]
Low-resolution point clouds are challenging for object detection methods due to their sparsity.
We develop a plug-in module that enables single-frame detectors to compute scene flow to rectify their Bird-Eye View representation.
arXiv Detail & Related papers (2023-05-04T15:16:21Z)
- DEFLOW: Self-supervised 3D Motion Estimation of Debris Flow [19.240172015210586]
We propose DEFLOW, a model for 3D motion estimation of debris flows.
We adopt a novel multi-level sensor fusion architecture and self-supervision to incorporate the inductive biases of the scene.
Our model achieves state-of-the-art optical flow and depth estimation on our dataset, and fully automates the motion estimation for debris flows.
arXiv Detail & Related papers (2023-04-05T16:40:14Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking [32.32109475782992]
We show that a simple motion model can obtain state-of-the-art tracking performance without other cues like appearance.
We thus name the proposed method Observation-Centric SORT, or OC-SORT for short.
arXiv Detail & Related papers (2022-03-27T17:57:08Z)
- Lidar with Velocity: Motion Distortion Correction of Point Clouds from Oscillating Scanning Lidars [5.285472406047901]
Lidar point cloud distortion from moving objects is an important problem in autonomous driving.
Gaussian-based lidar and camera fusion is proposed to estimate the full velocity and correct the lidar distortion.
The framework is evaluated on real road data, and the fusion method outperforms the traditional ICP-based and point-cloud-only methods.
arXiv Detail & Related papers (2021-11-18T03:13:08Z)
- LiMoSeg: Real-time Bird's Eye View based LiDAR Motion Segmentation [8.184561295177623]
This paper proposes a novel real-time architecture for motion segmentation of Light Detection and Ranging (LiDAR) data.
We use two successive scans of LiDAR data in a 2D Bird's Eye View representation to perform pixel-wise classification as static or moving (the BEV rasterization step is sketched after this entry).
We demonstrate a low latency of 8 ms on a commonly used automotive embedded platform, namely Nvidia Jetson Xavier.
arXiv Detail & Related papers (2021-11-08T23:40:55Z)
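For readers unfamiliar with the representation, a BEV motion-segmentation pipeline first rasterizes each scan into a top-down grid. The sketch below is a generic rasterizer with illustrative ranges and cell size, not LiMoSeg's actual configuration.

```python
import numpy as np

def points_to_bev(points, x_range=(-50, 50), y_range=(-50, 50), cell=0.2):
    """Rasterize a LiDAR point cloud into a 2D Bird's Eye View occupancy grid.

    points: (N, 3) array of x, y, z coordinates in the ego frame
    Returns an (H, W) grid where each cell counts the points falling in it.
    """
    h = int((y_range[1] - y_range[0]) / cell)
    w = int((x_range[1] - x_range[0]) / cell)
    grid = np.zeros((h, w), dtype=np.float32)
    # Keep only points inside the grid bounds.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    xs = ((points[mask, 0] - x_range[0]) / cell).astype(int)
    ys = ((points[mask, 1] - y_range[0]) / cell).astype(int)
    np.add.at(grid, (ys, xs), 1.0)  # accumulate point counts per cell
    return grid
```

Two consecutive grids built this way can then be stacked channel-wise and classified per pixel as static or moving.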
- Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task.
We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z)
- Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather [92.84066576636914]
This work addresses the challenging task of LiDAR-based 3D object detection in foggy weather.
We tackle this problem by simulating physically accurate fog into clear-weather scenes.
We are the first to provide strong 3D object detection baselines on the Seeing Through Fog dataset.
arXiv Detail & Related papers (2021-08-11T14:37:54Z)
- Scalable Scene Flow from Point Clouds in the Real World [30.437100097997245]
We introduce a new large-scale benchmark for scene flow based on the Waymo Open Dataset.
We show how previous works were bounded by the amount of real LiDAR data available.
We introduce the model architecture FastFlow3D that provides real-time inference on the full point cloud.
arXiv Detail & Related papers (2021-03-01T20:56:05Z)
- A Flow Base Bi-path Network for Cross-scene Video Crowd Understanding in Aerial View [93.23947591795897]
In this paper, we strive to tackle the challenges and automatically understand the crowd from the visual data collected from drones.
To alleviate the background noise generated in cross-scene testing, a double-stream crowd counting model is proposed.
To tackle the crowd density estimation problem under extremely dark environments, we introduce synthetic data generated by the game Grand Theft Auto V (GTA V).
arXiv Detail & Related papers (2020-09-29T01:48:24Z)
- Any Motion Detector: Learning Class-agnostic Scene Dynamics from a Sequence of LiDAR Point Clouds [4.640835690336654]
We propose a novel real-time approach of temporal context aggregation for motion detection and motion parameters estimation.
We introduce an ego-motion compensation layer to achieve real-time inference with performance comparable to a naive odometric transform of the original point cloud sequence (this baseline is sketched below).
arXiv Detail & Related papers (2020-04-24T10:40:07Z)
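For contrast with HiMo's non-ego compensation, the "naive odometric transform" baseline mentioned in the last entry simply re-expresses every sweep in a single ego frame using the vehicle's pose estimates. A minimal sketch, assuming sensor-to-world pose matrices are available; the function name and arguments are illustrative.

```python
import numpy as np

def ego_compensate(sweeps, poses, target_idx=-1):
    """Naive odometric transform: express every sweep in one ego frame.

    sweeps: list of (N_i, 3) point arrays, one per LiDAR sweep
    poses:  list of (4, 4) sensor-to-world pose matrices, one per sweep
    """
    world_to_target = np.linalg.inv(poses[target_idx])
    aligned = []
    for pts, pose in zip(sweeps, poses):
        rel = world_to_target @ pose  # sweep frame -> target sweep frame
        homo = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coords
        aligned.append((homo @ rel.T)[:, :3])
    return aligned
```

Note that this corrects only the sensor's own motion; the distortions caused by other moving objects, which HiMo targets, remain in the data.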
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.