MaskFlow: Object-Aware Motion Estimation
- URL: http://arxiv.org/abs/2311.12476v1
- Date: Tue, 21 Nov 2023 09:37:49 GMT
- Title: MaskFlow: Object-Aware Motion Estimation
- Authors: Aria Ahmadi, David R. Walton, Tim Atherton, Cagatay Dikici
- Abstract summary: We introduce a novel motion estimation method, MaskFlow, that is capable of estimating accurate motion fields.
In addition to the lower-level features used in other Deep Neural Network (DNN)-based motion estimation methods, MaskFlow also draws on object-level features and segmentations.
- Score: 0.45646200630189254
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We introduce a novel motion estimation method, MaskFlow, that is
capable of estimating accurate motion fields even in very challenging cases
with small objects, large displacements, and drastic appearance changes. In
addition to the lower-level features used in other Deep Neural Network
(DNN)-based motion estimation methods, MaskFlow draws on object-level features
and segmentations. These features and segmentations are used to approximate
the objects' translational motion fields. We propose a novel and effective way
of incorporating this incomplete translational motion field into a subsequent
motion estimation network for refinement and completion. We also produce a
new, challenging synthetic dataset with motion field ground truth, along with
extra ground truth for the object-instance matchings and corresponding
segmentation masks. We demonstrate that MaskFlow outperforms state-of-the-art
methods when evaluated on our new challenging dataset, whilst still producing
comparable results on the popular FlyingThings3D benchmark dataset.
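The abstract describes approximating each object's translational motion field from its segmentation masks and instance matchings, then passing the incomplete field to a refinement network. A minimal sketch of that first step, under my own assumptions (centroid displacement as the translation estimate, NaN as the "unknown" fill value; the paper's exact construction may differ):

```python
import numpy as np

def translation_flow_from_masks(mask1, mask2):
    """Approximate a translation-only flow field for one matched object.

    mask1, mask2: boolean (H, W) arrays segmenting the same object
    instance in two consecutive frames. The object's motion is
    approximated by the displacement between mask centroids, and that
    single translation vector is written into every pixel the object
    covers in frame 1. Pixels outside the object stay NaN, leaving an
    *incomplete* field for a refinement network to complete.
    """
    ys1, xs1 = np.nonzero(mask1)
    ys2, xs2 = np.nonzero(mask2)
    dx = xs2.mean() - xs1.mean()
    dy = ys2.mean() - ys1.mean()
    flow = np.full(mask1.shape + (2,), np.nan, dtype=np.float32)
    flow[mask1] = (dx, dy)
    return flow

# A 2x2 object translated by (+3, +1) between frames:
m1 = np.zeros((8, 8), dtype=bool); m1[2:4, 1:3] = True
m2 = np.zeros((8, 8), dtype=bool); m2[3:5, 4:6] = True
flow = translation_flow_from_masks(m1, m2)
```

In the full method, one such partial field per matched instance would be merged and handed to the motion estimation network for refinement and completion.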
Related papers
- UnSAMFlow: Unsupervised Optical Flow Guided by Segment Anything Model [12.706915226843401]
UnSAMFlow is an unsupervised flow network that also leverages object information from the recent foundation model Segment Anything Model (SAM).
We analyze the poor gradient landscapes of traditional smoothness losses and propose a new smoothness definition based on homography instead.
Our method produces clear optical flow estimation with sharp boundaries around objects, which outperforms state-of-the-art methods on KITTI and Sintel datasets.
arXiv Detail & Related papers (2024-05-04T08:27:12Z)
- Neuromorphic Vision-based Motion Segmentation with Graph Transformer Neural Network [4.386534439007928]
We propose a novel event-based motion segmentation algorithm using a Graph Transformer Neural Network, dubbed GTNN.
Our proposed algorithm processes event streams as 3D graphs through a series of nonlinear transformations to unveil local and global correlations between events.
We show that GTNN outperforms state-of-the-art methods in the presence of dynamic background variations, motion patterns, and multiple dynamic objects with varying sizes and velocities.
arXiv Detail & Related papers (2024-04-16T22:44:29Z)
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation.
The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects.
Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z)
- AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation [17.501820140334328]
We introduce AnyFlow, a robust network that estimates accurate flow from images of various resolutions.
We establish a new state-of-the-art performance of cross-dataset generalization on the KITTI dataset.
arXiv Detail & Related papers (2023-03-29T07:03:51Z)
- Self-Improving SLAM in Dynamic Environments: Learning When to Mask [5.4310785842119795]
We propose a novel SLAM method that learns when masking objects improves its performance in dynamic scenarios.
We do not assume any priors on motion: our method learns to mask moving objects by itself.
Our method reaches the state of the art on the TUM RGB-D dataset and outperforms it on KITTI and ConsInv datasets.
arXiv Detail & Related papers (2022-10-15T18:06:06Z)
- Neural Motion Fields: Encoding Grasp Trajectories as Implicit Value Functions [65.84090965167535]
We present Neural Motion Fields, a novel object representation which encodes both object point clouds and the relative task trajectories as an implicit value function parameterized by a neural network.
This object-centric representation models a continuous distribution over the SE(3) space and allows us to perform grasping reactively by leveraging sampling-based MPC to optimize this value function.
arXiv Detail & Related papers (2022-06-29T18:47:05Z)
- ImpDet: Exploring Implicit Fields for 3D Object Detection [74.63774221984725]
We introduce a new perspective that views bounding box regression as an implicit function.
This leads to our proposed framework, termed Implicit Detection or ImpDet.
Our ImpDet assigns specific values to points in different local 3D spaces, so that high-quality boundaries can be generated.
arXiv Detail & Related papers (2022-03-31T17:52:12Z)
- DetFlowTrack: 3D Multi-object Tracking based on Simultaneous Optimization of Object Detection and Scene Flow Estimation [23.305159598648924]
We propose a 3D MOT framework based on simultaneous optimization of object detection and scene flow estimation.
For more accurate scene flow labels, especially in the case of motion with rotation, a box-transformation-based scene flow ground truth calculation method is proposed.
Experimental results on the KITTI MOT dataset show competitive performance against the state of the art and robustness under extreme motion with rotation.
arXiv Detail & Related papers (2022-03-04T07:06:47Z)
- Learning to Segment Rigid Motions from Two Frames [72.14906744113125]
We propose a modular network, motivated by a geometric analysis of what independent object motions can be recovered from an egomotion field.
It takes two consecutive frames as input and predicts segmentation masks for the background and multiple rigidly moving objects, which are then parameterized by 3D rigid transformations.
Our method achieves state-of-the-art performance for rigid motion segmentation on KITTI and Sintel.
arXiv Detail & Related papers (2021-01-11T04:20:30Z)
- DOT: Dynamic Object Tracking for Visual SLAM [83.69544718120167]
DOT combines instance segmentation and multi-view geometry to generate masks for dynamic objects.
To determine which objects are actually moving, DOT first segments instances of potentially dynamic objects and then, using the estimated camera motion, tracks them by minimizing the photometric reprojection error.
Our results show that our approach significantly improves the accuracy and robustness of ORB-SLAM 2, especially in highly dynamic scenes.
arXiv Detail & Related papers (2020-09-30T18:36:28Z)
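The DOT entry above decides whether an object is actually moving by checking how well the estimated camera motion alone explains its pixels, via the photometric reprojection error. A minimal sketch of that test under a pinhole model (the function name, nearest-pixel lookup, and mean aggregation are my assumptions, not DOT's actual implementation):

```python
import numpy as np

def mean_photometric_error(I1, I2, depth1, K, T, mask):
    # Back-project each masked pixel of frame 1 with its depth, move it
    # by the estimated camera motion T (4x4 rigid transform), project it
    # into frame 2 with intrinsics K, and compare intensities. A static
    # object is explained by camera motion alone (small residual); a
    # large residual flags the object as dynamic.
    Kinv = np.linalg.inv(K)
    errs = []
    for v, u in zip(*np.nonzero(mask)):
        p = Kinv @ np.array([u, v, 1.0]) * depth1[v, u]  # 3D point in frame-1 camera
        q = K @ (T @ np.append(p, 1.0))[:3]              # project into frame 2
        u2, v2 = int(round(q[0] / q[2])), int(round(q[1] / q[2]))
        if 0 <= v2 < I2.shape[0] and 0 <= u2 < I2.shape[1]:
            errs.append(abs(float(I1[v, u]) - float(I2[v2, u2])))
    return float(np.mean(errs)) if errs else float("inf")

# With identity camera motion and identical frames, a static region
# yields zero error:
I = np.arange(16.0).reshape(4, 4)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
err = mean_photometric_error(I, I, np.ones((4, 4)), np.eye(3), np.eye(4), mask)
```

In a full system this residual would be computed per tracked object and thresholded to decide which masks to exclude from SLAM pose estimation.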
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.