The Right Spin: Learning Object Motion from Rotation-Compensated Flow Fields
- URL: http://arxiv.org/abs/2203.00115v1
- Date: Mon, 28 Feb 2022 22:05:09 GMT
- Title: The Right Spin: Learning Object Motion from Rotation-Compensated Flow Fields
- Authors: Pia Bideau, Erik Learned-Miller, Cordelia Schmid, Karteek Alahari
- Abstract summary: How humans perceive moving objects is a longstanding research question in computer vision.
One approach to the problem is to teach a deep network to model all of these effects.
We present a novel probabilistic model to estimate the camera's rotation given the motion field.
- Score: 61.664963331203666
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Both a good understanding of geometrical concepts and a broad familiarity
with objects lead to our excellent perception of moving objects. The human
ability to detect and segment moving objects works in the presence of multiple
objects, complex background geometry, motion of the observer and even
camouflage. How humans perceive moving objects so reliably is a longstanding
research question in computer vision and borrows findings from related areas
such as psychology, cognitive science and physics. One approach to the problem
is to teach a deep network to model all of these effects. This contrasts with
the strategy used by human vision, where cognitive processes and body design
are tightly coupled and each is responsible for certain aspects of correctly
identifying moving objects. Similarly from the computer vision perspective,
there is evidence that classical, geometry-based techniques are better suited
to the "motion-based" parts of the problem, while deep networks are more
suitable for modeling appearance. In this work, we argue that the coupling of
camera rotation and camera translation can create complex motion fields that
are difficult for a deep network to untangle directly. We present a novel
probabilistic model to estimate the camera's rotation given the motion field.
We then rectify the flow field to obtain a rotation-compensated motion field
for subsequent segmentation. This strategy of first estimating camera motion,
and then allowing a network to learn the remaining parts of the problem, yields
improved results on the widely used DAVIS benchmark as well as the recently
published motion segmentation data set MoCA (Moving Camouflaged Animals).
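As a rough illustration of the rotation-compensation idea described in the abstract, the sketch below fits a small camera rotation to a dense optical-flow field by least squares and then subtracts the induced rotational flow. This is only a simplified stand-in for the paper's probabilistic rotation model: the instantaneous motion-field equations, the known intrinsics (focal length f, principal point cx, cy), and the plain least-squares fit are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch (NumPy only): estimate a small camera rotation from a dense
# flow field and remove the purely rotational flow, leaving a
# rotation-compensated field for subsequent motion segmentation.
import numpy as np

def rotational_flow(omega, xs, ys, f):
    """Flow (in pixels) induced by a small rotation omega = (wx, wy, wz).

    xs, ys are normalized image coordinates ((px - cx) / f, (py - cy) / f).
    Uses the standard instantaneous motion-field equations under one common
    sign convention; conventions differ across textbooks.
    """
    wx, wy, wz = omega
    u = wx * xs * ys - wy * (1.0 + xs**2) + wz * ys
    v = wx * (1.0 + ys**2) - wy * xs * ys - wz * xs
    return f * u, f * v

def estimate_rotation(flow, f, cx, cy):
    """Least-squares fit of (wx, wy, wz) to a dense flow field.

    flow: (H, W, 2) array of pixel displacements (u, v).
    Translational parallax and independently moving objects act as noise in
    this fit; the paper's probabilistic model handles them more carefully.
    """
    h, w = flow.shape[:2]
    px, py = np.meshgrid(np.arange(w), np.arange(h))
    xs = (px - cx) / f
    ys = (py - cy) / f

    # Each pixel contributes two linear equations in (wx, wy, wz).
    A_u = np.stack([xs * ys, -(1.0 + xs**2), ys], axis=-1)   # u-component rows
    A_v = np.stack([1.0 + ys**2, -xs * ys, -xs], axis=-1)    # v-component rows
    A = np.concatenate([A_u.reshape(-1, 3), A_v.reshape(-1, 3)]) * f
    b = np.concatenate([flow[..., 0].ravel(), flow[..., 1].ravel()])

    omega, *_ = np.linalg.lstsq(A, b, rcond=None)
    return omega

def compensate_rotation(flow, f, cx, cy):
    """Return the flow field with the estimated rotational component removed."""
    h, w = flow.shape[:2]
    omega = estimate_rotation(flow, f, cx, cy)
    px, py = np.meshgrid(np.arange(w), np.arange(h))
    xs, ys = (px - cx) / f, (py - cy) / f
    u_rot, v_rot = rotational_flow(omega, xs, ys, f)
    return flow - np.stack([u_rot, v_rot], axis=-1)
```

In the rotation-compensated field produced this way, camera rotation no longer contributes to the motion, so the remaining flow is dominated by translational parallax and independent object motion, which is what the segmentation network is then left to model.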
Related papers
- Motion Segmentation from a Moving Monocular Camera [3.115818438802931]
We take advantage of two popular branches of monocular motion segmentation approaches: point trajectory based and optical flow based methods.
We are able to model various complex object motions in different scene structures at once.
Our method shows state-of-the-art performance on the KT3DMoSeg dataset.
arXiv Detail & Related papers (2023-09-24T22:59:05Z)
- InstMove: Instance Motion for Object-centric Video Segmentation [70.16915119724757]
In this work, we study the instance-level motion and present InstMove, which stands for Instance Motion for Object-centric Video.
In comparison to pixel-wise motion, InstMove mainly relies on instance-level motion information that is free from image feature embeddings.
With only a few lines of code, InstMove can be integrated into current SOTA methods for three different video segmentation tasks.
arXiv Detail & Related papers (2023-03-14T17:58:44Z)
- Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns [92.80981308407098]
We propose a new approach to learn to segment multiple image objects without manual supervision.
The method can extract objects from still images, but uses videos for supervision.
We show state-of-the-art unsupervised object segmentation performance on simulated and real-world benchmarks.
arXiv Detail & Related papers (2022-10-21T17:57:05Z)
- NeuralDiff: Segmenting 3D objects that move in egocentric videos [92.95176458079047]
We study the problem of decomposing the observed 3D scene into a static background and a dynamic foreground.
This task is reminiscent of the classic background subtraction problem, but is significantly harder because all parts of the scene, static and dynamic, generate a large apparent motion.
In particular, we consider egocentric videos and further separate the dynamic component into objects and the actor that observes and moves them.
arXiv Detail & Related papers (2021-10-19T12:51:35Z)
- Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task.
We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z)
- JOKR: Joint Keypoint Representation for Unsupervised Cross-Domain Motion Retargeting [53.28477676794658]
Unsupervised motion retargeting in videos has seen substantial advancements through the use of deep neural networks.
We introduce JOKR - a JOint Keypoint Representation that handles both the source and target videos, without requiring any object prior or data collection.
We evaluate our method both qualitatively and quantitatively, and demonstrate that our method handles various cross-domain scenarios, such as different animals, different flowers, and humans.
arXiv Detail & Related papers (2021-06-17T17:32:32Z)
- Learning Object Depth from Camera Motion and Video Object Segmentation [43.81711115175958]
This paper addresses the problem of learning to estimate the depth of segmented objects given some measurement of camera motion.
We create artificial object segmentations that are scaled for changes in distance between the camera and object, and our network learns to estimate object depth even with segmentation errors.
We demonstrate our approach across domains using a robot camera to locate objects from the YCB dataset and a vehicle camera to locate obstacles while driving.
arXiv Detail & Related papers (2020-07-11T03:50:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.