Related papers: Motor Focus: Fast Ego-Motion Prediction for Assistive Visual Navigation

Motor Focus: Fast Ego-Motion Prediction for Assistive Visual Navigation

URL: http://arxiv.org/abs/2404.17031v2
Date: Sat, 12 Oct 2024 21:08:05 GMT
Title: Motor Focus: Fast Ego-Motion Prediction for Assistive Visual Navigation
Authors: Hao Wang, Jiayou Qin, Xiwen Chen, Ashish Bastola, John Suchanek, Zihao Gong, Abolfazl Razi,
Abstract summary: Motor Focus is an image-based framework that predicts the observer's motion direction based on their visual feeds. Our framework demonstrates its superiority in speed (> 40FPS), accuracy (MAE = 60pixels), and robustness (SNR = 23dB)
Score: 3.837186701755568
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Assistive visual navigation systems for visually impaired individuals have become increasingly popular thanks to the rise of mobile computing. Most of these devices work by translating visual information into voice commands. In complex scenarios where multiple objects are present, it is imperative to prioritize object detection and provide immediate notifications for key entities in specific directions. This brings the need for identifying the observer's motion direction (ego-motion) by merely processing visual information, which is the key contribution of this paper. Specifically, we introduce Motor Focus, a lightweight image-based framework that predicts the ego-motion - the humans (and humanoid machines) movement intentions based on their visual feeds, while filtering out camera motion without any camera calibration. To this end, we implement an optical flow-based pixel-wise temporal analysis method to compensate for the camera motion with a Gaussian aggregation to smooth out the movement prediction area. Subsequently, to evaluate the performance, we collect a dataset including 50 clips of pedestrian scenes in 5 different scenarios. We tested this framework with classical feature detectors such as SIFT and ORB to show the comparison. Our framework demonstrates its superiority in speed (> 40FPS), accuracy (MAE = 60pixels), and robustness (SNR = 23dB), confirming its potential to enhance the usability of vision-based assistive navigation tools in complex environments.

Related papers

Segment Any Motion in Videos [80.72424676419755]
We propose a novel approach for moving object segmentation that combines long-range trajectory motion cues with DINO-based semantic features. Our model employs Spatio-Temporal Trajectory Attention and Motion-Semantic Decoupled Embedding to prioritize motion while integrating semantic support.
arXiv Detail & Related papers (2025-03-28T09:34:11Z)
ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking [4.250337979548885]
We propose a motion-based MOT approach with an enhanced temporal motion predictor, ETTrack. Specifically, the motion predictor integrates a transformer model and a Temporal Convolutional Network (TCN) to capture short-term and long-term motion patterns. We show ETTrack achieves a competitive performance compared with state-of-the-art trackers on DanceTrack and SportsMOT.
arXiv Detail & Related papers (2024-05-24T17:51:33Z)
Motion Segmentation for Neuromorphic Aerial Surveillance [42.04157319642197]
Event cameras offer superior temporal resolution, superior dynamic range, and minimal power requirements. Unlike traditional frame-based sensors that capture redundant information at fixed intervals, event cameras asynchronously record pixel-level brightness changes. We introduce a novel motion segmentation method that leverages self-supervised vision transformers on both event data and optical flow information.
arXiv Detail & Related papers (2024-05-24T04:36:13Z)
Ego-Motion Aware Target Prediction Module for Robust Multi-Object Tracking [2.7898966850590625]
We introduce a novel KF-based prediction module called Ego-motion Aware Target Prediction (EMAP) Our proposed method decouples the impact of camera rotational and translational velocity from the object trajectories by reformulating the Kalman Filter. EMAP remarkably drops the number of identity switches (IDSW) of OC-SORT and Deep OC-SORT by 73% and 21%, respectively.
arXiv Detail & Related papers (2024-04-03T23:24:25Z)
Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation [17.71871884366252]
Video object segmentation (VOS) aims to detect the most salient object in a video without external guidance about the object. Recent methods collaboratively use motion cues extracted from optical flow maps with appearance cues extracted from RGB images. We propose a novel motion-as-option network by treating motion cues as optional.
arXiv Detail & Related papers (2023-09-26T09:34:13Z)
Event-Free Moving Object Segmentation from Moving Ego Vehicle [88.33470650615162]
Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving. Most segmentation methods leverage motion cues obtained from optical flow maps. We propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow.
arXiv Detail & Related papers (2023-04-28T23:43:10Z)
DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem. DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden. It is flexible and practical that can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
PL-EVIO: Robust Monocular Event-based Visual Inertial Odometry with Point and Line Features [3.6355269783970394]
Event cameras are motion-activated sensors that capture pixel-level illumination changes instead of the intensity image with a fixed frame rate. We propose a robust, high-accurate, and real-time optimization-based monocular event-based visual-inertial odometry (VIO) method.
arXiv Detail & Related papers (2022-09-25T06:14:12Z)
PreViTS: Contrastive Pretraining with Video Tracking Supervision [53.73237606312024]
PreViTS is an unsupervised SSL framework for selecting clips containing the same object. PreViTS spatially constrains the frame regions to learn from and trains the model to locate meaningful objects. We train a momentum contrastive (MoCo) encoder on VGG-Sound and Kinetics-400 datasets with PreViTS.
arXiv Detail & Related papers (2021-12-01T19:49:57Z)
CNN-based Omnidirectional Object Detection for HermesBot Autonomous Delivery Robot with Preliminary Frame Classification [53.56290185900837]
We propose an algorithm for optimizing a neural network for object detection using preliminary binary frame classification. An autonomous mobile robot with 6 rolling-shutter cameras on the perimeter providing a 360-degree field of view was used as the experimental setup.
arXiv Detail & Related papers (2021-10-22T15:05:37Z)
Moving Object Detection for Event-based vision using Graph Spectral Clustering [6.354824287948164]
Moving object detection has been a central topic of discussion in computer vision for its wide range of applications. We present an unsupervised Graph Spectral Clustering technique for Moving Object Detection in Event-based data. We additionally show how the optimum number of moving objects can be automatically determined.
arXiv Detail & Related papers (2021-09-30T10:19:22Z)
Self-supervised Video Object Segmentation by Motion Grouping [79.13206959575228]
We develop a computer vision system able to segment objects by exploiting motion cues. We introduce a simple variant of the Transformer to segment optical flow frames into primary objects and the background. We evaluate the proposed architecture on public benchmarks (DAVIS2016, SegTrackv2, and FBMS59)
arXiv Detail & Related papers (2021-04-15T17:59:32Z)
DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic temporal-temporal network (DSNet) for more effective fusion of temporal and spatial information. We show that the proposed method achieves superior performance than state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.