PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot
- URL: http://arxiv.org/abs/2404.05024v1
- Date: Sun, 7 Apr 2024 17:31:53 GMT
- Title: PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot
- Authors: Shenbagaraj Kannapiran, Sreenithy Chandran, Suren Jayasuriya, Spring Berman
- Abstract summary: We introduce a novel approach to process a sequence of successive frames in a dynamic line-of-sight (LOS) video using an attention-based neural network.
We validate the approach on in-the-wild scenes using a drone for video capture, thus demonstrating low-cost NLOS imaging in dynamic capture environments.
- Score: 3.387892563308912
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The study of non-line-of-sight (NLOS) imaging is growing due to its many potential applications, including rescue operations and pedestrian detection by self-driving cars. However, implementing NLOS imaging on a moving camera remains an open area of research. Existing NLOS imaging methods rely on time-resolved detectors and laser configurations that require precise optical alignment, making them difficult to deploy in dynamic environments. This work proposes a data-driven approach to NLOS imaging, PathFinder, that can be used with a standard RGB camera mounted on a small, power-constrained mobile robot, such as an aerial drone. Our experimental pipeline is designed to accurately estimate the 2D trajectory of a person who moves in a Manhattan-world environment while remaining hidden from the camera's field of view. We introduce a novel approach to process a sequence of successive frames in a dynamic line-of-sight (LOS) video using an attention-based neural network that performs inference in real time. The method also includes a preprocessing selection metric that analyzes images from a moving camera containing multiple vertical planar surfaces, such as walls and building facades, and extracts the planes that return the most NLOS information. We validate the approach on in-the-wild scenes using a drone for video capture, thus demonstrating low-cost NLOS imaging in dynamic capture environments.
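The listing gives no code, but the shape of such a pipeline can be sketched. The model below is a minimal, hypothetical PyTorch sketch, not PathFinder's released architecture: the class name, CNN backbone, embedding size, and window length are all illustrative assumptions. It shows how self-attention over per-frame embeddings can regress a 2D position from a short window of LOS frames.

```python
# Minimal sketch: attention over a window of LOS frames -> 2D position.
# All architecture choices (backbone, dims, window length) are illustrative
# assumptions, not PathFinder's actual implementation.
import torch
import torch.nn as nn

class FrameAttentionTracker(nn.Module):
    def __init__(self, embed_dim=128, num_heads=4, window=16):
        super().__init__()
        # Small CNN turns each RGB frame into one embedding vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Learnable temporal position encoding for the frame window.
        self.pos = nn.Parameter(torch.zeros(window, embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, num_heads, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(embed_dim, 2)  # (x, y) on the hidden side

    def forward(self, frames):  # frames: (B, T, 3, H, W)
        B, T = frames.shape[:2]
        tokens = self.encoder(frames.flatten(0, 1)).view(B, T, -1)
        tokens = tokens + self.pos[:T]
        tokens = self.temporal(tokens)   # self-attention across time
        return self.head(tokens[:, -1])  # position at the latest frame

model = FrameAttentionTracker()
xy = model(torch.randn(2, 16, 3, 64, 64))  # -> shape (2, 2)
```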
Related papers
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns memory-efficient dense 3D geometry and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Propagate And Calibrate: Real-time Passive Non-line-of-sight Tracking [84.38335117043907]
We propose a purely passive method to track a person walking in an invisible room by only observing a relay wall.
To extract imperceptible changes in videos of the relay wall, we introduce difference frames as an essential carrier of temporally local motion cues (sketched below).
To evaluate the proposed method, we build and publish the first dynamic passive NLOS tracking dataset, NLOS-Track.
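As a rough illustration of the difference-frame idea, a hypothetical NumPy helper (not the authors' released preprocessing) might look like this:

```python
import numpy as np

def difference_frames(video: np.ndarray) -> np.ndarray:
    """Subtract consecutive relay-wall frames so that only the faint,
    temporally local intensity changes remain.
    video: (T, H, W, 3) float array of frames in [0, 1]."""
    diff = np.diff(video.astype(np.float32), axis=0)  # frame t+1 minus frame t
    return diff / (np.abs(diff).max() + 1e-8)         # rescale the subtle signal
```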
arXiv Detail & Related papers (2023-03-21T12:18:57Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- People Tracking in Panoramic Video for Guiding Robots [2.092922495279074]
A guiding robot aims to effectively bring people to and from specific places within environments that are possibly unknown to them.
During this operation the robot should be able to detect and track the accompanied person, trying never to lose sight of her/him.
A solution to minimize this event is to use an omnidirectional camera: its 360° field of view (FoV) guarantees that a framed object cannot leave the FoV unless it is occluded or very far from the sensor.
We propose a set of targeted methods that effectively adapt a standard people detection and tracking pipeline, originally designed for perspective cameras, to panoramic videos.
arXiv Detail & Related papers (2022-06-06T16:44:38Z) - Cross-Camera Trajectories Help Person Retrieval in a Camera Network [124.65912458467643]
Existing methods often rely on purely visual matching or consider temporal constraints but ignore the spatial information of the camera network.
We propose a pedestrian retrieval framework based on cross-camera trajectory generation, which integrates both temporal and spatial information.
To verify the effectiveness of our method, we construct the first cross-camera pedestrian trajectory dataset.
arXiv Detail & Related papers (2022-04-27T13:10:48Z) - EagerMOT: 3D Multi-Object Tracking via Sensor Fusion [68.8204255655161]
Multi-object tracking (MOT) enables mobile robots to perform well-informed motion planning and navigation by localizing surrounding objects in 3D space and time.
Existing methods rely on depth sensors (e.g., LiDAR) to detect and track targets in 3D space, but only up to a limited sensing range due to the sparsity of the signal.
We propose EagerMOT, a simple tracking formulation that integrates all available object observations from both sensor modalities to obtain a well-informed interpretation of the scene dynamics.
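A toy version of the fusion step might look like the following greedy 2D-IoU matcher (hypothetical helpers; EagerMOT's actual two-stage association and motion model are more involved): 3D detections are projected into the image and paired with overlapping 2D boxes, while unmatched 2D boxes can still seed tracks for objects beyond LiDAR range.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image coordinates

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def fuse_detections(projected_3d: List[Box], boxes_2d: List[Box],
                    thresh: float = 0.3) -> List[Tuple[int, Optional[int]]]:
    """Greedily pair each projected 3D detection with its best 2D box.
    Returns (index_3d, index_2d or None); 2D boxes left unmatched can
    seed camera-only tracks for objects beyond LiDAR's sensing range."""
    pairs, used = [], set()
    for i, p in enumerate(projected_3d):
        best, best_iou = None, thresh
        for j, b in enumerate(boxes_2d):
            score = iou(p, b)
            if j not in used and score > best_iou:
                best, best_iou = j, score
        if best is not None:
            used.add(best)
        pairs.append((i, best))
    return pairs
```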
arXiv Detail & Related papers (2021-04-29T22:30:29Z) - D-NeRF: Neural Radiance Fields for Dynamic Scenes [72.75686949608624]
We introduce D-NeRF, a method that extends neural radiance fields to a dynamic domain (the time-conditioning idea is sketched below).
D-NeRF reconstructs images of objects under rigid and non-rigid motions from a camera moving around the scene.
We demonstrate the effectiveness of our approach on scenes with objects under rigid, articulated and non-rigid motions.
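The time-conditioning idea can be illustrated with a small deformation network (layer sizes are illustrative assumptions, not the paper's): points sampled at time t are warped back to a canonical frame before the static radiance field is queried.

```python
import torch
import torch.nn as nn

class DeformField(nn.Module):
    """Maps a 3D point x at time t to canonical-space coordinates."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (N, 3) sample points, t: (N, 1) times; predict a displacement.
        dx = self.net(torch.cat([x, t], dim=-1))
        return x + dx  # canonical coordinates fed to the static NeRF

canonical = DeformField()(torch.rand(8, 3), torch.rand(8, 1))  # -> (8, 3)
```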
arXiv Detail & Related papers (2020-11-27T19:06:50Z) - Batteries, camera, action! Learning a semantic control space for
expressive robot cinematography [15.895161373307378]
We develop a data-driven framework that enables editing of complex camera positioning parameters in a semantic space.
First, we generate a database of video clips with a diverse range of shots in a photo-realistic simulator.
We use hundreds of participants in a crowd-sourcing framework to obtain scores for a set of semantic descriptors for each clip.
arXiv Detail & Related papers (2020-11-19T21:56:53Z) - Camera-Based Adaptive Trajectory Guidance via Neural Networks [6.2843107854856965]
We introduce a novel method to capture visual trajectories for navigating an indoor robot in dynamic settings using streaming image data.
The captured trajectories are used to design, train, and compare two neural network architectures that predict acceleration and steering commands for a line-following robot over a continuous space in real time.
arXiv Detail & Related papers (2020-01-09T20:05:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.