People Tracking in Panoramic Video for Guiding Robots
- URL: http://arxiv.org/abs/2206.02735v1
- Date: Mon, 6 Jun 2022 16:44:38 GMT
- Title: People Tracking in Panoramic Video for Guiding Robots
- Authors: Alberto Bacchin, Filippo Berno, Emanuele Menegatti, and Alberto Pretto
- Abstract summary: A guiding robot aims to effectively bring people to and from specific places within environments that are possibly unknown to them.
During this operation the robot should be able to detect and track the accompanied person, trying never to lose sight of her/him.
A solution to minimize this event is to use an omnidirectional camera: its 360° Field of View (FoV) guarantees that any framed object cannot leave the FoV unless occluded or very far from the sensor.
We propose a set of targeted methods that make it possible to effectively adapt to panoramic videos a standard people detection and tracking pipeline originally designed for perspective cameras.
- Score: 2.092922495279074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A guiding robot aims to effectively bring people to and from specific places
within environments that are possibly unknown to them. During this operation
the robot should be able to detect and track the accompanied person, trying
never to lose sight of her/him. A solution to minimize this event is to use an
omnidirectional camera: its 360° Field of View (FoV) guarantees that any
framed object cannot leave the FoV unless occluded or very far from the sensor.
However, the acquired panoramic videos introduce new challenges in perception
tasks such as people detection and tracking, including the large size of the
images to be processed, the distortion effects introduced by the cylindrical
projection and the periodic nature of panoramic images. In this paper, we
propose a set of targeted methods that make it possible to effectively adapt to panoramic
videos a standard people detection and tracking pipeline originally designed
for perspective cameras. Our methods have been implemented and tested inside a
deep learning-based people detection and tracking framework with a commercial
360° camera. Experiments performed on datasets specifically acquired for
guiding robot applications and on a real service robot show the effectiveness
of the proposed approach over other state-of-the-art systems. We release with
this paper the acquired and annotated datasets and the open-source
implementation of our method.
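Of the challenges above, the periodic nature of panoramic images is the one that most directly breaks a standard tracker: a person crossing the image seam appears to jump from one border to the other. The sketch below shows one simple way to make box matching wrap-aware; it is an illustration under assumed conventions (boxes as pixel-space (x, y, w, h) tuples, panorama width W), not the implementation released with the paper.

```python
# A minimal sketch, assuming boxes are (x, y, w, h) tuples in pixels and W is
# the panorama width; illustration only, not the authors' released code.

def wrap_aware_iou(box_a, box_b, W):
    """IoU that treats the horizontal axis as periodic.

    box_b is evaluated at its nominal position and shifted by +/-W, and the
    best overlap is kept, so a track is not broken when the target crosses
    the image seam and re-enters from the opposite border.
    """
    best = 0.0
    xa, ya, wa, ha = box_a
    for shift in (-W, 0, W):
        xb, yb, wb, hb = box_b
        xb += shift
        ix = max(0.0, min(xa + wa, xb + wb) - max(xa, xb))
        iy = max(0.0, min(ya + ha, yb + hb) - max(ya, yb))
        inter = ix * iy
        union = wa * ha + wb * hb - inter
        if union > 0.0:
            best = max(best, inter / union)
    return best


# Example: a box near the right border of a 2000-pixel-wide panorama matched
# against the same box expressed with a wrapped (negative) x coordinate.
print(wrap_aware_iou((1990, 100, 40, 120), (-10, 100, 40, 120), 2000))  # 1.0
```

Evaluating the second box at shifts of ±W is sufficient because a target can straddle the seam by at most one period between consecutive frames.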
Related papers
- ChatCam: Empowering Camera Control through Conversational AI [67.31920821192323]
ChatCam is a system that navigates camera movements through conversations with users.
To achieve this, we propose CineGPT, a GPT-based autoregressive model for text-conditioned camera trajectory generation.
We also develop an Anchor Determinator to ensure precise camera trajectory placement.
arXiv Detail & Related papers (2024-09-25T20:13:41Z) - Analysis of Unstructured High-Density Crowded Scenes for Crowd Monitoring [55.2480439325792]
We are interested in developing an automated system for detection of organized movements in human crowds.
Computer vision algorithms can extract information from videos of crowded scenes.
We can estimate the number of participants in an organized cohort.
arXiv Detail & Related papers (2024-08-06T22:09:50Z) - Vision-based Manipulation from Single Human Video with Open-World Object Graphs [58.23098483464538]
We present an object-centric approach to empower robots to learn vision-based manipulation skills from human videos.
We introduce ORION, an algorithm that tackles the problem by extracting an object-centric manipulation plan from a single RGB-D video.
arXiv Detail & Related papers (2024-05-30T17:56:54Z) - PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot [3.387892563308912]
We introduce a novel approach to process a sequence of dynamic successive frames in a line-of-sight (LOS) video using an attention-based neural network.
We validate the approach on in-the-wild scenes using a drone for video capture, thus demonstrating low-cost NLOS imaging in dynamic capture environments.
arXiv Detail & Related papers (2024-04-07T17:31:53Z) - Learning Video-Conditioned Policies for Unseen Manipulation Tasks [83.2240629060453]
Video-conditioned policy learning maps human demonstrations of previously unseen tasks to robot manipulation skills.
We train our policy to generate appropriate actions given current scene observations and a video of the target task.
We validate our approach on a set of challenging multi-task robot manipulation environments and outperform the state of the art.
arXiv Detail & Related papers (2023-05-10T16:25:42Z) - Estimation of Appearance and Occupancy Information in Birds Eye View from Surround Monocular Images [2.69840007334476]
Bird's-eye View (BEV) expresses the location of different traffic participants in the ego vehicle frame from a top-down view.
We propose a novel representation that captures various traffic participants' appearance and occupancy information from an array of monocular cameras covering a 360° field of view (FoV).
We use a learned image embedding of all camera images to generate a BEV of the scene at any instant that captures both appearance and occupancy of the scene.
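As a rough sketch of the occupancy side of such a representation (the learned image embedding is beyond a few lines of code, and the grid parameters and function names below are assumptions, not the paper's API), ego-frame ground positions of detected participants can be rasterized into a top-down grid:

```python
import numpy as np

def rasterize_bev(points_m, grid_size=200, res_m=0.25):
    """Build a binary BEV occupancy grid from ego-frame (x, y) positions.

    points_m: iterable of (x, y) ground positions in meters, ego at origin.
    grid_size: grid side length in cells; res_m: meters per cell.
    """
    bev = np.zeros((grid_size, grid_size), dtype=np.uint8)
    cx = cy = grid_size // 2  # ego vehicle sits at the grid center
    for x_m, y_m in points_m:
        col = int(round(cx + x_m / res_m))
        row = int(round(cy - y_m / res_m))  # rows grow downward in the image
        if 0 <= row < grid_size and 0 <= col < grid_size:
            bev[row, col] = 1
    return bev

# Example: two participants, 5 m ahead and 3 m to the left of the ego vehicle.
grid = rasterize_bev([(0.0, 5.0), (-3.0, 0.0)])
print(grid.sum())  # 2
```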
arXiv Detail & Related papers (2022-11-08T20:57:56Z) - Incremental 3D Scene Completion for Safe and Efficient Exploration Mapping and Planning [60.599223456298915]
We propose a novel way to integrate deep learning into exploration by leveraging 3D scene completion for informed, safe, and interpretable mapping and planning.
We show that our method can speed up coverage of an environment by 73% compared to the baselines with only minimal reduction in map accuracy.
Even if scene completions are not included in the final map, we show that they can be used to guide the robot to choose more informative paths, speeding up the measurement of the scene with the robot's sensors by 35%.
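A minimal sketch of this guidance idea, under assumed data structures (maps as plain dicts and sets over grid cells; nothing below is the paper's API): a candidate path is scored by how many cells the robot has not yet measured but the completion network predicts to be free, i.e. likely observable.

```python
FREE = 0  # assumed label for predicted-free cells

def path_gain(path_cells, measured, completion):
    """Proxy information gain of a candidate path.

    measured: set of cells already observed by the robot's sensors.
    completion: dict mapping cells to labels predicted by scene completion.
    """
    return sum(1 for c in path_cells
               if c not in measured and completion.get(c) == FREE)

def choose_path(candidates, measured, completion):
    """Pick the candidate path with the highest proxy gain."""
    return max(candidates, key=lambda p: path_gain(p, measured, completion))

# Example with two candidate paths over 2-D grid cells:
paths = [[(0, 1), (0, 2)], [(1, 0), (2, 0), (3, 0)]]
completion = {c: FREE for p in paths for c in p}
print(choose_path(paths, measured={(0, 1)}, completion=completion))
```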
arXiv Detail & Related papers (2022-08-17T14:19:33Z) - Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking Results of Two Cameras [0.860255319568951]
OpenPose is used for human detection, and a new stereo vision framework is proposed to fuse the tracking results of the two cameras.
The effectiveness of the proposed framework is verified through target-tracking experiments.
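This summary does not spell out the fusion rule, so the sketch below shows a common baseline rather than the paper's method: inverse-variance weighting of two independent 3D position estimates, so the camera with lower uncertainty dominates.

```python
import numpy as np

def fuse_estimates(p1, var1, p2, var2):
    """Fuse two 3-D position estimates by inverse-variance weighting.

    p1, p2: np.ndarray of shape (3,); var1, var2: scalar variances.
    Returns the fused position and its variance.
    """
    w1, w2 = 1.0 / var1, 1.0 / var2
    fused = (w1 * p1 + w2 * p2) / (w1 + w2)
    return fused, 1.0 / (w1 + w2)

# Example: camera 2 is twice as certain, so the fused point lies closer to p2.
p, v = fuse_estimates(np.array([1.0, 0.0, 2.0]), 0.2,
                      np.array([1.2, 0.1, 2.1]), 0.1)
print(p, v)
```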
arXiv Detail & Related papers (2020-07-03T06:46:49Z) - One-Shot Informed Robotic Visual Search in the Wild [29.604267552742026]
We consider the task of underwater robot navigation for the purpose of collecting scientifically relevant video data for environmental monitoring.
The majority of field robots that currently perform monitoring tasks in unstructured natural environments navigate via path-tracking a pre-specified sequence of waypoints.
We propose a method that enables informed visual navigation via a learned visual similarity operator that guides the robot's visual search towards parts of the scene that look like exemplar images.
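A hedged sketch of the search loop (the feature vectors below are random stand-ins; the paper's learned similarity operator is not reproduced here): candidate scene regions are scored by cosine similarity to an exemplar image's features, and the robot steers toward the best-scoring one.

```python
import numpy as np

def cosine_sim(a, b, eps=1e-8):
    """Cosine similarity between two 1-D feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def most_similar_region(exemplar_feat, region_feats):
    """Index of the candidate region most similar to the exemplar."""
    sims = [cosine_sim(exemplar_feat, f) for f in region_feats]
    return int(np.argmax(sims))

# Example with random stand-in features for one exemplar and four regions:
rng = np.random.default_rng(0)
regions = [rng.standard_normal(128) for _ in range(4)]
exemplar = regions[2] + 0.1 * rng.standard_normal(128)  # region 2 looks alike
print(most_similar_region(exemplar, regions))  # 2
```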
arXiv Detail & Related papers (2020-03-22T22:14:42Z) - GhostImage: Remote Perception Attacks against Camera-based Image Classification Systems [6.637193297008101]
In vision-based object classification systems, imaging sensors perceive the environment, and machine learning is then used to detect and classify objects for decision-making purposes.
We demonstrate how the perception domain can be remotely and unobtrusively exploited to enable an attacker to create spurious objects or alter an existing object.
arXiv Detail & Related papers (2020-01-21T21:58:45Z) - Morphology-Agnostic Visual Robotic Control [76.44045983428701]
MAVRIC is an approach that works with minimal prior knowledge of the robot's morphology.
We demonstrate our method on visually-guided 3D point reaching, trajectory following, and robot-to-robot imitation.
arXiv Detail & Related papers (2019-12-31T15:45:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.