Do You See What I See? Coordinating Multiple Aerial Cameras for Robot Cinematography
- URL: http://arxiv.org/abs/2011.05437v2
- Date: Wed, 31 Mar 2021 22:02:07 GMT
- Title: Do You See What I See? Coordinating Multiple Aerial Cameras for Robot Cinematography
- Authors: Arthur Bucker, Rogerio Bonatti and Sebastian Scherer
- Abstract summary: We develop a real-time multi-UAV coordination system that is capable of recording dynamic targets while maximizing shot diversity and avoiding collisions.
We show that our coordination scheme has low computational cost and takes only 1.17 ms on average to plan for a team of 3 UAVs over a 10 s time horizon.
- Score: 9.870369982132678
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Aerial cinematography is significantly expanding the capabilities of
film-makers. Recent progress in autonomous unmanned aerial vehicles (UAVs) has
further increased the potential impact of aerial cameras, with systems that can
safely track actors in unstructured cluttered environments. Professional
productions, however, require the use of multiple cameras simultaneously to
record different viewpoints of the same scene, which are edited into the final
footage either in real time or in post-production. Such extreme motion
coordination is particularly hard for unscripted action scenes, which are a
common use case of aerial cameras. In this work we develop a real-time
multi-UAV coordination system that is capable of recording dynamic targets
while maximizing shot diversity and avoiding collisions and mutual visibility
between cameras. We validate our approach in multiple cluttered environments of
a photo-realistic simulator, and deploy the system using two UAVs in real-world
experiments. We show that our coordination scheme has low computational cost
and takes only 1.17 ms on average to plan for a team of 3 UAVs over a 10 s time
horizon. Supplementary video: https://youtu.be/m2R3anv2ADE
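The abstract describes the coordination scheme only at a high level, so a rough sketch may help make the idea concrete. The Python snippet below illustrates one simple form of sequential multi-camera viewpoint assignment: each UAV in turn picks the candidate viewpoint that is most angularly diverse with respect to the cameras already placed while avoiding inter-UAV collisions. This is a minimal, assumption-based illustration, not the authors' planner: the paper optimizes full trajectories over a 10 s horizon and also penalizes mutual visibility between cameras, whereas this toy example only selects static viewpoints, and all function names, cost terms, and weights here are hypothetical.

```python
import numpy as np

# Minimal sketch of sequential greedy multi-UAV viewpoint assignment.
# Illustrative only: candidate generation, cost terms, and weights are
# assumptions, not the formulation used in the paper.

def candidate_viewpoints(actor_pos, n=16, radius=5.0, height=3.0):
    """Sample candidate camera positions on a circle around the actor."""
    angles = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    offsets = np.stack([radius * np.cos(angles),
                        radius * np.sin(angles),
                        np.full(n, height)], axis=1)
    return actor_pos + offsets

def diversity_cost(candidate, chosen, actor_pos):
    """Penalize viewing directions similar to already-placed cameras
    (a crude proxy for shot diversity)."""
    if not chosen:
        return 0.0
    v = candidate - actor_pos
    costs = []
    for c in chosen:
        w = c - actor_pos
        cos_sim = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w) + 1e-9)
        costs.append(cos_sim)  # similar viewing angles cost more
    return max(costs)

def collision_cost(candidate, chosen, min_dist=2.0):
    """Hard penalty if the candidate comes closer than min_dist to another UAV."""
    for c in chosen:
        if np.linalg.norm(candidate - c) < min_dist:
            return np.inf
    return 0.0

def plan_team(actor_pos, n_uavs=3, w_div=1.0):
    """Greedily assign one viewpoint per UAV, one UAV at a time."""
    chosen = []
    for _ in range(n_uavs):
        best, best_cost = None, np.inf
        for cand in candidate_viewpoints(actor_pos):
            cost = (w_div * diversity_cost(cand, chosen, actor_pos)
                    + collision_cost(cand, chosen))
            if cost < best_cost:
                best, best_cost = cand, cost
        chosen.append(best)
    return chosen

if __name__ == "__main__":
    for i, p in enumerate(plan_team(np.array([0.0, 0.0, 0.0]))):
        print(f"UAV {i}: viewpoint {np.round(p, 2)}")
```

One appeal of sequential (greedy) assignment is that each UAV's choice only needs to be scored against the teammates planned so far, which keeps the per-step cost low; this is consistent in spirit with the millisecond-scale planning times reported in the abstract, although the paper's actual optimization over time horizons is more involved.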
Related papers
- Investigating Event-Based Cameras for Video Frame Interpolation in Sports [59.755469098797406]
We present a first investigation of event-based Video Frame Interpolation (VFI) models for generating sports slow-motion videos.
In particular, we design and implement a bi-camera recording setup, comprising an RGB and an event-based camera, to capture sports videos and to temporally align and spatially register both cameras.
Our experimental validation demonstrates that TimeLens, an off-the-shelf event-based VFI model, can effectively generate slow-motion footage for sports videos.
arXiv Detail & Related papers (2024-07-02T15:39:08Z)
- Image Conductor: Precision Control for Interactive Video Synthesis [90.2353794019393]
Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements.
Image Conductor is a method for precise control of camera transitions and object movements to generate video assets from a single image.
arXiv Detail & Related papers (2024-06-21T17:55:05Z)
- Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z)
- Automatic Camera Trajectory Control with Enhanced Immersion for Virtual Cinematography [23.070207691087827]
Real-world cinematographic rules show that directors can create immersion by comprehensively synchronizing the camera with the actor.
Inspired by this strategy, we propose a deep camera control framework that enables actor-camera synchronization in three aspects.
Our proposed method yields immersive cinematic videos of high quality, both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-03-29T22:02:15Z)
- Learning Active Camera for Multi-Object Navigation [94.89618442412247]
Getting robots to navigate to multiple objects autonomously is essential yet difficult in robot applications.
Existing navigation methods mainly focus on fixed cameras and few attempts have been made to navigate with active cameras.
In this paper, we consider navigating to multiple objects more efficiently with active cameras.
arXiv Detail & Related papers (2022-10-14T04:17:30Z)
- AirPose: Multi-View Fusion Network for Aerial 3D Human Pose and Shape Estimation [51.17610485589701]
We present a novel markerless 3D human motion capture (MoCap) system for unstructured, outdoor environments.
AirPose estimates human pose and shape using images captured by multiple uncalibrated flying cameras.
AirPose itself calibrates the cameras relative to the person instead of relying on any pre-calibration.
arXiv Detail & Related papers (2022-01-20T09:46:20Z)
- 3D Human Reconstruction in the Wild with Collaborative Aerial Cameras [3.3674370488883434]
We present a real-time aerial system for multi-camera control that can reconstruct human motions in natural environments without the use of special-purpose markers.
We develop a multi-robot coordination scheme that maintains the optimal flight formation for target reconstruction quality amongst obstacles.
arXiv Detail & Related papers (2021-08-09T11:03:38Z)
- On the Advantages of Multiple Stereo Vision Camera Designs for Autonomous Drone Navigation [7.299239909796724]
We showcase the performance of a multi-camera UAV, when coupled with state-of-the-art planning and mapping algorithms.
We employ our approaches in an autonomous drone-based inspection task and evaluate them in an autonomous exploration and mapping scenario.
arXiv Detail & Related papers (2021-05-26T17:10:20Z)
- Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos [15.244418294614857]
We design a UAV system with a Panoramic Annular Lens (PAL), which has the characteristics of small size, low weight, and a 360-degree annular FoV.
A lightweight panoramic annular semantic segmentation neural network model is designed to achieve high-accuracy and real-time scene parsing.
A comprehensive variety of experiments shows that the designed system performs satisfactorily in aerial panoramic scene parsing.
arXiv Detail & Related papers (2021-05-15T12:01:16Z)
- Deep Multimodality Learning for UAV Video Aesthetic Quality Assessment [22.277636020333198]
We present a method of deep multimodality learning for UAV video aesthetic quality assessment.
A novel specially designed motion stream network is proposed for this new multistream framework.
We present three application examples: UAV video grading, professional segment detection, and aesthetic-based UAV path planning.
arXiv Detail & Related papers (2020-11-04T15:37:49Z)
- Multi-Drone based Single Object Tracking with Agent Sharing Network [74.8198920355117]
The Multi-Drone Single Object Tracking dataset consists of 92 groups of video clips with 113,918 high-resolution frames taken by two drones and 63 groups of video clips with 145,875 high-resolution frames taken by three drones.
An agent sharing network (ASNet) is proposed, based on self-supervised template sharing and view-aware fusion of the target from multiple drones.
arXiv Detail & Related papers (2020-03-16T03:27:04Z)