Overcoming Obstructions via Bandwidth-Limited Multi-Agent Spatial Handshaking
- URL: http://arxiv.org/abs/2107.00771v1
- Date: Thu, 1 Jul 2021 22:56:47 GMT
- Title: Overcoming Obstructions via Bandwidth-Limited Multi-Agent Spatial Handshaking
- Authors: Nathaniel Glaser, Yen-Cheng Liu, Junjiao Tian, Zsolt Kira
- Abstract summary: We propose an end-to-end learnable Multi-Agent Spatial Handshaking network (MASH) to process, compress, and propagate visual information across a robotic swarm.
Our method achieves an absolute 11% IoU improvement over strong baselines.
- Score: 37.866254392010454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we address bandwidth-limited and obstruction-prone
collaborative perception, specifically in the context of multi-agent semantic
segmentation. This setting presents several key challenges, including
processing and exchanging unregistered robotic swarm imagery. To be successful,
solutions must effectively leverage multiple non-static and
intermittently-overlapping RGB perspectives, while heeding bandwidth
constraints and overcoming unwanted foreground obstructions. As such, we
propose an end-to-end learnable Multi-Agent Spatial Handshaking network (MASH)
to process, compress, and propagate visual information across a robotic swarm.
Our distributed communication module operates directly (and exclusively) on raw
image data, without additional input requirements such as pose, depth, or
warping data. We demonstrate the superior performance of our model against
several baselines in a photo-realistic multi-robot AirSim environment,
especially in the presence of image occlusions. Our method achieves an absolute
11% IoU improvement over strong baselines.
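The reported gain is measured in intersection-over-union (IoU), the standard semantic-segmentation metric. As background (not the paper's code), a minimal sketch of per-class IoU over flat label maps:

```python
def per_class_iou(pred, target, num_classes):
    """Per-class intersection-over-union for segmentation label maps.

    pred, target: flat lists of integer class labels, same length.
    Returns one IoU value per class (NaN if the class appears in
    neither prediction nor ground truth).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        ious.append(inter / union if union else float("nan"))
    return ious

# Toy example: 4 pixels, classes {0, 1}.
pred   = [0, 1, 1, 1]
target = [0, 1, 0, 1]
print(per_class_iou(pred, target, num_classes=2))  # [0.5, 0.6666666666666666]
```

An "absolute 11% improvement" means the mean of these per-class values rises by 0.11, e.g. from 0.50 to 0.61.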
Related papers
- Distributed NeRF Learning for Collaborative Multi-Robot Perception [16.353043979615496]
Multi-agent systems can offer a more comprehensive mapping of the environment, quicker coverage, and increased fault tolerance.
We propose a collaborative multi-agent perception system where agents collectively learn a neural radiance field (NeRF) from posed RGB images to represent a scene.
We show the effectiveness of our method through an extensive set of experiments on datasets containing challenging real-world scenes.
arXiv Detail & Related papers (2024-09-30T13:45:50Z)
- Deep Generative Adversarial Network for Occlusion Removal from a Single Image [3.5639148953570845]
We propose a fully automatic, two-stage convolutional neural network for fence segmentation and occlusion completion.
We leverage generative adversarial networks (GANs) to synthesize realistic content, including both structure and texture, in a single shot for inpainting.
arXiv Detail & Related papers (2024-09-20T06:00:45Z)
- Efficient Multi-scale Network with Learnable Discrete Wavelet Transform for Blind Motion Deblurring [25.36888929483233]
We propose a multi-scale network based on a single-input and multiple-output (SIMO) design for motion deblurring.
We combine the characteristics of real-world blur trajectories with a learnable wavelet transform module to capture the directional continuity and frequency features of the step-by-step transition from blurred to sharp images.
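For context (this is illustrative background, not the paper's learnable module), a single-level 1D Haar wavelet transform shows the basic sub-band split such modules build on: a low-frequency (approximation) band and a high-frequency (detail) band, with perfect reconstruction:

```python
import math

def haar_dwt_1d(signal):
    """Single-level Haar wavelet transform of an even-length signal.

    Returns (low, high): the low-frequency approximation sub-band
    (pairwise scaled sums) and the high-frequency detail sub-band
    (pairwise scaled differences).
    """
    assert len(signal) % 2 == 0, "signal length must be even"
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high

def haar_idwt_1d(low, high):
    """Inverse transform: recovers the original signal from the sub-bands."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

x = [4.0, 2.0, 5.0, 5.0]
low, high = haar_dwt_1d(x)
# low carries the smooth content; high carries edges and sharp transitions
rec = haar_idwt_1d(low, high)
```

A learnable variant replaces the fixed Haar filter pair with trainable filter coefficients while keeping this analysis/synthesis structure.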
arXiv Detail & Related papers (2023-12-29T02:59:40Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 (detecting and grounding multi-modal manipulation) problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- Enhancing Multi-Robot Perception via Learned Data Association [37.866254392010454]
We address the multi-robot collaborative perception problem, specifically in the context of multi-view infilling for distributed semantic segmentation.
We propose the Multi-Agent Infilling Network: a neural architecture that can be deployed to each agent in a robotic swarm.
Specifically, each robot is in charge of locally encoding and decoding visual information, and a neural mechanism allows for an uncertainty-aware and context-based exchange of intermediate features.
arXiv Detail & Related papers (2021-07-01T22:45:26Z)
- Deep Burst Super-Resolution [165.90445859851448]
We propose a novel architecture for the burst super-resolution task.
Our network takes multiple noisy RAW images as input, and generates a denoised, super-resolved RGB image as output.
In order to enable training and evaluation on real-world data, we additionally introduce the BurstSR dataset.
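As simplified background (not the paper's learned network, which also handles alignment and upsampling), the core benefit of a burst is that merging several noisy captures of the same scene suppresses noise. A naive per-pixel mean merge over pre-aligned frames:

```python
def merge_burst(frames):
    """Naive burst merge: per-pixel mean over pre-aligned frames.

    frames: list of equal-length flat pixel lists (one per capture).
    Averaging n independent noisy observations reduces the noise
    variance by a factor of n; learned methods improve on this by
    aligning frames and weighting them adaptively.
    """
    n = len(frames)
    return [sum(px) / n for px in zip(*frames)]

# Three noisy 2-pixel "frames" of the same scene.
frames = [[1.0, 3.0], [3.0, 5.0], [2.0, 4.0]]
print(merge_burst(frames))  # [2.0, 4.0]
```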
arXiv Detail & Related papers (2021-01-26T18:57:21Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for the image restoration task.
We present a novel architecture with the goal of maintaining spatially precise, high-resolution representations throughout the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales while simultaneously preserving high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.