Self-Localized Collaborative Perception
- URL: http://arxiv.org/abs/2406.12712v1
- Date: Tue, 18 Jun 2024 15:26:54 GMT
- Title: Self-Localized Collaborative Perception
- Authors: Zhenyang Ni, Zixing Lei, Yifan Lu, Dingju Wang, Chen Feng, Yanfeng Wang, Siheng Chen
- Abstract summary: We propose $\mathtt{CoBEVGlue}$, a novel self-localized collaborative perception system.
The core of $\mathtt{CoBEVGlue}$ is a novel spatial alignment module, which provides the relative poses between agents.
$\mathtt{CoBEVGlue}$ achieves state-of-the-art detection performance under arbitrary localization noises and attacks.
- Score: 49.86110931859302
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collaborative perception has garnered considerable attention due to its capacity to address several inherent challenges in single-agent perception, including occlusion and out-of-range issues. However, existing collaborative perception systems heavily rely on precise localization systems to establish a consistent spatial coordinate system between agents. This reliance makes them susceptible to large pose errors or malicious attacks, resulting in substantial reductions in perception performance. To address this, we propose $\mathtt{CoBEVGlue}$, a novel self-localized collaborative perception system, which achieves more holistic and robust collaboration without using an external localization system. The core of $\mathtt{CoBEVGlue}$ is a novel spatial alignment module, which provides the relative poses between agents by effectively matching co-visible objects across agents. We validate our method on both real-world and simulated datasets. The results show that i) $\mathtt{CoBEVGlue}$ achieves state-of-the-art detection performance under arbitrary localization noises and attacks; and ii) the spatial alignment module can seamlessly integrate with a majority of previous methods, enhancing their performance by an average of $57.7\%$. Code is available at https://github.com/VincentNi0107/CoBEVGlue
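The abstract states that relative poses are obtained by matching co-visible objects across agents. As a rough illustration of the final step only, the sketch below recovers a 2D relative pose from object centers that are assumed to be already matched, using a generic least-squares rigid alignment (the Kabsch/Procrustes solution). This is not the authors' implementation: the function name `estimate_relative_pose_2d`, the bird's-eye-view 2D setting, and the example coordinates are all hypothetical, and the object matching itself is taken as given.

```python
import numpy as np


def estimate_relative_pose_2d(src, dst):
    """Least-squares rigid transform (R, t) such that dst ≈ src @ R.T + t.

    src, dst: (N, 2) arrays of matched object centers, one row per object,
    expressed in the frames of two different agents (hypothetical input).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)

    # Center both point sets so that only the rotation remains to be solved.
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)

    # 2x2 cross-covariance between the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)

    # Guard against a reflection: force det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, d]) @ U.T

    t = dst_c - R @ src_c
    yaw = np.arctan2(R[1, 0], R[0, 0])
    return R, t, yaw


if __name__ == "__main__":
    # Made-up object centers seen by agent A, then re-expressed in agent B's frame.
    theta = np.deg2rad(30.0)
    R_true = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta), np.cos(theta)]])
    t_true = np.array([4.0, -2.0])
    objs_a = np.array([[0.0, 0.0], [10.0, 0.0], [3.0, 7.0], [-5.0, 2.0]])
    objs_b = objs_a @ R_true.T + t_true

    R, t, yaw = estimate_relative_pose_2d(objs_a, objs_b)
    print("yaw (deg):", np.rad2deg(yaw))
    print("translation:", t)
```

At least two non-coincident matched objects are needed for the 2D rotation to be determined. With noisy detections or wrong matches, a robust wrapper (for example an outlier-rejection loop) would be needed on top of this closed-form step.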
Related papers
- Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling [3.396870608435494]
We study a cooperative Markov game with a global agent and $n$ homogeneous local agents in a communication-constrained regime. We prove that these approximate best-response dynamics converge to an $\widetilde{O}(1/\sqrt{k})$-approximate Nash Equilibrium.
arXiv Detail & Related papers (2026-03-04T06:14:24Z) - Robust Scene Coordinate Regression via Geometrically-Consistent Global Descriptors [52.57327385675752]
We propose an aggregator module that learns global descriptors consistent with both geometrical structure and visual similarity. This corrects erroneous associations caused by unreliable overlap scores. Experiments on challenging benchmarks show substantial localization gains in large-scale environments.
arXiv Detail & Related papers (2025-12-19T04:24:03Z) - UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction [83.48950950780554]
Building extraction from remote sensing images is a challenging task due to the complex structure variations of buildings. Existing methods employ convolutional or self-attention blocks to capture the multi-scale features in the segmentation models. We present an Uncertainty-Aggregated Global-Local Fusion Network (UAGLNet) to exploit high-quality global-local visual semantics.
arXiv Detail & Related papers (2025-12-15T02:59:16Z) - RoCo: Robust Collaborative Perception By Iterative Object Matching and Pose Adjustment [9.817492112784674]
Collaborative autonomous driving with multiple vehicles usually requires the data fusion from multiple modalities.
In collaborative perception, the quality of object detection based on a modality is highly sensitive to the relative pose errors among the agents.
We propose RoCo, a novel unsupervised framework to conduct iterative object matching and agent pose adjustment.
arXiv Detail & Related papers (2024-08-01T03:29:33Z) - Robust Collaborative Perception without External Localization and Clock Devices [52.32342059286222]
A consistent spatial-temporal coordination across multiple agents is fundamental for collaborative perception.
Traditional methods depend on external devices to provide localization and clock signals.
We propose a novel approach: aligning by recognizing the inherent geometric patterns within the perceptual data of various agents.
arXiv Detail & Related papers (2024-05-05T15:20:36Z) - ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images, namely zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z) - Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs [49.71319907864573]
In this paper, we propose multi-agent skill discovery which enables the ease of decomposition.
Our key idea is to approximate the joint state space as a Kronecker graph, based on which we can directly estimate its Fiedler vector.
Considering that directly computing the Laplacian spectrum is intractable for tasks with infinite-scale state spaces, we further propose a deep learning extension of our method.
arXiv Detail & Related papers (2023-07-21T14:53:12Z) - Supervision Interpolation via LossMix: Generalizing Mixup for Object Detection and Beyond [10.25372189905226]
LossMix is a simple yet versatile and effective regularization that enhances the performance and robustness of object detectors.
Empirical results on the PASCAL VOC and MS COCO datasets demonstrate that LossMix can consistently outperform state-of-the-art methods for detection.
arXiv Detail & Related papers (2023-03-18T06:13:30Z) - Robust Collaborative 3D Object Detection in Presence of Pose Errors [31.039703988342243]
Collaborative 3D object detection exploits information exchange among multiple agents to enhance accuracy.
In practice, pose estimation errors due to imperfect localization would cause spatial message misalignment.
We propose CoAlign, a novel hybrid collaboration framework that is robust to unknown pose errors.
arXiv Detail & Related papers (2022-11-14T09:11:14Z) - Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z) - Ego-Motion Alignment from Face Detections for Collaborative Augmented Reality [5.33024001730262]
We show that detecting each other's face or glasses together with tracker ego-poses sufficiently conditions the problem to spatially relate local coordinate systems.
The detected glasses can serve as reliable anchors to bring sufficient accuracy for the targeted practical use.
arXiv Detail & Related papers (2020-10-05T16:57:48Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)