Practical Collaborative Perception: A Framework for Asynchronous and
Multi-Agent 3D Object Detection
- URL: http://arxiv.org/abs/2307.01462v3
- Date: Tue, 19 Sep 2023 07:45:52 GMT
- Title: Practical Collaborative Perception: A Framework for Asynchronous and
Multi-Agent 3D Object Detection
- Authors: Minh-Quan Dao, Julie Stephany Berrio, Vincent Frémont, Mao Shan,
Elwan Héry, and Stewart Worrall
- Abstract summary: Occlusion is a major challenge for LiDAR-based object detection methods.
State-of-the-art V2X methods resolve the performance-bandwidth tradeoff using a mid-collaboration approach.
We devise a simple yet effective collaboration method that achieves a better bandwidth-performance tradeoff than prior methods.
- Score: 9.967263440745432
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Occlusion is a major challenge for LiDAR-based object detection methods. This
challenge becomes safety-critical in urban traffic where the ego vehicle must
have reliable object detection to avoid collision while its field of view is
severely reduced due to the obstruction posed by a large number of road users.
Collaborative perception via Vehicle-to-Everything (V2X) communication, which
leverages the diverse perspectives of connected agents at multiple locations to
form a complete scene representation, is an appealing solution.
State-of-the-art V2X methods resolve the performance-bandwidth tradeoff with a
mid-collaboration approach: agents exchange Bird's-Eye View (BEV) images of
their point clouds, so that bandwidth consumption is lower than communicating
raw point clouds as in early collaboration, while detection performance is
higher than in late collaboration, which fuses only the agents' outputs, thanks
to a deeper interaction among connected agents. While achieving strong
performance, the real-world deployment of most mid-collaboration approaches is
hindered by their overly complicated architectures, involving learnable
collaboration graphs and autoencoder-based compressor/decompressor, and
unrealistic assumptions about inter-agent synchronization. In this work, we
devise a simple yet effective collaboration method that achieves a better
bandwidth-performance tradeoff than prior state-of-the-art methods while
minimizing changes made to the single-vehicle detection models and relaxing
unrealistic assumptions on inter-agent synchronization. Experiments on the
V2X-Sim dataset show that our collaboration method achieves 98% of the
performance of an early-collaboration method, while only consuming the
equivalent bandwidth of a late-collaboration method.
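For orientation, the sketch below contrasts what each collaboration level would put on the wire. It is a minimal Python illustration with made-up shapes, not the authors' implementation; in practice the mid-level payload would normally pass through the learned compressor mentioned above before transmission.

```python
import numpy as np

# Hypothetical per-agent payloads; all shapes are illustrative, not taken from the paper.
point_cloud = np.random.rand(60_000, 4).astype(np.float32)       # x, y, z, intensity
bev_features = np.random.rand(64, 256, 256).astype(np.float32)   # C x H x W feature map
detections = np.random.rand(30, 7).astype(np.float32)            # x, y, z, l, w, h, yaw per box

def payload_mib(array: np.ndarray) -> float:
    """Uncompressed payload size in MiB if the array were broadcast as-is."""
    return array.nbytes / 2**20

# Early collaboration: agents exchange raw point clouds (richest input for the detector).
# Mid collaboration:   agents exchange BEV feature maps, usually after a learned compressor
#                      (hence the autoencoder-based compressors mentioned above).
# Late collaboration:  agents exchange only their detected boxes (smallest payload).
for level, payload in [("early (point cloud)", point_cloud),
                       ("mid   (BEV features)", bev_features),
                       ("late  (detections)", detections)]:
    print(f"{level:22s} ~{payload_mib(payload):8.3f} MiB per frame, uncompressed")
```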
Related papers
- Semantic Communication for Cooperative Perception using HARQ [51.148203799109304]
We leverage an importance map to distill critical semantic information, introducing a cooperative perception semantic communication framework.
To counter the challenges posed by time-varying multipath fading, our approach incorporates orthogonal frequency-division multiplexing (OFDM) along with channel estimation and equalization strategies.
We introduce a novel semantic error detection method that is integrated with our semantic communication framework in the spirit of hybrid automatic repeat request (HARQ).
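The entry above combines a semantic error check with HARQ-style retransmission. The following is a minimal sketch of that retransmission pattern only, with a toy AWGN channel and a purely hypothetical mean-squared-error test standing in for the paper's semantic error detector.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_channel(payload: np.ndarray, snr_db: float) -> np.ndarray:
    """Toy AWGN channel standing in for the OFDM link described above."""
    noise_power = 10 ** (-snr_db / 10)
    return payload + rng.normal(0.0, np.sqrt(noise_power), payload.shape)

def semantic_error(sent: np.ndarray, received: np.ndarray, tol: float = 0.1) -> bool:
    """Hypothetical semantic check: flag an error if the reconstruction drifts too far."""
    return float(np.mean((sent - received) ** 2)) > tol

def harq_transmit(payload: np.ndarray, snr_db: float, max_rounds: int = 4):
    """HARQ-style loop: retransmit and average (chase combining) until the check passes."""
    combined, rounds = None, 0
    for rounds in range(1, max_rounds + 1):
        rx = noisy_channel(payload, snr_db)
        combined = rx if combined is None else (combined * (rounds - 1) + rx) / rounds
        if not semantic_error(payload, combined):
            break
    return combined, rounds

features = rng.normal(size=128)            # stand-in for an importance-weighted feature vector
_, used_rounds = harq_transmit(features, snr_db=5.0)
print("transmission rounds used:", used_rounds)
```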
arXiv Detail & Related papers (2024-08-29T08:53:26Z)
- HEAD: A Bandwidth-Efficient Cooperative Perception Approach for Heterogeneous Connected and Autonomous Vehicles [9.10239345027499]
HEAD is a method that fuses features from the classification and regression heads in 3D object detection networks.
Our experiments demonstrate that HEAD effectively balances communication bandwidth and perception performance.
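The summary does not spell out HEAD's fusion rule. As a rough, hypothetical stand-in, the sketch below fuses per-agent classification and regression head outputs on a shared BEV grid with a max / confidence-weighted combination.

```python
import numpy as np

def fuse_heads(cls_maps, reg_maps):
    """Placeholder fusion of per-agent detection-head outputs on a shared BEV grid.

    cls_maps: per-agent objectness score maps, each (H, W).
    reg_maps: per-agent box regression maps, each (H, W, 7) for (x, y, z, l, w, h, yaw).
    """
    cls_stack = np.stack(cls_maps)               # (A, H, W)
    reg_stack = np.stack(reg_maps)               # (A, H, W, 7)
    fused_cls = cls_stack.max(axis=0)            # keep the most confident agent per cell
    weights = cls_stack / (cls_stack.sum(axis=0, keepdims=True) + 1e-6)
    fused_reg = (weights[..., None] * reg_stack).sum(axis=0)  # confidence-weighted boxes
    return fused_cls, fused_reg

# Two agents, tiny 4x4 BEV grid for illustration.
rng = np.random.default_rng(1)
cls = [rng.random((4, 4)) for _ in range(2)]
reg = [rng.random((4, 4, 7)) for _ in range(2)]
fused_cls, fused_reg = fuse_heads(cls, reg)
print(fused_cls.shape, fused_reg.shape)   # (4, 4) (4, 4, 7)
```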
arXiv Detail & Related papers (2024-08-27T22:05:44Z)
- RoCo: Robust Collaborative Perception By Iterative Object Matching and Pose Adjustment [9.817492112784674]
Collaborative autonomous driving with multiple vehicles usually requires data fusion from multiple modalities.
In collaborative perception, the quality of object detection based on a modality is highly sensitive to the relative pose errors among the agents.
We propose RoCo, a novel unsupervised framework to conduct iterative object matching and agent pose adjustment.
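RoCo's concrete optimisation is not described here; the sketch below only shows the generic ICP-style idea of alternating between object matching and 2D relative-pose re-estimation (all data synthetic, helper names hypothetical).

```python
import numpy as np

def estimate_se2(src: np.ndarray, dst: np.ndarray):
    """Least-squares 2D rigid transform (rotation R, translation t) mapping src -> dst."""
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    u, _, vt = np.linalg.svd(src_c.T @ dst_c)
    r = vt.T @ u.T
    if np.linalg.det(r) < 0:                      # enforce a proper rotation
        vt[-1] *= -1
        r = vt.T @ u.T
    t = dst.mean(0) - r @ src.mean(0)
    return r, t

def align_iteratively(neighbor_xy: np.ndarray, ego_xy: np.ndarray, iters: int = 5):
    """ICP-style loop: match each transformed neighbor centre to its nearest ego centre,
    then refit the relative pose from the matched pairs."""
    r, t = np.eye(2), np.zeros(2)
    for _ in range(iters):
        moved = neighbor_xy @ r.T + t
        dists = np.linalg.norm(moved[:, None] - ego_xy[None], axis=2)
        nearest = ego_xy[np.argmin(dists, axis=1)]
        r, t = estimate_se2(neighbor_xy, nearest)
    return r, t

rng = np.random.default_rng(2)
ego = rng.uniform(-20, 20, size=(12, 2))                      # object centres seen by ego
theta = np.deg2rad(3.0)                                       # small pose error to recover
true_r = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
neighbor = ego @ true_r.T + np.array([0.5, -0.3])             # same objects, misaligned frame
r_est, t_est = align_iteratively(neighbor, ego)
print(np.round(r_est, 3), np.round(t_est, 3))
```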
arXiv Detail & Related papers (2024-08-01T03:29:33Z)
- Self-Localized Collaborative Perception [49.86110931859302]
We propose CoBEVGlue, a novel self-localized collaborative perception system.
At the core of CoBEVGlue is a novel spatial alignment module, which provides the relative poses between agents.
CoBEVGlue achieves state-of-the-art detection performance under arbitrary localization noise and attacks.
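The alignment machinery of CoBEVGlue is not detailed in this summary. The sketch below merely shows what a consumer of such relative poses would do with them: warp a collaborator's BEV feature map into the ego frame (nearest-neighbour resampling, hypothetical grid resolution).

```python
import numpy as np

def warp_bev(feat: np.ndarray, yaw: float, tx: float, ty: float, cell: float = 0.5) -> np.ndarray:
    """Resample a neighbor's BEV map (C, H, W) into the ego grid for a relative pose (yaw, tx, ty).

    For every ego cell we compute the matching metric location in the neighbor frame
    (inverse transform) and copy the nearest neighbor cell; out-of-range cells stay zero.
    """
    c, h, w = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    ex = (xs - w / 2) * cell                          # ego-grid cell centres in metres
    ey = (ys - h / 2) * cell
    cos, sin = np.cos(-yaw), np.sin(-yaw)             # inverse rotation
    nx = cos * (ex - tx) - sin * (ey - ty)            # location in the neighbor frame
    ny = sin * (ex - tx) + cos * (ey - ty)
    cols = np.round(nx / cell + w / 2).astype(int)
    rows = np.round(ny / cell + h / 2).astype(int)
    valid = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    out = np.zeros_like(feat)
    out[:, ys[valid], xs[valid]] = feat[:, rows[valid], cols[valid]]
    return out

neighbor_bev = np.random.rand(32, 128, 128).astype(np.float32)
aligned = warp_bev(neighbor_bev, yaw=np.deg2rad(10), tx=4.0, ty=-2.0)
print(aligned.shape)   # (32, 128, 128)
```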
arXiv Detail & Related papers (2024-06-18T15:26:54Z)
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum.
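Both stages operate on the Fourier amplitude and phase of the image; the snippet below is only a reminder of that decomposition in plain numpy, not the MITNet architecture.

```python
import numpy as np

image = np.random.rand(64, 64)                  # stand-in for a hazy grayscale image
spectrum = np.fft.fft2(image)
amplitude, phase = np.abs(spectrum), np.angle(spectrum)

# Stage 1 would predict a cleaned amplitude; stage 2 would refine structure via the phase.
# Recombining the two spectra reconstructs the image exactly (up to float error).
reconstructed = np.fft.ifft2(amplitude * np.exp(1j * phase)).real
print(np.allclose(reconstructed, image))        # True
```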
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
- Attention Based Feature Fusion For Multi-Agent Collaborative Perception [4.120288148198388]
We propose an intermediate collaborative perception solution in the form of a graph attention network (GAT).
The proposed approach develops an attention-based aggregation strategy to fuse intermediate representations exchanged among multiple connected agents.
This approach adaptively highlights important regions in the intermediate feature maps at both the channel and spatial levels, resulting in improved object detection precision.
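The exact GAT layers are not given here; the sketch below shows only the generic attention-weighted aggregation idea on already-aligned per-agent BEV features, with a plain dot-product score standing in for the learned attention.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(agent_feats: np.ndarray) -> np.ndarray:
    """Fuse per-agent BEV features (A, C, H, W); agent 0 is the ego and provides the query.

    Each cell weighs the collaborators by the dot product between the ego feature and
    theirs, then sums the weighted features. A learned GAT would replace this scoring.
    """
    a, c, h, w = agent_feats.shape
    ego = agent_feats[0]                                                  # (C, H, W)
    scores = np.einsum("chw,achw->ahw", ego, agent_feats) / np.sqrt(c)    # (A, H, W)
    weights = softmax(scores, axis=0)                                     # attention over agents
    return np.einsum("ahw,achw->chw", weights, agent_feats)

feats = np.random.rand(3, 64, 128, 128).astype(np.float32)   # 3 agents, already aligned
fused = attention_fuse(feats)
print(fused.shape)   # (64, 128, 128)
```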
arXiv Detail & Related papers (2023-05-03T12:06:11Z)
- Interruption-Aware Cooperative Perception for V2X Communication-Aided Autonomous Driving [49.42873226593071]
We propose V2X communication INterruption-aware COoperative Perception (V2X-INCOP) for V2X communication-aided autonomous driving.
We use historical cooperation information to recover missing information due to the interruptions and alleviate the impact of the interruption issue.
Experiments on three public cooperative perception datasets demonstrate that the proposed method is effective in alleviating the impacts of communication interruption on cooperative perception.
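V2X-INCOP learns how to recover the missing information; the sketch below is a deliberately simple, non-learned stand-in for the same fallback idea, reusing the last successfully received features with a hypothetical confidence decay.

```python
import numpy as np

class InterruptionAwareBuffer:
    """Keeps each collaborator's last received BEV features and falls back to them
    (with a decaying weight) whenever the current transmission is interrupted."""

    def __init__(self, decay: float = 0.8):
        self.decay = decay
        self.cache = {}                              # agent_id -> (features, weight)

    def update(self, agent_id, feats=None):
        if feats is not None:                        # packet arrived: refresh the cache
            self.cache[agent_id] = (feats, 1.0)
            return feats
        if agent_id in self.cache:                   # interrupted: reuse stale features, decayed
            cached, weight = self.cache[agent_id]
            weight *= self.decay
            self.cache[agent_id] = (cached, weight)
            return cached * weight
        return None                                  # nothing ever received from this agent

buffer = InterruptionAwareBuffer()
fresh = np.random.rand(64, 128, 128).astype(np.float32)
buffer.update(agent_id=1, feats=fresh)               # frame t: received
recovered = buffer.update(agent_id=1, feats=None)    # frame t+1: interrupted, stale copy reused
print(recovered is not None)
```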
arXiv Detail & Related papers (2023-04-24T04:59:13Z)
- DOAD: Decoupled One Stage Action Detection Network [77.14883592642782]
Localizing people and recognizing their actions from videos is a challenging task towards high-level video understanding.
Existing methods are mostly two-stage based, with one stage for person bounding box generation and the other stage for action recognition.
We present a decoupled one-stage network, dubbed DOAD, to improve the efficiency of spatio-temporal action detection.
arXiv Detail & Related papers (2023-04-01T08:06:43Z)
- CoPEM: Cooperative Perception Error Models for Autonomous Driving [20.60246432605745]
We focus on the (onboard) perception of Autonomous Vehicles (AVs), whose limitations can manifest as misdetection errors on occluded objects.
We introduce the notion of Cooperative Perception Error Models (coPEMs) towards achieving an effective integration of V2X solutions within a virtual test environment.
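The coPEM formulation itself is not reproduced in this summary. As a toy stand-in, the sketch below injects misdetections whose probability drops as more cooperating viewpoints cover an occluded object (all probabilities invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(3)

def perception_error_model(objects: np.ndarray, viewers: np.ndarray, base_miss: float = 0.6) -> np.ndarray:
    """Return the subset of ground-truth objects that survive a toy misdetection model.

    objects: (N, 3) object positions; viewers: (N,) number of agents with line of sight.
    Each extra viewpoint halves the miss probability of an otherwise occluded object.
    """
    miss_prob = base_miss * 0.5 ** np.clip(viewers - 1, 0, None)
    detected = rng.random(len(objects)) >= miss_prob
    return objects[detected]

gt = rng.uniform(-50, 50, size=(20, 3))              # ground-truth objects in the scene
seen_by = rng.integers(1, 4, size=20)                # 1-3 cooperating agents per object
print("detected", len(perception_error_model(gt, seen_by)), "of", len(gt))
```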
arXiv Detail & Related papers (2022-11-21T04:40:27Z)
- Online V2X Scheduling for Raw-Level Cooperative Perception [21.099819062731463]
Cooperative perception among connected vehicles comes to the rescue when a restricted field of view limits stand-alone intelligence.
We present a model of raw-level cooperative perception and formulate the energy minimization problem of sensor sharing scheduling.
We propose an online learning-based algorithm with logarithmic performance loss, achieving a decent trade-off between exploration and exploitation.
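Only the flavour of the algorithm is given above; the sketch below shows the generic exploration-exploitation machinery it alludes to, a UCB1-style rule choosing which cooperating sensor to request each round, with a made-up noisy utility as the reward.

```python
import numpy as np

rng = np.random.default_rng(4)

def ucb_schedule(true_utility: np.ndarray, rounds: int = 2000) -> np.ndarray:
    """UCB1 over candidate sensors: pick the one with the best mean-utility-plus-bonus,
    which keeps exploration logarithmic while mostly exploiting the best link."""
    n_sensors = len(true_utility)
    counts = np.zeros(n_sensors)
    means = np.zeros(n_sensors)
    for t in range(1, rounds + 1):
        if t <= n_sensors:                            # play each sensor once to initialise
            arm = t - 1
        else:
            bonus = np.sqrt(2 * np.log(t) / counts)
            arm = int(np.argmax(means + bonus))
        reward = rng.normal(true_utility[arm], 0.1)   # noisy perception gain / energy saving
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts

picks = ucb_schedule(np.array([0.3, 0.5, 0.8]))       # hypothetical per-sensor utilities
print(picks)                                          # the highest-utility sensor should dominate
```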
arXiv Detail & Related papers (2022-02-12T15:16:45Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.