MACP: Efficient Model Adaptation for Cooperative Perception
- URL: http://arxiv.org/abs/2310.16870v2
- Date: Tue, 7 Nov 2023 05:42:48 GMT
- Title: MACP: Efficient Model Adaptation for Cooperative Perception
- Authors: Yunsheng Ma, Juanwu Lu, Can Cui, Sicheng Zhao, Xu Cao, Wenqian Ye, and Ziran Wang
- Abstract summary: We propose a new framework termed MACP, which equips a single-agent pre-trained model with cooperation capabilities.
We demonstrate in experiments that the proposed framework can effectively utilize cooperative observations and outperform other state-of-the-art approaches.
- Score: 23.308578463976804
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Vehicle-to-vehicle (V2V) communications have greatly enhanced the perception
capabilities of connected and automated vehicles (CAVs) by enabling information
sharing to "see through the occlusions", resulting in significant performance
improvements. However, developing and training complex multi-agent perception
models from scratch can be expensive and unnecessary when existing single-agent
models show remarkable generalization capabilities. In this paper, we propose a
new framework termed MACP, which equips a single-agent pre-trained model with
cooperation capabilities. We approach this objective by identifying the key
challenges of shifting from single-agent to cooperative settings, adapting the
model by freezing most of its parameters and adding a few lightweight modules.
We demonstrate in our experiments that the proposed framework can effectively
utilize cooperative observations and outperform other state-of-the-art
approaches in both simulated and real-world cooperative perception benchmarks
while requiring substantially fewer tunable parameters with reduced
communication costs. Our source code is available at
https://github.com/PurdueDigitalTwin/MACP.
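The adaptation recipe the abstract describes — freeze most of the pretrained single-agent backbone and add a few lightweight tunable modules — can be illustrated with a minimal sketch. This is a purely hypothetical stand-in, not the MACP codebase; the layer names, adapter size, and parameter counts below are invented for illustration.

```python
# Hypothetical sketch (not the MACP implementation): freeze a pretrained
# single-agent backbone, then insert one small trainable adapter per block.

class Layer:
    def __init__(self, name, n_params, trainable):
        self.name = name
        self.n_params = n_params
        self.trainable = trainable

def adapt_for_cooperation(backbone_layers, adapter_params=4096):
    """Freeze every pretrained layer, then append one lightweight
    adapter per layer; only the adapters remain trainable."""
    adapted = []
    for layer in backbone_layers:
        layer.trainable = False  # freeze the pretrained weights
        adapted.append(layer)
        adapted.append(Layer(layer.name + "_adapter", adapter_params, True))
    return adapted

def tunable_fraction(layers):
    total = sum(l.n_params for l in layers)
    tunable = sum(l.n_params for l in layers if l.trainable)
    return tunable / total

backbone = [Layer(f"block{i}", 1_000_000, True) for i in range(4)]
model = adapt_for_cooperation(backbone)
print(f"tunable fraction: {tunable_fraction(model):.4f}")
```

With these illustrative sizes, well under 1% of the parameters stay tunable, which is the kind of budget that makes "substantially fewer tunable parameters" plausible without touching the frozen backbone.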
Related papers
- RG-Attn: Radian Glue Attention for Multi-modality Multi-agent Cooperative Perception [12.90369816793173]
Vehicle-to-Everything (V2X) communication offers an optimal solution to overcome the perception limitations of single-agent systems.
We propose two different architectures, named Paint-To-Puzzle (PTP) and Co-Sketching-Co-Co, for conducting cooperative perception.
Our approach achieves state-of-the-art (SOTA) performance on both real and simulated cooperative perception datasets.
arXiv Detail & Related papers (2025-01-28T09:08:31Z)
- STAMP: Scalable Task And Model-agnostic Collaborative Perception [24.890993164334766]
STAMP is a task- and model-agnostic, collaborative perception pipeline for heterogeneous agents.
It minimizes computational overhead, enhances scalability, and preserves model security.
As a first-of-its-kind framework, STAMP aims to advance research in scalable and secure mobility systems towards Level 5 autonomy.
arXiv Detail & Related papers (2025-01-24T16:27:28Z)
- Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision [44.5080084219247]
This paper introduces multimodal pre-training models and incorporates adaptive multi-objective optimization tailored to support both human visual perception and machine vision simultaneously with a single bitstream.
The proposed Unified and Generalized Image Coding for Machine (UG-ICM) is capable of achieving remarkable improvements in various unseen machine analytics tasks.
arXiv Detail & Related papers (2025-01-08T15:48:30Z)
- Optimizing Small Language Models for In-Vehicle Function-Calling [4.148443557388842]
We propose a holistic approach for deploying Small Language Models (SLMs) as function-calling agents within vehicles as edge devices.
By leveraging SLMs, we simplify vehicle control mechanisms and enhance the user experience.
arXiv Detail & Related papers (2025-01-04T17:32:56Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory [64.11870454160614]
We propose an efficient Adaptive HOI Detector with Concept-guided Memory (ADA-CM)
ADA-CM has two operating modes. The first is a training-free paradigm that adapts the model without learning any new parameters.
Our proposed method achieves competitive results with state-of-the-art on the HICO-DET and V-COCO datasets with much less training time.
arXiv Detail & Related papers (2023-09-07T13:10:06Z)
- An Empirical Study of Multimodal Model Merging [148.48412442848795]
Model merging is a technique that fuses multiple models trained on different tasks to generate a multi-task solution.
We conduct our study for a novel goal where we can merge vision, language, and cross-modal transformers of a modality-specific architecture.
We propose two metrics that assess the distance between weights to be merged and can serve as an indicator of the merging outcomes.
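Metrics on the distance between weights to be merged can be sketched with two generic stand-ins, L2 distance and cosine similarity on flattened weight vectors. These are illustrative placeholders; the paper's actual metric definitions may differ.

```python
import math

# Illustrative stand-in metrics (not necessarily the paper's definitions):
# how far apart are two weight tensors, flattened to vectors?

def l2_distance(w_a, w_b):
    """Euclidean distance between two flattened weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w_a, w_b)))

def cosine_similarity(w_a, w_b):
    """Cosine of the angle between two flattened weight vectors."""
    dot = sum(a * b for a, b in zip(w_a, w_b))
    norm_a = math.sqrt(sum(a * a for a in w_a))
    norm_b = math.sqrt(sum(b * b for b in w_b))
    return dot / (norm_a * norm_b)

# Close weights -> small distance, high similarity: a rough signal that
# interpolation-style merging is more likely to preserve both models.
w_vision = [0.5, 1.0, -0.5]
w_language = [0.4, 1.1, -0.6]
print(l2_distance(w_vision, w_language), cosine_similarity(w_vision, w_language))
```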
arXiv Detail & Related papers (2023-04-28T15:43:21Z)
- eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We propose to direct effort toward efficient adaptation of existing models and to augment Language Models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
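The eP-ALM recipe — keep the language model and visual encoder frozen, train only one linear projection and one prepended token — can be sketched in miniature. This is a pure-Python stand-in, not the authors' code; the dimensions and initialization below are illustrative.

```python
import random

# Hypothetical sketch (not the eP-ALM code): the only trainable pieces
# are one linear projection and one prepended soft token; the language
# model and visual encoder stay frozen.

random.seed(0)
D_VISION, D_LM = 8, 16

# Trainable parameters: one projection matrix and one soft token.
proj = [[random.gauss(0, 0.02) for _ in range(D_LM)] for _ in range(D_VISION)]
soft_token = [0.0] * D_LM

def project(visual_feat):
    """Map a frozen visual feature into the LM embedding space."""
    return [sum(visual_feat[i] * proj[i][j] for i in range(D_VISION))
            for j in range(D_LM)]

def build_lm_input(visual_feat, text_embeddings):
    """Prepend the trainable token and the projected visual feature
    to the (frozen) text token embeddings."""
    return [soft_token, project(visual_feat)] + text_embeddings

visual_feat = [1.0] * D_VISION
text = [[0.5] * D_LM for _ in range(3)]
seq = build_lm_input(visual_feat, text)
print(len(seq))  # two extra positions ahead of the three text tokens
```

Everything outside `proj` and `soft_token` would stay frozen during training, which is how the trainable share can drop below 1% of total parameters.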
arXiv Detail & Related papers (2023-03-20T19:20:34Z)
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models can achieve superior performance on most NLP tasks due to large parameter capacity, but also lead to huge computation cost.
We explore accelerating large-model inference through conditional computation based on the sparse-activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
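The MoEfication idea — partition a feed-forward layer's hidden units into equal-size experts and compute only the experts a router selects — can be sketched as follows. This is a toy stand-in, not the paper's implementation; the sum-of-weights router is a placeholder for a learned router.

```python
# Hypothetical sketch (not the MoEfication code): split a feed-forward
# layer's hidden neurons into equal-size expert groups, then run only
# the experts a simple router selects. Total model size is unchanged;
# per-token compute drops to the chosen experts.

def moefy(w_in, n_experts):
    """Partition hidden units (rows of w_in) into contiguous experts."""
    per = len(w_in) // n_experts
    return [w_in[e * per:(e + 1) * per] for e in range(n_experts)]

def relu(xs):
    return [max(0.0, v) for v in xs]

def run_expert(expert_w, x):
    # One hidden activation per neuron: dot(row, x), then ReLU.
    return relu([sum(wi * xi for wi, xi in zip(row, x)) for row in expert_w])

def moe_forward(experts, x, top_k=1):
    """Route to the top_k experts with the largest total weight mass
    (a crude placeholder for a learned router); zero the rest."""
    scores = [sum(sum(row) for row in e) for e in experts]
    chosen = sorted(range(len(experts)), key=lambda i: -scores[i])[:top_k]
    hidden = []
    for i, e in enumerate(experts):
        hidden += run_expert(e, x) if i in chosen else [0.0] * len(e)
    return hidden

w_in = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 0.0]]  # 4 hidden units
experts = moefy(w_in, n_experts=2)
print(moe_forward(experts, [1.0, 1.0], top_k=1))
```

Only the selected expert's neurons are evaluated; the skipped expert's outputs are emitted as zeros, which is exactly the sparse-activation pattern the method exploits.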
arXiv Detail & Related papers (2021-10-05T02:14:38Z)
- Centralized Model and Exploration Policy for Multi-Agent RL [13.661446184763117]
Reinforcement learning in partially observable, fully cooperative multi-agent settings (Dec-POMDPs) can be used to address many real-world challenges.
Current RL algorithms for Dec-POMDPs suffer from poor sample complexity.
We propose a model-based algorithm, MARCO, and evaluate it on three cooperative communication tasks, where it improves sample efficiency by up to 20x.
arXiv Detail & Related papers (2021-07-14T00:34:08Z)
- UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the multi-agent task's decision process more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.