Software-Hardware Co-optimization for Modular E2E AV Paradigm: A Unified Framework of Optimization Approaches, Simulation Environment and Evaluation Metrics
- URL: http://arxiv.org/abs/2601.07393v1
- Date: Mon, 12 Jan 2026 10:22:50 GMT
- Title: Software-Hardware Co-optimization for Modular E2E AV Paradigm: A Unified Framework of Optimization Approaches, Simulation Environment and Evaluation Metrics
- Authors: Chengzhi Ji, Xingfeng Li, Zhaodong Lv, Hao Sun, Pan Liu, Hao Frank Yang, Ziyuan Pu,
- Abstract summary: This paper proposes a reusable software and hardware co-optimization and closed-loop evaluation framework for ME2E autonomous driving inference.<n>The proposed framework preserves baseline-level driving performance while significantly reducing inference latency and energy consumption.
- Score: 21.03304462504213
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modular end-to-end (ME2E) autonomous driving paradigms combine modular interpretability with global optimization capability and have demonstrated strong performance. However, existing studies mainly focus on accuracy improvement, while critical system-level factors such as inference latency and energy consumption are often overlooked, resulting in increasingly complex model designs that hinder practical deployment. Prior efforts on model compression and acceleration typically optimize either the software or hardware side in isolation. Software-only optimization cannot fundamentally remove intermediate tensor access and operator scheduling overheads, whereas hardware-only optimization is constrained by model structure and precision. As a result, the real-world benefits of such optimizations are often limited. To address these challenges, this paper proposes a reusable software and hardware co-optimization and closed-loop evaluation framework for ME2E autonomous driving inference. The framework jointly integrates software-level model optimization with hardware-level computation optimization under a unified system-level objective. In addition, a multidimensional evaluation metric is introduced to assess system performance by jointly considering safety, comfort, efficiency, latency, and energy, enabling quantitative comparison of different optimization strategies. Experiments across multiple ME2E autonomous driving stacks show that the proposed framework preserves baseline-level driving performance while significantly reducing inference latency and energy consumption, achieving substantial overall system-level improvements. These results demonstrate that the proposed framework provides practical and actionable guidance for efficient deployment of ME2E autonomous driving systems.
Related papers
- AutoSizer: Automatic Sizing of Analog and Mixed-Signal Circuits via Large Language Model (LLM) Agents [4.28109763423665]
AutoSizer is a meta-optimization framework that unifies circuit understanding, adaptive search-space construction, and optimization orchestration in a closed loop.<n>It experimentally achieves higher solution quality, faster convergence, and higher success rate across varying circuit difficulties.
arXiv Detail & Related papers (2026-02-02T21:51:55Z) - Relation-Aware Bayesian Optimization of DBMS Configurations Guided by Affinity Scores [2.474203056060563]
Database Management Systems (DBMSs) are fundamental for managing large-scale and heterogeneous data, and their performance is critically influenced by configuration parameters.<n>Recent research has focused on automated configuration optimization using machine learning; however, existing approaches still exhibit several key limitations.<n>We propose RelTune, a novel framework that represents parameter dependencies as a Graph and learns GNN-based latent embeddings that encode performancerelevant semantics.
arXiv Detail & Related papers (2025-10-31T03:46:42Z) - AwareCompiler: Agentic Context-Aware Compiler Optimization via a Synergistic Knowledge-Data Driven Framework [42.57224438231615]
This paper introduces textbfAwareCompiler, an agentic framework for compiler optimization.<n>Three key innovations: structured knowledge integration and dataset construction, knowledge-driven adaptive pass generation, and data-driven hybrid training pipeline.<n> Experimental results on standard benchmarks demonstrate that AwareCompiler significantly outperforms existing baselines in both performance and efficiency.
arXiv Detail & Related papers (2025-10-13T02:02:36Z) - A Modular Algorithm for Non-Stationary Online Convex-Concave Optimization [10.66415966839617]
The goal is to minimize the dynamic duality gap (D-DGap), a critical performance measure.<n>Existing algorithms fail to deliver optimal performance, particularly in stationary or predictable environments.<n>Our algorithm achieves a minimax optimal D-DGap upper bound, up to a logarithmic factor, while also ensuring prediction error-driven D-DGap bounds.
arXiv Detail & Related papers (2025-09-09T16:33:38Z) - Reflection-Enhanced Meta-Optimization Integrating TextGrad-style Prompt Optimization with Memory-Driven Self-Evolution [0.0]
We propose a framework that integrates a memory-augmented Reflection RetrievalRAG module and a Self-Adaptive meta-controller.<n>REMO achieves more stable and robust tuning, albeit at the cost of increased computational overhead.
arXiv Detail & Related papers (2025-08-26T07:25:45Z) - LLM4CMO: Large Language Model-aided Algorithm Design for Constrained Multiobjective Optimization [54.35609820607923]
Large language models (LLMs) offer new opportunities for assisting with algorithm design.<n>We propose LLM4CMO, a novel CMOEA based on a dual-population, two-stage framework.<n>LLMs can serve as efficient co-designers in the development of complex evolutionary optimization algorithms.
arXiv Detail & Related papers (2025-08-16T02:00:57Z) - Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards [98.0983122471875]
We propose Optimas, a unified framework for effective optimization of compound systems.<n>In each iteration, Optimas efficiently adapts the Local Reward Function (LRF) to maintain this property while simultaneously maximizing each component's local reward.<n>We present extensive evaluations across five real-world compound systems to demonstrate that Optimas outperforms strong baselines by an average improvement of 11.92%.
arXiv Detail & Related papers (2025-07-03T07:12:48Z) - SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving [51.47621083057114]
SOLVE is an innovative framework that synergizes Vision-Language Models with end-to-end (E2E) models to enhance autonomous vehicle planning.<n>Our approach emphasizes knowledge sharing at the feature level through a shared visual encoder, enabling comprehensive interaction between VLM and E2E components.
arXiv Detail & Related papers (2025-05-22T15:44:30Z) - A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation.<n> deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency.<n>This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z) - Transformer-based Model Predictive Control: Trajectory Optimization via Sequence Modeling [16.112708478263745]
We present a unified framework combine the main strengths of optimization-based methods for learning.
Our approach entails embedding high-capacity, transformer-based neural network models within optimization process.
Compared to purely optimization-based approaches, results show that our approach can improve performance by up to 75%.
arXiv Detail & Related papers (2024-10-31T13:23:10Z) - DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving.<n>Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction and iterative motion planner.<n>Experiments conducted on nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z) - Accelerated Federated Learning with Decoupled Adaptive Optimization [53.230515878096426]
federated learning (FL) framework enables clients to collaboratively learn a shared model while keeping privacy of training data on clients.
Recently, many iterations efforts have been made to generalize centralized adaptive optimization methods, such as SGDM, Adam, AdaGrad, etc., to federated settings.
This work aims to develop novel adaptive optimization methods for FL from the perspective of dynamics of ordinary differential equations (ODEs)
arXiv Detail & Related papers (2022-07-14T22:46:43Z) - Optimization-Inspired Learning with Architecture Augmentations and
Control Mechanisms for Low-Level Vision [74.9260745577362]
This paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC) principles.
We construct three propagative modules to effectively solve the optimization models with flexible combinations.
Experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC.
arXiv Detail & Related papers (2020-12-10T03:24:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.