Chart2Code-MoLA: Efficient Multi-Modal Code Generation via Adaptive Expert Routing
- URL: http://arxiv.org/abs/2511.23321v1
- Date: Fri, 28 Nov 2025 16:23:04 GMT
- Title: Chart2Code-MoLA: Efficient Multi-Modal Code Generation via Adaptive Expert Routing
- Authors: Yifei Wang, Jacky Keung, Zhenyu Mao, Jingyu Zhang, Yuchen Cao,
- Abstract summary: C2C-MoLA is a framework that synergizes Mixture of Experts (MoE) with Low-Rank Adaptation (LoRA). LoRA enables parameter-efficient updates for resource-conscious tuning. Experiments on Chart2Code-160k show that the proposed model improves generation accuracy by up to 17%.
- Score: 20.521717930460692
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chart-to-code generation is a critical task in automated data visualization, translating complex chart structures into executable programs. While recent Multi-modal Large Language Models (MLLMs) improve chart representation, existing approaches still struggle to achieve cross-type generalization, memory efficiency, and modular design. To address these challenges, this paper proposes C2C-MoLA, a multimodal framework that synergizes Mixture of Experts (MoE) with Low-Rank Adaptation (LoRA). The MoE component uses a complexity-aware routing mechanism with domain-specialized experts and load-balanced sparse gating, dynamically allocating inputs based on learnable structural metrics like element count and chart complexity. LoRA enables parameter-efficient updates for resource-conscious tuning, further supported by a tailored training strategy that aligns routing stability with semantic accuracy. Experiments on Chart2Code-160k show that the proposed model improves generation accuracy by up to 17%, reduces peak GPU memory by 18%, and accelerates convergence by 20%, when compared to standard fine-tuning and LoRA-only baselines, particularly on complex charts. Ablation studies validate optimal designs, such as 8 experts and rank-8 LoRA, and confirm scalability for real-world multimodal code generation.
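The complexity-aware routing the abstract describes can be sketched compactly. The following NumPy toy is purely illustrative and is not the paper's implementation: the gate features, weight shapes, and the load-balancing term are assumptions based on standard MoE/LoRA practice, using the 8-expert, rank-8 configuration the ablations favour. The structural metrics (`element_count`, `chart_complexity`) are appended to the gate input, mirroring the "learnable structural metrics" idea.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, n_experts, top_k = 16, 8, 8, 2  # rank-8 LoRA, 8 experts (the paper's ablation choices)

W = rng.standard_normal((d, d)) * 0.02            # frozen base projection
experts = [(rng.standard_normal((d, r)) * 0.02,   # LoRA A_i
            np.zeros((r, d)))                     # LoRA B_i (zero-initialised, as is conventional)
           for _ in range(n_experts)]
W_gate = rng.standard_normal((d + 2, n_experts)) * 0.02  # gate sees features + 2 structural metrics

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_lora_forward(x, element_count, chart_complexity):
    # Gate input: token features concatenated with chart-structure metrics
    g_in = np.concatenate([x, [element_count, chart_complexity]])
    probs = softmax(g_in @ W_gate)
    top = np.argsort(probs)[-top_k:]              # sparse top-k routing
    weights = probs[top] / probs[top].sum()       # renormalise over the selected experts
    out = x @ W                                   # frozen base path
    for w_i, i in zip(weights, top):
        A, B = experts[i]
        out += w_i * (x @ A @ B)                  # low-rank expert update
    return out, probs

x = rng.standard_normal(d)
y, probs = moe_lora_forward(x, element_count=12.0, chart_complexity=0.7)
# One common auxiliary load-balancing term (encourages uniform expert usage):
load_balance_loss = n_experts * float(np.sum(probs ** 2))
```

Because the `B_i` matrices start at zero, the expert path initially contributes nothing and the model begins as the frozen base, which is the usual LoRA warm-start behaviour.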
Related papers
- Architecture-Aware Multi-Design Generation for Repository-Level Feature Addition [53.50448142467294]
RAIM is a multi-design and architecture-aware framework for repository-level feature addition. It shifts away from linear patching by generating multiple diverse implementation designs. Experiments on the NoCode-bench Verified dataset demonstrate that RAIM establishes a new state-of-the-art performance.
arXiv Detail & Related papers (2026-03-02T12:50:40Z)
- ChartM$^3$: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension [15.798942458550515]
This study proposes an automated multi-stage code-driven pipeline for generating visual reasoning datasets. We construct ChartM$^3$, a multi-dimensional and multi-step dataset containing 38K charts and 142K Q&A pairs for training, along with 2,871 high-quality evaluation samples.
arXiv Detail & Related papers (2025-11-04T09:45:34Z)
- L-MoE: End-to-End Training of a Lightweight Mixture of Low-Rank Adaptation Experts [10.21556794551883]
We present L-MoE: a Lightweight Mixture of LoRA Experts. L-MoE redefines MoE experts as task-specialized, low-rank adapters. We present the formal mathematical framework for L-MoE.
arXiv Detail & Related papers (2025-10-19T08:44:25Z)
- DynaSwarm: Dynamically Graph Structure Selection for LLM-based Multi-agent System [0.276240219662896]
DynaSwarm is a dynamic framework that enhances multi-agent systems. It uses an actor-critic reinforcement learning mechanism to optimize graph structures. It also has a dynamic graph selector that adaptively chooses the optimal graph structure for each input sample.
arXiv Detail & Related papers (2025-07-31T05:52:30Z)
- Dynamic Acoustic Model Architecture Optimization in Training for ASR [51.21112094223223]
DMAO is an architecture optimization framework that employs a grow-and-drop strategy to automatically reallocate parameters during training. We evaluate DMAO through experiments with CTC on the LibriSpeech, TED-LIUM-v2 and Switchboard datasets.
arXiv Detail & Related papers (2025-06-16T07:47:34Z)
- Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution [88.20464308588889]
We propose a Structural Similarity-Inspired Unfolding (SSIU) method for efficient image SR. This method is designed through unfolding an SR optimization function constrained by structural similarity. Our model outperforms current state-of-the-art models, boasting lower parameter counts and reduced memory consumption.
arXiv Detail & Related papers (2025-06-13T14:29:40Z)
- Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition [95.54406667705999]
Pangu Embedded is an efficient Large Language Model (LLM) reasoner developed on Ascend Neural Processing Units (NPUs). It addresses the significant computational costs and inference latency challenges prevalent in existing reasoning-optimized LLMs. It delivers rapid responses and state-of-the-art reasoning quality within a single, unified model architecture.
arXiv Detail & Related papers (2025-05-28T14:03:02Z)
- Boosting Chart-to-Code Generation in MLLM via Dual Preference-Guided Refinement [16.22363384653305]
Multimodal Large Language Models (MLLMs) perform fine-grained visual parsing, precise code synthesis, and robust cross-modal reasoning. We propose a dual preference-guided refinement framework that combines a feedback-driven, dual-modality reward mechanism with iterative preference learning. Our framework significantly enhances the performance of general-purpose open-source MLLMs, enabling them to generate high-quality plotting code.
arXiv Detail & Related papers (2025-04-03T07:51:20Z)
- METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling [101.69125547798514]
We build a vision-language model (VLM) based multi-agent framework for effective automatic chart generation. We propose METAL, a multi-agent framework that decomposes the task of chart generation into the iterative collaboration among specialized agents.
arXiv Detail & Related papers (2025-02-24T21:01:39Z)
- Towards Generalizable Trajectory Prediction Using Dual-Level Representation Learning And Adaptive Prompting [107.4034346788744]
Existing vehicle trajectory prediction models struggle with generalizability, prediction uncertainties, and handling complex interactions. We propose Perceiver with Register queries (PerReg+), a novel trajectory prediction framework that introduces: (1) Dual-Level Representation Learning via Self-Distillation (SD) and Masked Reconstruction (MR), capturing global context and fine-grained details; (2) Enhanced Multimodality using register-based queries and pretraining, eliminating the need for clustering and suppression; and (3) Adaptive Prompt Tuning during fine-tuning, freezing the main architecture and optimizing a small number of prompts for efficient adaptation.
arXiv Detail & Related papers (2025-01-08T20:11:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.