NL2Dashboard: A Lightweight and Controllable Framework for Generating Dashboards with LLMs
- URL: http://arxiv.org/abs/2601.06126v1
- Date: Sun, 04 Jan 2026 13:26:04 GMT
- Title: NL2Dashboard: A Lightweight and Controllable Framework for Generating Dashboards with LLMs
- Authors: Boshen Shi, Kexin Yang, Yuanbo Yang, Guanguang Chang, Ce Chi, Zhendong Wang, Xing Wang, Junlan Feng,
- Abstract summary: We propose NL2Dashboard, a lightweight framework for dashboard generation.<n>We introduce a structured intermediate representation (IR) that encapsulates the dashboard's content, layout, and visual elements.<n>Building upon this framework, we develop a multi-agent system in which the IR-driven algorithm is instantiated as a suite of tools.
- Score: 38.148675833251325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Large Language Models (LLMs) have demonstrated remarkable proficiency in generating standalone charts, synthesizing comprehensive dashboards remains a formidable challenge. Existing end-to-end paradigms, which typically treat dashboard generation as a direct code generation task (e.g., raw HTML), suffer from two fundamental limitations: representation redundancy due to massive tokens spent on visual rendering, and low controllability caused by the entanglement of analytical reasoning and presentation. To address these challenges, we propose NL2Dashboard, a lightweight framework grounded in the principle of Analysis-Presentation Decoupling. We introduce a structured intermediate representation (IR) that encapsulates the dashboard's content, layout, and visual elements. Therefore, it confines the LLM's role to data analysis and intent translation, while offloading visual synthesis to a deterministic rendering engine. Building upon this framework, we develop a multi-agent system in which the IR-driven algorithm is instantiated as a suite of tools. Comprehensive experiments conducted with this system demonstrate that NL2Dashboard significantly outperforms state-of-the-art baselines across diverse domains, achieving superior visual quality, significantly higher token efficiency, and precise controllability in both generation and modification tasks.
Related papers
- LeGend: A Data-Driven Framework for Lemma Generation in Hardware Model Checking [3.5968163492929346]
Property checking of RTL designs is a central task in formal verification.<n>We introduce LeGend, which replaces a per-clause graph analysis paradigm with one-time global representation learning.<n> Experiments show LeGend accelerates two state-of-the-art IC3/PDR engines across a diverse set of benchmarks.
arXiv Detail & Related papers (2026-02-27T13:34:48Z) - ViTCoP: Accelerating Large Vision-Language Models via Visual and Textual Semantic Collaborative Pruning [8.933549837045932]
Large Vision-Language Models incur high computational costs due to significant redundancy in their visual tokens.<n>We propose a Visual and Textual Semantic Collaborative Pruning framework (ViTCoP) that combines redundancy filtering in the vision encoder with step-wise co-pruning within the Large Language Models.
arXiv Detail & Related papers (2026-01-25T12:47:30Z) - Revisiting Multi-Task Visual Representation Learning [52.93947931352643]
We introduce MTV, a principled multi-task visual pretraining framework.<n>We leverage high-capacity "expert" models to synthesize dense, structured pseudo-labels at scale.<n>Our results demonstrate that MTV achieves "best-of-both-worlds" performance.
arXiv Detail & Related papers (2026-01-20T11:59:19Z) - OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent [58.07447442040785]
We introduce OS-Symphony, a holistic framework that comprises an Orchestrator coordinating two key innovations for robust automation.<n>Results demonstrate that OS-Symphony delivers substantial performance gains across varying model scales.
arXiv Detail & Related papers (2026-01-12T17:55:51Z) - Generative Editing in the Joint Vision-Language Space for Zero-Shot Composed Image Retrieval [11.724675700368316]
Composed Image Retrieval (CIR) enables fine-grained visual search by combining a reference image with a textual modification.<n>We propose Fusion-Diff, a novel generative editing framework with high effectiveness and data efficiency designed for multimodal alignment.
arXiv Detail & Related papers (2025-12-01T13:04:55Z) - LTD-Bench: Evaluating Large Language Models by Letting Them Draw [57.237152905238084]
LTD-Bench is a breakthrough benchmark for large language models (LLMs)<n>It transforms LLM evaluation from abstract scores to directly observable visual outputs by requiring models to generate drawings through dot matrices or executable code.<n> LTD-Bench's visual outputs enable powerful diagnostic analysis, offering a potential approach to investigate model similarity.
arXiv Detail & Related papers (2025-11-04T08:11:23Z) - VisuRiddles: Fine-grained Perception is a Primary Bottleneck for Multimodal Large Language Models in Abstract Visual Reasoning [70.44416154144001]
Recent strides in multimodal large language models (MLLMs) have significantly advanced their performance in many reasoning tasks.<n> Abstract Visual Reasoning (AVR) remains a critical challenge, primarily due to limitations in perceiving abstract graphics.<n>We propose VisuRiddles, a benchmark for PRS, featuring tasks meticulously constructed to assess models' reasoning capacities.<n>Second, we introduce the Perceptual Riddle Synthesizer (PRS), an automated framework for generating riddles with fine-grained perceptual descriptions.
arXiv Detail & Related papers (2025-06-03T07:24:00Z) - Diffusion Augmented Retrieval: A Training-Free Approach to Interactive Text-to-Image Retrieval [7.439049772394586]
Diffusion Augmented Retrieval (DAR) is a framework that generates multiple intermediate representations via dialogue refinements and DMs.<n>DAR results on par with finetuned I-TIR models, yet without incurring their tuning overhead.
arXiv Detail & Related papers (2025-01-26T03:29:18Z) - EMMA: Efficient Visual Alignment in Multi-Modal LLMs [56.03417732498859]
EMMA is a lightweight cross-modality module designed to efficiently fuse visual and textual encodings.<n>EMMA boosts performance across multiple tasks by up to 9.3% while significantly improving robustness against hallucinations.
arXiv Detail & Related papers (2024-10-02T23:00:31Z) - Multi-View Empowered Structural Graph Wordification for Language Models [12.22063024099311]
We introduce an end-to-end modality-aligning framework for LLM-graph alignment: Dual-Residual Vector Quantized-Variational AutoEncoder, namely Dr.E.<n>Our approach is purposefully designed to facilitate token-level alignment with LLMs, enabling an effective translation of the intrinsic'of graphs into comprehensible natural language.<n>Our framework ensures certain visual interpretability, efficiency, and robustness, marking the promising successful endeavor to achieve token-level alignment between LLMs and GNNs.
arXiv Detail & Related papers (2024-06-19T16:43:56Z) - EfficientLPS: Efficient LiDAR Panoptic Segmentation [30.249379810530165]
We present the novel Efficient LiDAR Panoptic architecture that addresses multiple challenges in segmenting LiDAR point clouds.
EfficientLPS comprises of a novel shared backbone that encodes with strengthened geometric transformation modeling capacity.
We benchmark our proposed model on two large-scale LiDAR datasets.
arXiv Detail & Related papers (2021-02-16T08:14:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.