Related papers: Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation

Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation

URL: http://arxiv.org/abs/2407.08473v1
Date: Thu, 11 Jul 2024 13:10:09 GMT
Title: Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation
Authors: Kaiyan Chang, Zhirong Chen, Yunhao Zhou, Wenlong Zhu, kun wang, Haobo Xu, Cangyuan Li, Mengdi Wang, Shengwen Liang, Huawei Li, Yinhe Han, Ying Wang,
Abstract summary: We introduce an open-source benchmark for multi-modal generative models tailored for Verilog synthesis from visual-linguistic inputs. We also introduce an open-source visual and natural language Verilog query language framework. Our results demonstrate a significant improvement in the multi-modal generated Verilog compared to queries based solely on natural language.
Score: 37.309663295844835
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Natural language interfaces have exhibited considerable potential in the automation of Verilog generation derived from high-level specifications through the utilization of large language models, garnering significant attention. Nevertheless, this paper elucidates that visual representations contribute essential contextual information critical to design intent for hardware architectures possessing spatial complexity, potentially surpassing the efficacy of natural-language-only inputs. Expanding upon this premise, our paper introduces an open-source benchmark for multi-modal generative models tailored for Verilog synthesis from visual-linguistic inputs, addressing both singular and complex modules. Additionally, we introduce an open-source visual and natural language Verilog query language framework to facilitate efficient and user-friendly multi-modal queries. To evaluate the performance of the proposed multi-modal hardware generative AI in Verilog generation tasks, we compare it with a popular method that relies solely on natural language. Our results demonstrate a significant accuracy improvement in the multi-modal generated Verilog compared to queries based solely on natural language. We hope to reveal a new approach to hardware design in the large-hardware-design-model era, thereby fostering a more diversified and productive approach to hardware design.

Related papers

QiMeng-CRUX: Narrowing the Gap between Natural Language and Verilog via Core Refined Understanding eXpression [48.84841760215598]
Large language models (LLMs) have shown promising capabilities in hardware description language (HDL) generation.<n>Existing approaches often rely on free-form natural language descriptions that are often ambiguous, redundant, and unstructured.<n>We treat hardware code generation as a complex transformation from an open-ended natural language space to a domain-specific, highly constrained target space.<n>We introduce Core Refined Understanding eXpression (CRUX), a structured intermediate space that captures the essential semantics of user intent while organizing the expression for precise Verilog code generation.
arXiv Detail & Related papers (2025-11-25T09:17:32Z)
RetrieveAll: A Multilingual Named Entity Recognition Framework with Large Language Models [7.867158538366131]
Existing multilingual NER methods face severe language interference during the multi-language adaptation process.<n>We propose RetrieveAll, a universal multilingual NER framework based on dynamic LoRA.<n>We introduce a cross-granularity knowledge augmented method that fully exploits the intrinsic potential of the data.
arXiv Detail & Related papers (2025-05-25T12:52:18Z)
Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality [74.59049806800176]
This demo paper highlights the Tevatron toolkit's key features, bridging academia and industry.<n>We showcase a unified dense retriever achieving strong multilingual and multimodal effectiveness.<n>We also release OmniEmbed, to the best of our knowledge, the first embedding model that unifies text, image document, video, and audio retrieval.
arXiv Detail & Related papers (2025-05-05T08:52:49Z)
Building A Unified AI-centric Language System: analysis, framework and future work [0.0]
This paper explores the design of a unified AI-centric language system. We propose a framework that translates diverse natural language inputs into a streamlined AI-friendly language.
arXiv Detail & Related papers (2025-02-06T20:32:57Z)
Boosting the Capabilities of Compact Models in Low-Data Contexts with Large Language Models and Retrieval-Augmented Generation [2.9921619703037274]
We propose a retrieval augmented generation (RAG) framework backed by a large language model (LLM) to correct the output of a smaller model for the linguistic task of morphological glossing. We leverage linguistic information to make up for the lack of data and trainable parameters, while allowing for inputs from written descriptive grammars interpreted and distilled through an LLM. We show that a compact, RAG-supported model is highly effective in data-scarce settings, achieving a new state-of-the-art for this task and our target languages.
arXiv Detail & Related papers (2024-10-01T04:20:14Z)
ARPA: A Novel Hybrid Model for Advancing Visual Word Disambiguation Using Large Language Models and Transformers [1.6541870997607049]
We present ARPA, an architecture that fuses the unparalleled contextual understanding of large language models with the advanced feature extraction capabilities of transformers. ARPA's introduction marks a significant milestone in visual word disambiguation, offering a compelling solution. We invite researchers and practitioners to explore the capabilities of our model, envisioning a future where such hybrid models drive unprecedented advancements in artificial intelligence.
arXiv Detail & Related papers (2024-08-12T10:15:13Z)
LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed towards understanding the internal mechanisms of large vision-language models. Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer. We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z)
Engineering A Large Language Model From Scratch [0.0]
Atinuke is a Transformer-based neural network that optimises performance across various language tasks. It can emulate human-like language by extracting features and learning complex mappings. System achieves state-of-the-art results on natural language tasks whilst remaining interpretable and robust.
arXiv Detail & Related papers (2024-01-30T04:29:48Z)
TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild [102.93338424976959]
We introduce TextBind, an almost annotation-free framework for empowering larger language models with the multi-turn interleaved instruction-following capabilities. Our approach requires only image-caption pairs and generates multi-turn multimodal instruction-response conversations from a language model. To accommodate interleaved image-text inputs and outputs, we devise MIM, a language model-centric architecture that seamlessly integrates image encoder and decoder models.
arXiv Detail & Related papers (2023-09-14T15:34:01Z)
Efficient Spoken Language Recognition via Multilabel Classification [53.662747523872305]
We show that our models obtain competitive results while being orders of magnitude smaller and faster than current state-of-the-art methods. Our multilabel strategy is more robust to unseen non-target languages compared to multiclass classification.
arXiv Detail & Related papers (2023-06-02T23:04:19Z)
Multimodal Conditionality for Natural Language Generation [0.0]
MAnTiS is a general approach for multimodal conditionality in transformer-based Natural Language Generation models. We apply MAnTiS to the task of product description generation, conditioning a network on both product images and titles to generate descriptive text.
arXiv Detail & Related papers (2021-09-02T22:06:07Z)
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition [54.92161571089808]
Cross-lingual NER transfers knowledge from rich-resource language to languages with low resources. Existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages. We develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning.
arXiv Detail & Related papers (2021-06-01T05:46:22Z)
Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing. We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
arXiv Detail & Related papers (2020-06-22T21:56:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.