Can Low-Rank Knowledge Distillation in LLMs be Useful for Microelectronic Reasoning?
- URL: http://arxiv.org/abs/2406.13808v3
- Date: Thu, 27 Jun 2024 06:37:21 GMT
- Title: Can Low-Rank Knowledge Distillation in LLMs be Useful for Microelectronic Reasoning?
- Authors: Nirjhor Rouf, Fin Amin, Paul D. Franzon
- Abstract summary: We present empirical results regarding the feasibility of using offline large language models (LLMs) in the context of electronic design automation (EDA).
The goal is to investigate and evaluate a contemporary language model's (Llama-2-7B) ability to function as a microelectronic Q & A expert.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this work, we present empirical results regarding the feasibility of using offline large language models (LLMs) in the context of electronic design automation (EDA). The goal is to investigate and evaluate a contemporary language model's (Llama-2-7B) ability to function as a microelectronic Q & A expert, as well as its reasoning and generation capabilities in solving microelectronics-related problems. Llama-2-7B was tested across a variety of adaptation methods, including a novel low-rank knowledge distillation (LoRA-KD) scheme introduced in this work. Our experiments produce both qualitative and quantitative results.
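The abstract names the LoRA-KD scheme but does not spell out its mechanics. As a point of reference only, the sketch below shows one plausible reading, assuming a frozen pretrained projection augmented with a trainable low-rank adapter and trained against a teacher's softened output distribution; the module names, rank, and temperature are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a LoRA-style adapter plus a standard distillation loss.
# Illustrative only: rank, scaling, and temperature are assumed values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer with a trainable low-rank update W + (alpha/r) * B A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                       # keep pretrained weights frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """Soft-label KL divergence between teacher and student token distributions."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)
```

In such a setup only the low-rank matrices A and B receive gradients, which keeps the number of trainable parameters small while the distillation loss transfers the teacher's behaviour.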
Related papers
- Fine-Tuning Large Language Models Using EEG Microstate Features for Mental Workload Assessment [0.0]
This study explores the intersection of electroencephalography (EEG) microstates and Large Language Models (LLMs). The research aims to fine-tune LLMs for improved predictions of distinct cognitive states, specifically 'Rest' and 'Load'.
arXiv Detail & Related papers (2025-08-10T10:43:09Z) - From Text to Trajectories: GPT-2 as an ODE Solver via In-Context Learning [44.198609457344574]
In-Context Learning (ICL) has emerged as a new paradigm in large language models (LLMs). This paper investigates whether LLMs can solve ordinary differential equations (ODEs) under the ICL setting. Experiments on two types of ODEs show that GPT-2 can effectively learn a meta-ODE algorithm, with convergence behavior comparable to, or better than, the Euler method.
arXiv Detail & Related papers (2025-08-05T03:16:37Z) - Expert-Guided LLM Reasoning for Battery Discovery: From AI-Driven Hypothesis to Synthesis and Characterization [47.97016882216093]
Large language models (LLMs) leverage chain-of-thought (CoT) techniques to tackle complex problems. We introduce ChatBattery, a novel agentic framework that integrates domain knowledge to steer LLMs toward more effective reasoning in materials design. We successfully identify, synthesize, and characterize three novel lithium-ion battery cathode materials, which achieve practical capacity improvements of 28.8%, 25.2%, and 18.5%, respectively.
arXiv Detail & Related papers (2025-07-21T23:46:11Z) - LLM-based AI Agent for Sizing of Analog and Mixed Signal Circuit [2.979579757819132]
Large Language Models (LLMs) have demonstrated significant potential across various fields.
In this work, we propose an LLM-based AI agent for AMS circuit design to assist in the sizing process.
arXiv Detail & Related papers (2025-04-14T22:18:16Z) - Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications.
The dataset includes a diverse set of multi-hop questions, covering true/false and multiple-choice formats and spanning difficulty levels from easy to hard.
We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
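The summary names the PVI criterion without stating it. In the usual pointwise V-information formulation, the score of an example is the gain in log-likelihood of the target when the model may condition on the input; the sketch below assumes the two log-probabilities have already been computed and only illustrates the quantity, not the paper's pipeline.

```python
import math

def pointwise_v_information(logprob_with_input: float, logprob_null_input: float) -> float:
    """PVI(x -> y) = log2 p(y | x) - log2 p(y | null input), with inputs given in nats.

    Higher PVI means the input carries more usable information about the target
    for the model family; low or negative PVI flags uninformative or noisy
    training examples.
    """
    return (logprob_with_input - logprob_null_input) / math.log(2)

# Example: conditioning on the question raises the answer's log-probability
# from -4.0 nats to -1.0 nats, i.e. roughly 4.3 bits of pointwise V-information.
print(pointwise_v_information(-1.0, -4.0))
```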
arXiv Detail & Related papers (2025-01-16T16:19:53Z) - DropMicroFluidAgents (DMFAs): Autonomous Droplet Microfluidic Research Framework Through Large Language Model Agents [0.6827423171182153]
This study demonstrates the effective use of Large language models (LLMs) in droplet microfluidics research.
The integration of DMFAs with the LLAMA3.1 model yielded the highest accuracy of 76.15%.
These capabilities enable their application across education and industrial support, driving greater efficiency in scientific discovery and innovation.
arXiv Detail & Related papers (2024-12-30T11:58:52Z) - ElectroVizQA: How well do Multi-modal LLMs perform in Electronics Visual Question Answering? [6.471546061182191]
This paper rigorously assesses the extent to which MLLMs can understand and solve digital electronic circuit questions.
By introducing this benchmark dataset, we aim to motivate further research and development in the application of MLLMs to engineering education.
arXiv Detail & Related papers (2024-11-27T20:25:07Z) - LLAVADI: What Matters For Multimodal Large Language Models Distillation [77.73964744238519]
In this work, we do not propose a new efficient model structure or train small-scale MLLMs from scratch.
Our studies involve training strategies, model choices, and distillation algorithms in the knowledge distillation process.
Evaluations across different benchmarks show that, with the proper strategy, even a 2.7B small-scale model can perform on par with larger 7B or 13B models.
arXiv Detail & Related papers (2024-07-28T06:10:47Z) - Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation [11.778576032848482]
This work enhances such models by improving their inference and evaluation capabilities with minimal or no additional training.
We reveal intriguing insights into the behaviour and suitability of such methods while significantly surpassing state-of-the-art models.
We propose a novel atomic-level evaluation method leveraging off-the-shelf Natural Language Inference (NLI) models for use in the unseen chemical domain.
arXiv Detail & Related papers (2024-05-22T20:40:53Z) - Investigating Automatic Scoring and Feedback using Large Language Models [46.1232919707345]
This paper explores the efficacy of PEFT-based quantized models, employing classification or regression head, to fine-tune language models for automatic grading and feedback generation.
The results show that grade-score predictions from fine-tuned LLMs are highly accurate, achieving less than 3% average error in grade percentage.
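As an aside on what a regression head over a language model can look like, the sketch below pools the model's final hidden states and maps them to a grade percentage; the pooling, layer sizes, and scaling are assumptions for illustration, not the paper's design.

```python
# Illustrative regression head for grade prediction on top of a (frozen or
# PEFT-adapted) language model's hidden states. Names and sizes are assumed.
import torch
import torch.nn as nn

class GradeRegressionHead(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(hidden_size, hidden_size // 2),
            nn.GELU(),
            nn.Linear(hidden_size // 2, 1),
            nn.Sigmoid(),                       # grade expressed as a fraction in [0, 1]
        )

    def forward(self, last_hidden_state, attention_mask):
        # Mean-pool token representations, ignoring padding positions.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-6)
        return self.head(pooled).squeeze(-1) * 100.0   # grade percentage
```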
arXiv Detail & Related papers (2024-05-01T16:13:54Z) - ELAD: Explanation-Guided Large Language Models Active Distillation [16.243249111524403]
The deployment and application of Large Language Models (LLMs) is hindered by their memory inefficiency, computational demands, and the high costs of API inferences.
Traditional distillation methods, which transfer the capabilities of LLMs to smaller models, often fail to determine whether the knowledge has been sufficiently transferred.
We propose an Explanation-Guided LLMs Active Distillation (ELAD) framework that employs an active learning strategy to optimize the balance between annotation costs and model performance.
arXiv Detail & Related papers (2024-02-20T15:47:59Z) - From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, identifies discrepancies between a model's expected responses and its intrinsic generation capability.
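As the summary describes it, the IFD score compares how hard the response is for the model to generate with and without the instruction; below is a minimal sketch under that reading, assuming the two per-example cross-entropy losses are already available.

```python
def instruction_following_difficulty(loss_answer_given_instruction: float,
                                     loss_answer_alone: float) -> float:
    """IFD score: loss on the response conditioned on the instruction divided by
    the loss on the response alone (an assumed reading of the metric).

    A ratio close to 1 means the instruction barely helps the model produce the
    response, marking a difficult, informative sample; a low ratio means the
    response is already easy to generate and carries little training signal.
    """
    return loss_answer_given_instruction / loss_answer_alone

# Example: the instruction lowers the answer loss only slightly, so the sample
# scores high (about 0.91) and would be kept as a "cherry" example.
print(instruction_following_difficulty(2.1, 2.3))
```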
arXiv Detail & Related papers (2023-08-23T09:45:29Z) - Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study [90.34226812493083]
This work aims to investigate the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models.
Our empirical experiments show that these emergent abilities still exist in 4-bit quantization models, while 2-bit models encounter severe performance degradation.
To improve the performance of low-bit models, we conduct two special experiments: (1) fine-grained impact analysis that studies which components (or substructures) are more sensitive to quantization, and (2) performance compensation through model fine-tuning.
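For intuition about the precision levels being compared, the sketch below applies plain round-to-nearest symmetric quantization to a weight matrix at a chosen bit width; the actual study relies on established LLM quantization methods, so this only illustrates how reconstruction error grows as the bit width shrinks.

```python
import numpy as np

def quantize_dequantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric round-to-nearest quantization to `bits` bits, then dequantization.

    4-bit quantization keeps 15 signed levels and usually modest error;
    2-bit quantization leaves only 3 levels, which is consistent with the
    severe degradation observed at that precision.
    """
    qmax = 2 ** (bits - 1) - 1                  # 7 for 4-bit, 1 for 2-bit
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale

w = np.random.randn(4, 4).astype(np.float32)
for b in (4, 2):
    err = np.abs(w - quantize_dequantize(w, b)).mean()
    print(f"{b}-bit mean reconstruction error: {err:.4f}")
```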
arXiv Detail & Related papers (2023-07-16T15:11:01Z) - Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks [90.11273439036455]
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks.
We propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales from LLMs with augmented knowledge retrieved from an external knowledge base.
We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets.
arXiv Detail & Related papers (2023-05-28T13:00:00Z) - Pre-training Language Model as a Multi-perspective Course Learner [103.17674402415582]
This study proposes a multi-perspective course learning (MCL) method for sample-efficient pre-training, in which three self-supervision courses are designed to alleviate the inherent flaws of "tug-of-war" dynamics.
Our method significantly improves ELECTRA's average performance by 2.8% and 3.2% absolute points respectively on GLUE and SQuAD 2.0 benchmarks.
arXiv Detail & Related papers (2023-05-06T09:02:10Z) - Learning Electron Bunch Distribution along a FEL Beamline by Normalising Flows [48.236222741059834]
We introduce a surrogate model based on normalising flows for conditional phase-space representation of electron clouds in a FEL beamline.
The achieved results allow us to discuss the benefits and limitations of exploiting such models to gain a deeper understanding of fundamental processes within a beamline.
arXiv Detail & Related papers (2023-02-27T15:21:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.