A Guide to Large Language Models in Modeling and Simulation: From Core Techniques to Critical Challenges
- URL: http://arxiv.org/abs/2602.05883v1
- Date: Thu, 05 Feb 2026 17:00:07 GMT
- Title: A Guide to Large Language Models in Modeling and Simulation: From Core Techniques to Critical Challenges
- Authors: Philippe J. Giabbanelli
- Abstract summary: We aim to provide comprehensive and practical guidance on how to use large language models (LLMs). We discuss common sources of confusion, including non-determinism, knowledge augmentation, and decomposition of M&S data. We emphasize principled design choices, diagnostic strategies, and empirical evaluation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) have rapidly become familiar tools to researchers and practitioners. Concepts such as prompting, temperature, or few-shot examples are now widely recognized, and LLMs are increasingly used in Modeling & Simulation (M&S) workflows. However, practices that appear straightforward may introduce subtle issues, add unnecessary complexity, or even lead to inferior results. Adding more data can backfire (e.g., deteriorating performance through model collapse or inadvertently wiping out existing guardrails); spending time on fine-tuning a model can be unnecessary without a prior assessment of what it already knows; setting the temperature to 0 is not sufficient to make LLMs deterministic; and providing a large volume of M&S data as input can be excessive (LLMs cannot attend to everything), while naive simplifications can lose information. We aim to provide comprehensive and practical guidance on how to use LLMs, with an emphasis on M&S applications. We discuss common sources of confusion, including non-determinism, knowledge augmentation (including RAG and LoRA), decomposition of M&S data, and hyper-parameter settings. We emphasize principled design choices, diagnostic strategies, and empirical evaluation, with the goal of helping modelers make informed decisions about when, how, and whether to rely on LLMs.
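The abstract's claim that setting the temperature to 0 is not sufficient for determinism can be illustrated with a minimal sketch (the function and values below are illustrative, not from the paper): at temperature 0, sampling collapses to greedy argmax over the logits, yet the logits themselves are produced by floating-point arithmetic whose non-associativity can flip near-ties across runs, batch sizes, or hardware.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Temperature-scaled sampling over a list of logits.

    temperature == 0 falls back to greedy argmax, which is deterministic
    only up to floating-point ties in the upstream logit computation.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

rng = random.Random(0)
logits = [2.0, 1.0, 0.5]
assert sample_with_temperature(logits, 0.0, rng) == 0  # greedy pick

# Why greedy decoding is still not fully deterministic in practice:
# floating-point addition is not associative, so the same mathematical
# sum can round differently depending on reduction order (e.g., when
# batch size or kernel scheduling changes), flipping near-tied logits.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)  # → False: same math, different rounding
```

This toy decoder only shows the mechanism; in a real serving stack, additional sources of non-determinism (mixture-of-experts routing, dynamic batching, GPU kernel selection) compound the floating-point effect.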
Related papers
- LLM Enhancement with Domain Expert Mental Model to Reduce LLM Hallucination with Causal Prompt Engineering [0.3437656066916039]
We propose a technology based on optimized human-machine dialogue and monotone Boolean and k-valued functions to discover a computationally tractable personal expert mental model (EMM) of decision-making. Our EMM algorithm for LLM prompt engineering has four steps: (1) factor identification, (2) hierarchical structuring of factors, (3) generating a generalized expert mental model specification, and (4) generating a detailed generalized expert mental model from that specification.
arXiv Detail & Related papers (2025-09-13T14:35:51Z)
- Language Models Coupled with Metacognition Can Outperform Reasoning Models [32.32646975975768]
Large language models (LLMs) excel in speed and adaptability across various reasoning tasks. Large reasoning models (LRMs) are specifically designed for complex, step-by-step reasoning. SOFAI-LM coordinates a fast LLM with a slower but more powerful LRM through metacognition.
arXiv Detail & Related papers (2025-08-25T12:19:57Z)
- Applying Large Language Models to Travel Satisfaction Analysis [2.5105418815378555]
This study uses household survey data collected in Shanghai to identify the existence and source of misalignment between Large Language Models (LLMs) and humans. LLMs have strong capabilities in contextual understanding and generalization, significantly reducing dependence on task-specific data. We propose an LLM-based modeling approach that can be applied to model travel behavior with small sample sizes.
arXiv Detail & Related papers (2025-05-29T09:11:58Z)
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
- The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities [51.594836904623534]
We investigate whether instruction-tuned models possess fundamentally different capabilities from base models that are prompted using in-context examples. We show that the performance of instruction-tuned models is significantly correlated with the in-context performance of their base counterparts. Specifically, we extend this understanding to instruction-tuned models, suggesting that their pretraining data similarly sets a limiting boundary on the tasks they can solve.
arXiv Detail & Related papers (2025-01-15T10:57:55Z)
- Detecting LLM Hallucination Through Layer-wise Information Deficiency: Analysis of Ambiguous Prompts and Unanswerable Questions [60.31496362993982]
Large language models (LLMs) frequently generate confident yet inaccurate responses. We present a novel, test-time approach to detecting model hallucination through systematic analysis of information flow.
arXiv Detail & Related papers (2024-12-13T16:14:49Z)
- Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data [54.934578742209716]
In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets. LLKD is an adaptive sample selection method that incorporates signals from both the teacher and student. Our comprehensive experiments show that LLKD achieves superior performance across various datasets with higher data efficiency.
arXiv Detail & Related papers (2024-11-12T18:57:59Z)
- Large Language Models Must Be Taught to Know What They Don't Know [97.90008709512921]
We show that fine-tuning on a small dataset of correct and incorrect answers can create an uncertainty estimate with good generalization and small computational overhead. We also investigate the mechanisms that enable reliable uncertainty estimation, finding that many models can be used as general-purpose uncertainty estimators.
arXiv Detail & Related papers (2024-06-12T16:41:31Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains. In this paper, we describe how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.