Related papers: Learning Progression-Guided AI Evaluation of Scientific Models To Support Diverse Multi-Modal Understanding in NGSS Classroom

Learning Progression-Guided AI Evaluation of Scientific Models To Support Diverse Multi-Modal Understanding in NGSS Classroom

URL: http://arxiv.org/abs/2509.18157v1
Date: Tue, 16 Sep 2025 22:12:15 GMT
Title: Learning Progression-Guided AI Evaluation of Scientific Models To Support Diverse Multi-Modal Understanding in NGSS Classroom
Authors: Leonora Kaldaras, Tingting Li, Prudence Djagba, Kevin Haudek, Joseph Krajcik,
Abstract summary: We build on a validated NGSS-aligned multi-modal LP reflecting diverse ways of modeling and explaining electrostatic phenomena.<n>We show how LP guides the design of personalized ML-driven feedback grounded in the diversity of student thinking on both assessment modes.
Score: 2.6572245224872835
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning Progressions (LPs) can help adjust instruction to individual learners needs if the LPs reflect diverse ways of thinking about a construct being measured, and if the LP-aligned assessments meaningfully measure this diversity. The process of doing science is inherently multi-modal with scientists utilizing drawings, writing and other modalities to explain phenomena. Thus, fostering deep science understanding requires supporting students in using multiple modalities when explaining phenomena. We build on a validated NGSS-aligned multi-modal LP reflecting diverse ways of modeling and explaining electrostatic phenomena and associated assessments. We focus on students modeling, an essential practice for building a deep science understanding. Supporting culturally and linguistically diverse students in building modeling skills provides them with an alternative mode of communicating their understanding, essential for equitable science assessment. Machine learning (ML) has been used to score open-ended modeling tasks (e.g., drawings), and short text-based constructed scientific explanations, both of which are time- consuming to score. We use ML to evaluate LP-aligned scientific models and the accompanying short text-based explanations reflecting multi-modal understanding of electrical interactions in high school Physical Science. We show how LP guides the design of personalized ML-driven feedback grounded in the diversity of student thinking on both assessment modes.

Related papers

Simulating Students with Large Language Models: A Review of Architecture, Mechanisms, and Role Modelling in Education with Generative AI [0.8703455323398351]
Review of studies using large language models (LLMs) to simulate student behaviour across educational environments.<n>Wee current evidence on the capacity of LLM-based agents to emulate learner archetypes, respond to instructional inputs, and interact within multi-agent classroom scenarios.<n>We examine the implications of such systems for curriculum development, instructional evaluation, and teacher training.
arXiv Detail & Related papers (2025-11-08T17:23:13Z)
Generative Large Language Models for Knowledge Representation: A Systematic Review of Concept Map Generation [1.163826615891678]
The rise of generative large language models (LLMs) has opened new opportunities for automating knowledge representation through concept maps.<n>This review systematically synthesizes the emerging body of research on LLM-enabled concept map generation.<n>Findings reveal six major methodological categories: human-in-the-loop systems, weakly supervised learning models, fine-tuned domain-specific LLMs, pre-trained LLMs with prompt engineering, hybrid systems integrating knowledge bases, and modular frameworks combining symbolic and statistical tools.
arXiv Detail & Related papers (2025-09-18T02:36:54Z)
Automated Feedback on Student-Generated UML and ER Diagrams Using Large Language Models [39.58317527488534]
We introduce DUET (Diamatic & ER Tutor), a prototype of an LLM-based tool.<n>It converts a reference diagram and a student-submitted diagram into a textual representation and provides structured feedback based on the differences.<n>It uses a multi-stage LLM pipeline to compare diagrams and generate reflective feedback.<n>It enables analytical insights for educators, aiming to foster self-directed learning and inform instructional strategies.
arXiv Detail & Related papers (2025-07-31T11:49:01Z)
Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study [50.065744358362345]
Large language models (LLMs) have shown impressive capabilities across tasks such as mathematics, coding, and reasoning.<n>Yet their learning ability, which is crucial for adapting to dynamic environments and acquiring new knowledge, remains underexplored.
arXiv Detail & Related papers (2025-06-16T13:24:50Z)
MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science [62.96434290874878]
Current Multi-Modal Large Language Models (MLLM) have shown strong capabilities in general visual reasoning tasks.<n>We develop a new framework, named Multi-Modal Scientific Reasoning with Physics Perception and Simulation (MAPS) based on an MLLM.<n>MAPS decomposes expert-level multi-modal reasoning task into physical diagram understanding via a Physical Perception Model (PPM) and reasoning with physical knowledge via a simulator.
arXiv Detail & Related papers (2025-01-18T13:54:00Z)
Large Physics Models: Towards a collaborative approach with Large Language Models and Foundation Models [8.320153035338418]
This paper explores ideas and provides a potential roadmap for the development and evaluation of physics-specific large-scale AI models.<n>These models, based on foundation models such as Large Language Models (LLMs) are tailored to address the demands of physics research.
arXiv Detail & Related papers (2025-01-09T17:11:22Z)
PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, a framework for better data construction and model tuning.<n>For insufficient data usage, we incorporate strategies such as Chain-of-Thought prompting and anti-induction.<n>For rigid behavior patterns, we design the tuning process and introduce automated DPO to enhance the specificity and dynamism of the models' personalities.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
Evaluating Large Language Models with Psychometrics [59.821829073478376]
This paper offers a comprehensive benchmark for quantifying psychological constructs of Large Language Models (LLMs)<n>Our work identifies five key psychological constructs -- personality, values, emotional intelligence, theory of mind, and self-efficacy -- assessed through a suite of 13 datasets.<n>We uncover significant discrepancies between LLMs' self-reported traits and their response patterns in real-world scenarios, revealing complexities in their behaviors.
arXiv Detail & Related papers (2024-06-25T16:09:08Z)
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery [141.39722070734737]
We propose to enhance the knowledge-driven, abstract reasoning abilities of Large Language Models with the computational strength of simulations. We introduce Scientific Generative Agent (SGA), a bilevel optimization framework. We conduct experiments to demonstrate our framework's efficacy in law discovery and molecular design.
arXiv Detail & Related papers (2024-05-16T03:04:10Z)
A quantitative analysis of knowledge-learning preferences in large language models in molecular science [24.80165173525286]
Large language models (LLMs) introduce a fresh research paradigm to tackle scientific problems from a natural language processing (NLP) perspective.<n>LLMs significantly enhance our understanding and generation of molecules, often surpassing existing methods with their capabilities to decode and synthesize complex molecular patterns.<n>We propose a multi-modal benchmark, named ChEBI-20-MM, and perform 1263 experiments to assess the model's compatibility with data modalities and knowledge acquisition.
arXiv Detail & Related papers (2024-02-06T16:12:36Z)
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models [57.96527452844273]
We introduce SciInstruct, a suite of scientific instructions for training scientific language models capable of college-level scientific reasoning. We curated a diverse and high-quality dataset encompassing physics, chemistry, math, and formal proofs. To verify the effectiveness of SciInstruct, we fine-tuned different language models with SciInstruct, i.e., ChatGLM3 (6B and 32B), Llama3-8B-Instruct, and Mistral-7B: MetaMath.
arXiv Detail & Related papers (2024-01-15T20:22:21Z)
Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks. The lack of interpretability due to their black-box'' nature poses challenges for responsible implementation. We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides [57.86931911522967]
We test the capabilities of machine learning models in multimodal understanding of educational content. Our dataset contains aligned slides and spoken language, for 180+ hours of video and 9000+ slides, with 10 lecturers from various subjects. We introduce PolyViLT, a multimodal transformer trained with a multi-instance learning loss that is more effective than current approaches.
arXiv Detail & Related papers (2022-08-17T05:30:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.