AI and Machine Learning for Next Generation Science Assessments
- URL: http://arxiv.org/abs/2405.06660v1
- Date: Tue, 23 Apr 2024 01:39:20 GMT
- Title: AI and Machine Learning for Next Generation Science Assessments
- Authors: Xiaoming Zhai
- Abstract summary: This chapter focuses on the transformative role of Artificial Intelligence (AI) and Machine Learning (ML) in science assessments.
The paper begins with a discussion of the Framework for K-12 Science Education, which calls for a shift from conceptual learning to knowledge-in-use.
The paper achieves three major goals: reviewing the current state of ML-based assessments in science education, introducing a framework for scoring accuracy in ML-based automatic assessments, and discussing future directions and challenges.
- Score: 0.7416846035207727
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This chapter focuses on the transformative role of Artificial Intelligence (AI) and Machine Learning (ML) in science assessments. The paper begins with a discussion of the Framework for K-12 Science Education, which calls for a shift from conceptual learning to knowledge-in-use. This shift necessitates the development of new types of assessments that align with the Framework's three dimensions: science and engineering practices, disciplinary core ideas, and crosscutting concepts. The paper further highlights the limitations of traditional assessment methods like multiple-choice questions, which often fail to capture the complexities of scientific thinking and three-dimensional learning in science. It emphasizes the need for performance-based assessments that require students to engage in scientific practices like modeling, explanation, and argumentation. The paper achieves three major goals: reviewing the current state of ML-based assessments in science education, introducing a framework for scoring accuracy in ML-based automatic assessments, and discussing future directions and challenges. It delves into the evolution of ML-based automatic scoring systems, discussing various types of ML, like supervised, unsupervised, and semi-supervised learning. These systems can provide timely and objective feedback, thus alleviating the burden on teachers. The paper concludes by exploring pre-trained models like BERT and finetuned ChatGPT, which have shown promise in assessing students' written responses effectively.
Related papers
- MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs [97.94579295913606]
Multimodal Large Language Models (MLLMs) have garnered increased attention from both industry and academia.
In the development process, evaluation is critical since it provides intuitive feedback and guidance on improving models.
This work aims to offer researchers an easy grasp of how to effectively evaluate MLLMs according to different needs and to inspire better evaluation methods.
arXiv Detail & Related papers (2024-11-22T18:59:54Z)
- Good Idea or Not, Representation of LLM Could Tell [86.36317971482755]
We focus on idea assessment, which aims to leverage the knowledge of large language models to assess the merit of scientific ideas.
We release a benchmark dataset from nearly four thousand manuscript papers with full texts, meticulously designed to train and evaluate the performance of different approaches to this task.
Our findings suggest that the representations of large language models hold more potential in quantifying the value of ideas than their generative outputs.
arXiv Detail & Related papers (2024-09-07T02:07:22Z)
- Recent Advances on Machine Learning for Computational Fluid Dynamics: A Survey [51.87875066383221]
This paper introduces fundamental concepts, traditional methods, and benchmark datasets, then examines the various roles Machine Learning plays in improving CFD.
We highlight real-world applications of ML for CFD in critical scientific and engineering disciplines, including aerodynamics, combustion, atmosphere & ocean science, biology fluid, plasma, symbolic regression, and reduced order modeling.
We draw the conclusion that ML is poised to significantly transform CFD research by enhancing simulation accuracy, reducing computational time, and enabling more complex analyses of fluid dynamics.
arXiv Detail & Related papers (2024-08-22T07:33:11Z)
- RelevAI-Reviewer: A Benchmark on AI Reviewers for Survey Paper Relevance [0.8089605035945486]
We propose RelevAI-Reviewer, an automatic system that conceptualizes the task of survey paper review as a classification problem.
We introduce a novel dataset comprised of 25,164 instances. Each instance contains one prompt and four candidate papers, each varying in relevance to the prompt.
We develop a machine learning (ML) model capable of determining the relevance of each paper and identifying the most pertinent one.
arXiv Detail & Related papers (2024-06-13T06:42:32Z)
- LOVA3: Learning to Visual Question Answering, Asking and Assessment [61.51687164769517]
Question answering, asking, and assessment are three innate human traits crucial for understanding the world and acquiring knowledge.
Current Multimodal Large Language Models (MLLMs) primarily focus on question answering, often neglecting the full potential of questioning and assessment skills.
We introduce LOVA3, an innovative framework named "Learning tO Visual question Answering, Asking and Assessment".
arXiv Detail & Related papers (2024-05-23T18:21:59Z)
- A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science [3.124884279860061]
Our study focuses on employing GPT-4 for automated assessment in middle school Earth Science.
A systematic analysis of our method's pros and cons sheds light on the potential for human-in-the-loop techniques to enhance automated grading.
arXiv Detail & Related papers (2024-03-21T17:09:08Z)
- MachineLearnAthon: An Action-Oriented Machine Learning Didactic Concept [34.6229719907685]
This paper introduces the MachineLearnAthon format, an innovative didactic concept designed to be inclusive for students of different disciplines.
At the heart of the concept lie ML challenges, which make use of industrial data sets to solve real-world problems.
These cover the entire ML pipeline, promoting data literacy and practical skills, from data preparation, through deployment, to evaluation.
arXiv Detail & Related papers (2024-01-29T16:50:32Z)
- Automatic assessment of text-based responses in post-secondary education: A systematic review [0.0]
There is immense potential to automate rapid assessment and feedback of text-based responses in education.
To understand how text-based automatic assessment systems have been developed and applied in education in recent years, three research questions are considered.
This systematic review provides an overview of recent educational applications of text-based assessment systems.
arXiv Detail & Related papers (2023-08-30T17:16:45Z)
- Practical and Ethical Challenges of Large Language Models in Education: A Systematic Scoping Review [5.329514340780243]
Large language models (LLMs) have the potential to automate the laborious process of generating and analysing textual content.
There are concerns regarding the practicality and ethicality of these innovations.
We conducted a systematic scoping review of 118 peer-reviewed papers published since 2017 to pinpoint the current state of research.
arXiv Detail & Related papers (2023-03-17T18:14:46Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Don't Copy the Teacher: Data and Model Challenges in Embodied Dialogue [92.01165203498299]
Embodied dialogue instruction following requires an agent to complete a complex sequence of tasks from a natural language exchange.
This paper argues that imitation learning (IL) and related low-level metrics are actually misleading and do not align with the goals of embodied dialogue research.
arXiv Detail & Related papers (2022-10-10T05:51:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.