Three Questions Concerning the Use of Large Language Models to
Facilitate Mathematics Learning
- URL: http://arxiv.org/abs/2310.13615v1
- Date: Fri, 20 Oct 2023 16:05:35 GMT
- Title: Three Questions Concerning the Use of Large Language Models to
Facilitate Mathematics Learning
- Authors: An-Zi Yen and Wei-Ling Hsu
- Abstract summary: We discuss the challenges associated with employing large language models to enhance students' mathematical problem-solving skills.
LLMs can generate incorrect reasoning processes and have difficulty understanding the rationales of given questions when attempting to correct students' answers.
- Score: 4.376598435975689
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the remarkable language understanding and generation abilities of
large language models (LLMs), their use in educational applications has been
explored. However, little work has investigated the pedagogical ability of
LLMs in helping students learn mathematics. In this position
paper, we discuss the challenges associated with employing LLMs to enhance
students' mathematical problem-solving skills by providing adaptive feedback.
Apart from generating incorrect reasoning processes, LLMs can misinterpret the
meaning of a question and have difficulty understanding a question's rationale
when attempting to correct students' answers. Three
research questions are formulated.
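To make the feedback setting concrete, the following is a minimal sketch of prompting an LLM for adaptive feedback on a student's answer. It illustrates the task only and is not the authors' system; the prompt wording and the `call_llm` helper are assumptions.

```python
# Hedged sketch of the adaptive-feedback setting discussed above. The prompt
# wording and the `call_llm` helper are illustrative assumptions, not the
# authors' system.

def call_llm(prompt: str) -> str:
    """Placeholder: route the prompt to any LLM client and return its reply."""
    raise NotImplementedError("wire up an actual LLM client here")

def adaptive_feedback(question: str, reference_steps: str, student_answer: str) -> str:
    prompt = (
        "You are a patient math tutor.\n"
        f"Problem: {question}\n"
        f"Reference solution steps: {reference_steps}\n"
        f"Student's answer: {student_answer}\n"
        "Verify the reference steps yourself first, then point out the "
        "student's error (if any) and give a hint rather than the full answer."
    )
    return call_llm(prompt)
```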
Related papers
- Automate Knowledge Concept Tagging on Math Questions with LLMs [48.5585921817745]
Knowledge concept tagging for questions plays a crucial role in contemporary intelligent educational applications.
Traditionally, these annotations have been conducted manually with help from pedagogical experts.
In this paper, we explore automating the tagging task using Large Language Models (LLMs).
arXiv Detail & Related papers (2024-03-26T00:09:38Z)
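A minimal sketch of how such prompt-based tagging might look, assuming a fixed concept taxonomy and a generic `call_llm` helper (both are illustrative assumptions, not the authors' setup):

```python
# Illustrative prompt-based concept tagging. TAXONOMY and call_llm are
# assumptions made for this sketch, not the paper's actual pipeline.

TAXONOMY = ["fractions", "linear equations", "ratios", "area of polygons"]

def call_llm(prompt: str) -> str:  # placeholder LLM client
    raise NotImplementedError

def tag_concepts(question: str) -> list[str]:
    prompt = (
        "Tag the math question with every applicable concept from this "
        f"comma-separated list: {', '.join(TAXONOMY)}\n"
        f"Question: {question}\nLabels:"
    )
    reply = call_llm(prompt)
    # Keep only labels that actually belong to the taxonomy.
    return [t.strip() for t in reply.split(",") if t.strip() in TAXONOMY]
```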
- GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers [68.77382332826167]
Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.
One essential and frequently observed phenomenon is that when math questions are slightly changed, LLMs can behave incorrectly.
This motivates us to evaluate the robustness of LLMs' math reasoning capability by testing a wide range of question variations.
arXiv Detail & Related papers (2024-02-29T15:26:14Z)
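As a toy illustration of one such variation type, the sketch below perturbs the numbers in a word problem; GSM-Plus itself covers a much wider range of variations (distractor insertion, rephrasing, and more), so this shows only the simplest case.

```python
# Toy numeric perturbation, one of many variation types; GSM-Plus itself is
# far broader (distractors, rephrasing, etc.). Standard library only.

import random
import re

def perturb_numbers(problem: str, rng: random.Random) -> str:
    """Replace every integer in the problem with a nearby different value."""
    def repl(match: re.Match) -> str:
        return str(int(match.group()) + rng.choice([1, 2, 3]))
    return re.sub(r"\d+", repl, problem)

rng = random.Random(0)
base = "Ann has 12 apples and buys 5 more. How many apples does she have?"
print(perturb_numbers(base, rng))
# A robust solver should answer both the base and the perturbed version.
```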
- Adversarial Math Word Problem Generation [6.92510069380188]
We propose a new paradigm for ensuring fair evaluation of large language models (LLMs).
We generate adversarial examples which preserve the structure and difficulty of the original questions intended for assessment, but are unsolvable by LLMs.
We conduct experiments on various open- and closed-source LLMs, quantitatively and qualitatively demonstrating that our method significantly degrades their math problem-solving ability.
arXiv Detail & Related papers (2024-02-27T22:07:52Z)
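A crude sketch of the filtering idea, reduced to its simplest form: keep candidate variants that a ground-truth oracle can still score but the target model gets wrong. `solve_with_llm` and `true_answer` are hypothetical helpers; the paper's generation method is more principled about preserving structure and difficulty.

```python
# Crude sketch of the filtering idea only: keep variants the target model
# answers incorrectly while a ground-truth oracle still knows the answer.
# solve_with_llm and true_answer are hypothetical placeholder helpers.

def solve_with_llm(problem: str) -> str:  # placeholder target model
    raise NotImplementedError

def true_answer(problem: str) -> str:  # placeholder ground-truth oracle
    raise NotImplementedError

def filter_adversarial(variants: list[str]) -> list[str]:
    kept = []
    for v in variants:
        if solve_with_llm(v).strip() != true_answer(v).strip():
            kept.append(v)  # model fails: candidate adversarial item
    return kept
```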
- Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? [140.9751389452011]
We study the biases of large language models (LLMs) in relation to those known in children when solving arithmetic word problems.
We generate a novel set of word problems for each of these biases, using a neuro-symbolic approach that enables fine-grained control over the problem features.
arXiv Detail & Related papers (2024-01-31T18:48:20Z)
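That fine-grained control can be pictured with a toy symbolic template whose slots (numbers, entities) are set explicitly, so every generated problem has a known answer. This is only a sketch of the idea, not the paper's neuro-symbolic pipeline.

```python
# Toy symbolic template with explicitly controlled slots, so each generated
# problem has a known answer. Only a sketch of "fine-grained control".

import random

TEMPLATE = "{name} has {a} {obj}. {pro} gets {b} more. How many {obj} does {name} have now?"

def make_problem(a: int, b: int, rng: random.Random) -> tuple[str, int]:
    name, pro = rng.choice([("Ana", "She"), ("Ben", "He")])
    obj = rng.choice(["marbles", "stickers"])
    text = TEMPLATE.format(name=name, a=a, obj=obj, pro=pro, b=b)
    return text, a + b  # problem text and its ground-truth answer

rng = random.Random(7)
print(make_problem(3, 4, rng))
```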
- Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges [60.62904929065257]
Large language models (LLMs) offer the possibility of resolving this issue by comprehending individual requests.
This paper reviews the recently emerged LLM research related to educational capabilities, including mathematics, writing, programming, reasoning, and knowledge-based question answering.
arXiv Detail & Related papers (2023-12-27T14:37:32Z)
- Democratizing Reasoning Ability: Tailored Learning from Large Language Model [97.4921006089966]
We propose a tailored learning approach to distill such reasoning ability into smaller LMs.
We exploit the potential of an LLM as a reasoning teacher by building an interactive multi-round learning paradigm.
To exploit the reasoning potential of the smaller LM, we propose self-reflection learning to motivate the student to learn from its own mistakes.
arXiv Detail & Related papers (2023-10-20T07:50:10Z)
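A hedged sketch of what one interactive round might look like: the student LM attempts a problem, the teacher LLM critiques it, and the exchange is kept as distillation data. Both model calls are placeholders and the prompt wording is assumed.

```python
# Hedged sketch of one interactive round: the student LM attempts, the
# teacher LLM critiques, and the exchange is kept as distillation data.
# Both model calls are placeholders; prompt wording is assumed.

def student_lm(prompt: str) -> str:  # placeholder small model
    raise NotImplementedError

def teacher_llm(prompt: str) -> str:  # placeholder large model
    raise NotImplementedError

def collect_rounds(question: str, rounds: int = 2) -> list[dict]:
    data = []
    attempt = student_lm(f"Solve step by step: {question}")
    for _ in range(rounds):
        feedback = teacher_llm(
            f"Question: {question}\nStudent attempt: {attempt}\n"
            "Point out the mistake and show a corrected reasoning chain."
        )
        data.append({"question": question, "attempt": attempt, "feedback": feedback})
        attempt = student_lm(
            f"Try again using this feedback: {feedback}\nQuestion: {question}"
        )
    return data
```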
- Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions [28.759189115877028]
We propose novel evaluations for mathematical reasoning capabilities of Large Language Models (LLMs) based on mathematical misconceptions.
Our primary approach is to simulate LLMs as a novice learner and an expert tutor, aiming to identify the incorrect answer to a math question that results from a specific misconception.
arXiv Detail & Related papers (2023-10-03T21:19:50Z)
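A minimal sketch of the novice-learner side of this simulation, assuming a generic `call_llm` helper and illustrative prompt wording:

```python
# Minimal sketch of the novice-learner role: ask the model which wrong
# answer a stated misconception would produce. Prompt wording and call_llm
# are assumptions for illustration.

def call_llm(prompt: str) -> str:  # placeholder LLM client
    raise NotImplementedError

def novice_answer(question: str, misconception: str) -> str:
    prompt = (
        f"Role-play a student who holds this misconception: {misconception}\n"
        f"Question: {question}\n"
        "Give the incorrect answer that misconception would lead to, with "
        "the flawed reasoning spelled out."
    )
    return call_llm(prompt)
```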
- Evaluating Language Models for Mathematics through Interactions [116.67206980096513]
We introduce CheckMate, a prototype platform for humans to interact with and evaluate large language models (LLMs).
We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics.
We derive a taxonomy of human behaviours and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness.
arXiv Detail & Related papers (2023-06-02T17:12:25Z)
- Automatic Generation of Socratic Subquestions for Teaching Math Word Problems [16.97827669744673]
We explore the ability of large language models (LMs) to generate sequential questions for guiding math word problem-solving.
On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions.
Results suggest that the difficulty level of problems plays an important role in determining whether questioning improves or hinders human performance.
arXiv Detail & Related papers (2022-11-23T10:40:22Z)
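A small sketch of generating such guiding subquestions with a prompt; the desirable question properties from the paper are reduced here to plain-language instructions, and `call_llm` is a placeholder.

```python
# Sketch of prompting for guiding subquestions instead of a direct solution.
# The paper's "desirable question properties" are reduced here to plain
# instructions; call_llm is a placeholder.

def call_llm(prompt: str) -> str:  # placeholder LLM client
    raise NotImplementedError

def socratic_subquestions(problem: str, n: int = 3) -> list[str]:
    prompt = (
        f"Write {n} short guiding questions, one per line, that lead a "
        "student through this word problem step by step without revealing "
        "the final answer.\n"
        f"Problem: {problem}"
    )
    return [q.strip() for q in call_llm(prompt).splitlines() if q.strip()]
```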
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.