LearnLM: Improving Gemini for Learning
- URL: http://arxiv.org/abs/2412.16429v2
- Date: Wed, 25 Dec 2024 06:12:22 GMT
- Title: LearnLM: Improving Gemini for Learning
- Authors: LearnLM Team, Abhinit Modi, Aditya Srikanth Veerubhotla, Aliya Rysbek, Andrea Huber, Brett Wiltshire, Brian Veprek, Daniel Gillick, Daniel Kasenberg, Derek Ahmed, Irina Jurenka, James Cohan, Jennifer She, Julia Wilkowski, Kaiz Alarakyia, Kevin R. McKee, Lisa Wang, Markus Kunesch, Mike Schaekermann, Miruna Pîslar, Nikhil Joshi, Parsa Mahmoudieh, Paul Jhun, Sara Wiltberger, Shakir Mohamed, Shashank Agarwal, Shubham Milind Phal, Sun Jae Lee, Theofilos Strinopoulos, Wei-Jen Ko, Amy Wang, Ankit Anand, Avishkar Bhoopchand, Dan Wild, Divya Pandya, Filip Bar, Garth Graham, Holger Winnemoeller, Mahvish Nagda, Prateek Kolhar, Renee Schneider, Shaojian Zhu, Stephanie Chan, Steve Yadlowsky, Viknesh Sounderajah, Yannis Assael,
- Abstract summary: Today's generative AI systems are tuned to present information by default rather than engage users in service of learning as a human tutor would.
We show that training with pedagogical instruction following produces a LearnLM model that is substantially preferred by expert raters across a diverse set of learning scenarios.
- Score: 8.530448114164443
- License:
- Abstract: Today's generative AI systems are tuned to present information by default rather than engage users in service of learning as a human tutor would. To address the wide range of potential education use cases for these systems, we reframe the challenge of injecting pedagogical behavior as one of *pedagogical instruction following*, where training and evaluation examples include system-level instructions describing the specific pedagogy attributes present or desired in subsequent model turns. This framing avoids committing our models to any particular definition of pedagogy, and instead allows teachers or developers to specify desired model behavior. It also clears a path to improving Gemini models for learning -- by enabling the addition of our pedagogical data to post-training mixtures -- alongside their rapidly expanding set of capabilities. Both represent important changes from our initial tech report. We show how training with pedagogical instruction following produces a LearnLM model (available on Google AI Studio) that is preferred substantially by expert raters across a diverse set of learning scenarios, with average preference strengths of 31% over GPT-4o, 11% over Claude 3.5, and 13% over the Gemini 1.5 Pro model LearnLM was based on.
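To make the "pedagogical instruction following" framing concrete, here is a minimal sketch of what a training or evaluation example with a system-level pedagogical instruction might look like. The field names, instruction wording, and dialogue content below are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical example for "pedagogical instruction following": a system-level
# instruction states the pedagogy attributes desired in subsequent tutor turns,
# followed by the conversation the model is trained or evaluated on.
# All field names and text are illustrative assumptions, not from the paper.

example = {
    "system_instruction": (
        "You are a patient tutor. Do not reveal the final answer directly; "
        "ask one guiding question at a time and adapt to the student's level."
    ),
    "turns": [
        {"role": "student", "text": "Why does ice float on water?"},
        {"role": "tutor", "text": "Good question! What do you already know "
                                  "about how density relates to floating?"},
    ],
}

def render_for_training(ex: dict) -> str:
    """Flatten one example into a single training string (format assumed)."""
    lines = [f"[SYSTEM] {ex['system_instruction']}"]
    lines += [f"[{turn['role'].upper()}] {turn['text']}" for turn in ex["turns"]]
    return "\n".join(lines)

if __name__ == "__main__":
    print(render_for_training(example))
```

Keeping the pedagogy specification in the system instruction rather than baked into the model is what lets teachers or developers supply their own definition of good tutoring, as the abstract notes.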
Related papers
- Scalable Early Childhood Reading Performance Prediction [5.413138072912236]
There are no suitable publicly available educational datasets for modeling and predicting future reading performance.
In this work, we introduce the Enhanced Core Reading Instruction (ECRI) dataset.
We leverage the dataset to empirically evaluate the ability of state-of-the-art machine learning models to recognize early childhood educational patterns.
arXiv Detail & Related papers (2024-12-05T18:59:50Z) - Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models [62.5501109475725]
Knowledge distillation (KD) is a technique that compresses large teacher models by training smaller student models to mimic them.
This paper introduces Online Knowledge Distillation (OKD), where the teacher network integrates small online modules to concurrently train with the student model.
OKD matches or exceeds the performance of leading methods across various model architectures and sizes, while reducing training time by up to a factor of four.
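For readers unfamiliar with the base technique, here is a minimal sketch of the standard soft-target knowledge-distillation loss that work in this area builds on; it is the generic textbook formulation, not the paper's OKD setup with online teacher modules, and the temperature and mixing weight are illustrative.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Generic KD objective: KL between softened teacher/student distributions,
    mixed with the usual cross-entropy on hard labels. Shown only to illustrate
    what "training a student to mimic a teacher" means; this is not the OKD
    method summarized above."""
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

# Tiny usage example with random logits standing in for real model outputs.
student = torch.randn(4, 10)
teacher = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student, teacher, labels))
```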
arXiv Detail & Related papers (2024-09-19T07:05:26Z) - Interactive DualChecker for Mitigating Hallucinations in Distilling Large Language Models [7.632217365130212]
Large Language Models (LLMs) have demonstrated exceptional capabilities across various machine learning (ML) tasks.
These models can produce hallucinations, particularly in domains with incomplete knowledge.
We introduce DualChecker, an innovative framework designed to mitigate hallucinations and improve the performance of both teacher and student models.
arXiv Detail & Related papers (2024-08-22T12:04:04Z) - BIPED: Pedagogically Informed Tutoring System for ESL Education [11.209992106075788]
Large Language Models (LLMs) have great potential to serve as readily available and cost-efficient Conversational Intelligent Tutoring Systems (CITS).
Existing CITS are designed to teach only simple concepts or lack the pedagogical depth necessary to address diverse learning strategies.
We construct a BIlingual PEDagogically-informed Tutoring dataset of one-on-one, human-to-human English tutoring interactions.
arXiv Detail & Related papers (2024-06-05T17:49:24Z) - Toward In-Context Teaching: Adapting Examples to Students' Misconceptions [54.82965010592045]
We introduce a suite of models and evaluation methods we call AdapT.
AToM is a new probabilistic model for adaptive teaching that jointly infers students' past beliefs and optimizes for the correctness of future beliefs.
Our results highlight both the difficulty of the adaptive teaching task and the potential of learned adaptive models for solving it.
arXiv Detail & Related papers (2024-05-07T17:05:27Z) - From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning [63.63840740526497]
We investigate how instruction tuning adjusts pre-trained models with a focus on intrinsic changes.
The impact of instruction tuning is then studied by comparing the explanations derived from the pre-trained and instruction-tuned models.
Our findings reveal three significant impacts of instruction tuning.
arXiv Detail & Related papers (2023-09-30T21:16:05Z) - Towards Scalable Adaptive Learning with Graph Neural Networks and Reinforcement Learning [0.0]
We introduce a flexible and scalable approach towards the problem of learning path personalization.
Our model is a sequential recommender system based on a graph neural network.
Our results demonstrate that it can learn to make good recommendations in the small-data regime.
arXiv Detail & Related papers (2023-05-10T18:16:04Z) - Exploring Bayesian Deep Learning for Urgent Instructor Intervention Need in MOOC Forums [58.221459787471254]
Massive Open Online Courses (MOOCs) have become a popular choice for e-learning thanks to their great flexibility.
Due to large numbers of learners and their diverse backgrounds, it is taxing to offer real-time support.
With the large volume of posts and high workloads for MOOC instructors, it is unlikely that the instructors can identify all learners requiring intervention.
This paper explores for the first time Bayesian deep learning on learner-based text posts with two methods: Monte Carlo Dropout and Variational Inference.
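As a rough illustration of the first of the two methods named above, the sketch below shows Monte Carlo Dropout: dropout stays active at inference time, several stochastic forward passes are averaged, and their spread acts as an uncertainty signal for flagging posts that may need instructor attention. The toy classifier and sample count are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Toy post classifier over 128-dim text features; architecture is illustrative.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Dropout(p=0.3),
    nn.Linear(64, 2),  # e.g. "needs intervention" vs. "does not"
)

@torch.no_grad()
def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 20):
    """Keep dropout stochastic and average several forward passes."""
    model.train()  # train() mode keeps dropout layers active at inference time
    probs = torch.stack(
        [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
    )
    return probs.mean(dim=0), probs.std(dim=0)  # predictive mean and spread

mean, uncertainty = mc_dropout_predict(model, torch.randn(8, 128))
print(mean.shape, uncertainty.shape)
```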
arXiv Detail & Related papers (2021-04-26T15:12:13Z) - Teaching to Learn: Sequential Teaching of Agents with Inner States [20.556373950863247]
We introduce a multi-agent formulation in which learners' inner state may change with the teaching interaction.
In order to teach such learners, we propose an optimal control approach that takes the future performance of the learner after teaching into account.
arXiv Detail & Related papers (2020-09-14T07:03:15Z) - Learning to Reweight with Deep Interactions [104.68509759134878]
We propose an improved data reweighting algorithm, in which the student model provides its internal states to the teacher model.
Experiments on image classification with clean/noisy labels and neural machine translation empirically demonstrate that our algorithm makes significant improvement over previous methods.
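For orientation, here is a minimal sketch of the general learned-reweighting idea: a teacher network scores each training example and scales the student's per-example loss. Feeding only the detached per-example loss as the student's "internal state" is a simplifying assumption for illustration; the paper's algorithm uses richer signals and its own update scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Student model plus a teacher that maps a per-example signal to a loss weight.
student = nn.Linear(20, 3)
teacher = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

x = torch.randn(32, 20)            # toy batch, possibly with noisy labels
y = torch.randint(0, 3, (32,))

per_example_loss = F.cross_entropy(student(x), y, reduction="none")

# The teacher sees a simple "internal state" (here just each example's loss,
# detached) and emits a weight in (0, 1) per example.
weights = torch.sigmoid(teacher(per_example_loss.detach().unsqueeze(1))).squeeze(1)

# Student update uses the weighted loss; the teacher itself would be updated in
# a separate step from feedback on the student (omitted in this sketch).
student_loss = (weights.detach() * per_example_loss).mean()
student_loss.backward()
print(float(student_loss))
```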
arXiv Detail & Related papers (2020-07-09T09:06:31Z) - Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience [76.9910678786031]
We propose a novel paradigm of explainable active learning (XAL), by introducing techniques from the recently surging field of explainable AI (XAI) into an Active Learning setting.
Our study shows the benefits of AI explanations as interfaces for machine teaching, such as supporting trust calibration and enabling rich forms of teaching feedback, as well as potential drawbacks, including an anchoring effect on model judgments and added cognitive workload.
arXiv Detail & Related papers (2020-01-24T22:52:18Z)