Related papers: Representational Alignment Supports Effective Machine Teaching

URL: http://arxiv.org/abs/2406.04302v2
Date: Tue, 04 Feb 2025 13:18:51 GMT
Title: Representational Alignment Supports Effective Machine Teaching
Authors: Ilia Sucholutsky, Katherine M. Collins, Maya Malaviya, Nori Jacoby, Weiyang Liu, Theodore R. Sumers, Michalis Korakakis, Umang Bhatt, Mark Ho, Joshua B. Tenenbaum, Brad Love, Zachary A. Pardos, Adrian Weller, Thomas L. Griffiths
Abstract summary: GRADE is a new controlled experimental setting to study pedagogy and representational alignment. We find that improved representational alignment with a student improves student learning outcomes. However, this effect is moderated by the size and representational diversity of the class being taught.
Score: 81.19197059407121
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A good teacher should not only be knowledgeable, but should also be able to communicate in a way that the student understands -- to share the student's representation of the world. In this work, we introduce a new controlled experimental setting, GRADE, to study pedagogy and representational alignment. We use GRADE through a series of machine-machine and machine-human teaching experiments to characterize a utility curve defining a relationship between representational alignment, teacher expertise, and student learning outcomes. We find that improved representational alignment with a student improves student learning outcomes (i.e., task accuracy), but that this effect is moderated by the size and representational diversity of the class being taught. We use these insights to design a preliminary classroom matching procedure, GRADE-Match, that optimizes the assignment of students to teachers. When designing machine teachers, our results suggest that it is important to focus not only on accuracy, but also on representational alignment with human learners.

Related papers

Automated Visual Attention Detection using Mobile Eye Tracking in Behavioral Classroom Studies [8.576468112095927]
We present an automated processing pipeline concept that requires minimal manually annotated data to recognize which student the teachers focus on. We utilize state-of-the-art face detection models and face recognition feature embeddings to train face recognition models with transfer learning in the classroom context. Our methodology does not require a vast amount of manually annotated data and offers a non-intrusive way of handling teachers' visual attention.
arXiv Detail & Related papers (2025-05-12T13:30:30Z)
Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization [84.86241161706911]
We show that teacher LLMs can indeed intervene on student reasoning to improve their performance. We also demonstrate that in multi-turn interactions, teacher explanations generalize and learn from explained data. We verify that misaligned teachers can lower student performance to random chance by intentionally misleading them.
arXiv Detail & Related papers (2023-06-15T17:27:20Z)
Supervision Complexity and its Role in Knowledge Distillation [65.07910515406209]
We study the generalization behavior of a distilled student. The framework highlights a delicate interplay among the teacher's accuracy, the student's margin with respect to the teacher predictions, and the complexity of the teacher predictions. We demonstrate efficacy of online distillation and validate the theoretical findings on a range of image classification benchmarks and model architectures.
arXiv Detail & Related papers (2023-01-28T16:34:47Z)
Computationally Identifying Funneling and Focusing Questions in Classroom Discourse [24.279653100481863]
We propose the task of computationally detecting funneling and focusing questions in classroom discourse. We release an annotated dataset of 2,348 teacher utterances labeled for funneling and focusing questions, or neither. Our best model, a supervised RoBERTa model fine-tuned on our dataset, has a strong linear correlation of .76 with human expert labels and with positive educational outcomes.
arXiv Detail & Related papers (2022-07-08T01:28:29Z)
Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation [70.92135839545314]
We propose dynamic prior knowledge (DPK), which integrates part of the teacher's features as prior knowledge before feature distillation. Our DPK makes the performance of the student model positively correlated with that of the teacher model, which means that we can further boost the accuracy of students by applying larger teachers.
arXiv Detail & Related papers (2022-06-13T11:52:13Z)
Generalized Knowledge Distillation via Relationship Matching [53.69235109551099]
Knowledge of a well-trained deep neural network (a.k.a. the "teacher") is valuable for learning similar tasks. Knowledge distillation extracts knowledge from the teacher and integrates it with the target model. Instead of enforcing the teacher to work on the same task as the student, we borrow the knowledge from a teacher trained from a general label space.
arXiv Detail & Related papers (2022-05-04T06:49:47Z)
Know Thy Student: Interactive Learning with Gaussian Processes [11.641731210416102]
Our work proposes a simple diagnosis algorithm which uses Gaussian processes for inferring student-related information, before constructing a teaching dataset. We study this in the offline reinforcement learning setting where the teacher must provide demonstrations to the student and avoid sending redundant trajectories. Our experiments highlight the importance of diagnosing before teaching and demonstrate how students can learn more efficiently with the help of an interactive teacher.
arXiv Detail & Related papers (2022-04-26T04:43:57Z)
A teacher-student framework for online correctional learning [12.980296933051509]
We show that the variance of the estimate of the student is reduced with the help of the teacher. We formulate the online problem - where the teacher has to decide at each time instant whether or not to change the observations. We validate the framework in numerical experiments, and compare the optimal online policy with the one from the batch setting.
arXiv Detail & Related papers (2021-11-15T15:01:00Z)
Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency. We propose a gradient-optimization-based teacher-aware learner that can incorporate the teacher's cooperative intention into the likelihood function.
arXiv Detail & Related papers (2021-10-01T00:27:47Z)
Representation Consolidation for Training Expert Students [54.90754502493968]
We show that a multi-head, multi-task distillation method is sufficient to consolidate representations from task-specific teacher(s) and improve downstream performance. Our method can also combine the representational knowledge of multiple teachers trained on one or multiple domains into a single model.
arXiv Detail & Related papers (2021-07-16T17:58:18Z)
The Wits Intelligent Teaching System: Detecting Student Engagement During Lectures Using Convolutional Neural Networks [0.30458514384586394]
The Wits Intelligent Teaching System (WITS) aims to assist lecturers with real-time feedback regarding student affect. A CNN based on AlexNet is successfully trained and significantly outperforms a Support Vector Machine approach.
arXiv Detail & Related papers (2021-05-28T12:59:37Z)
Interactive Knowledge Distillation [79.12866404907506]
We propose an InterActive Knowledge Distillation scheme to leverage the interactive teaching strategy for efficient knowledge distillation. In the distillation process, the interaction between teacher and student networks is implemented by a swapping-in operation. Experiments with typical settings of teacher-student networks demonstrate that the student networks trained by our IAKD achieve better performance than those trained by conventional knowledge distillation methods.
arXiv Detail & Related papers (2020-07-03T03:22:04Z)
Understanding the Power and Limitations of Teaching with Imperfect Knowledge [30.588367257209388]
We study the interaction between a teacher and a student/learner where the teacher selects training examples for the learner to learn a specific task. Inspired by real-world applications of machine teaching in education, we consider the setting where the teacher's knowledge is limited and noisy. We show connections to how imperfect knowledge affects the teacher's solution of the corresponding machine teaching problem when constructing optimal teaching sets.
arXiv Detail & Related papers (2020-03-21T17:53:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.