Opinionated practices for teaching reproducibility: motivation, guided instruction and practice
- URL: http://arxiv.org/abs/2109.13656v2
- Date: Wed, 16 Feb 2022 03:21:17 GMT
- Title: Opinionated practices for teaching reproducibility: motivation, guided instruction and practice
- Authors: Joel Ostblom, Tiffany Timbers
- Abstract summary: Predictive modelling is often one of the most interesting topics to novices in data science, while reproducibility is not. Students are not as intrinsically motivated to learn reproducibility, and it is not an easy topic for them to learn. Providing extra motivation, guided instruction and lots of practice are key to teaching it effectively.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the data science courses at the University of British Columbia, we define
data science as the study, development and practice of reproducible and
auditable processes to obtain insight from data. While reproducibility is core
to our definition, most data science learners enter the field with other
aspects of data science in mind, for example predictive modelling, which is
often one of the most interesting topics to novices. This fact, along with the
highly technical nature of the industry-standard reproducibility tools
currently employed in data science, presents out-of-the-gate challenges in
teaching reproducibility in the data science classroom. Put simply, students
are not as intrinsically motivated to learn this topic, and it is not an easy
one for them to learn. What can a data science educator do? Over several
iterations of teaching courses focused on reproducible data science tools and
workflows, we have found that providing extra motivation, guided instruction
and lots of practice are key to effectively teaching this challenging, yet
important subject. Here we present examples of how we deeply motivate,
effectively guide and provide ample practice opportunities to data science
students to effectively engage them in learning about this topic.
Related papers
- The Future of Data Science Education [0.11566458078238004]
The School of Data Science at the University of Virginia has developed a novel model for the definition of Data Science.
This paper will present the core features of the model and explain how it unifies various concepts going far beyond the analytics component of AI.
arXiv Detail & Related papers (2024-07-16T15:11:54Z)
- Explainable Few-shot Knowledge Tracing [48.877979333221326]
We propose a cognition-guided framework that can track the student knowledge from a few student records while providing natural language explanations.
Experimental results from three widely used datasets show that LLMs can perform comparable or superior to competitive deep knowledge tracing methods.
arXiv Detail & Related papers (2024-05-23T10:07:21Z)
- EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training [79.96741042766524]
We reformulate the training curriculum as a soft-selection function.
We show that exposing the contents of natural images can be readily achieved by the intensity of data augmentation.
The resulting method, EfficientTrain++, is simple, general, yet surprisingly effective.
arXiv Detail & Related papers (2024-05-14T17:00:43Z)
- Data Science Education in Undergraduate Physics: Lessons Learned from a Community of Practice [0.6597195879147557]
We present insights and experiences from the Data Science Education Community of Practice (DSECOP)
DSECOP brings together graduate students and physics educators from different institutions to share best practices and lessons learned from integrating data science into undergraduate physics education.
Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.
arXiv Detail & Related papers (2024-03-01T20:21:42Z)
- Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data [1.8692054990918079]
We review three 'cheap' techniques that have been developed in recent years: weak supervision, transfer learning and prompt engineering.
For the latter, we review the particular case of zero-shot prompting of large language models.
We show good performance for all techniques, and in particular we demonstrate how prompting of large language models can achieve high accuracy at very low cost.
arXiv Detail & Related papers (2024-01-22T19:00:11Z)
- Character comes from practice: longitudinal practice-based ethics training in data science [0.5439020425818999]
We discuss how the goal of RCR training is to foster the cultivation of certain moral abilities.
While the ideal is the cultivation of virtues, the limited space allowed by RCR modules can only facilitate the cultivation of superficial abilities.
Third, we operationalize our approach by stressing that (proto-)virtue acquisition occurs through the technical and social tasks of daily data science activities.
arXiv Detail & Related papers (2024-01-09T09:37:44Z)
- Reinforcement Learning from Passive Data via Latent Intentions [86.4969514480008]
We show that passive data can still be used to learn features that accelerate downstream RL.
Our approach learns from passive data by modeling intentions.
Our experiments demonstrate the ability to learn from many forms of passive data, including cross-embodiment video data and YouTube videos.
arXiv Detail & Related papers (2023-04-10T17:59:05Z)
- Teaching Visual Accessibility in Introductory Data Science Classes with Multi-Modal Data Representations [0.0]
We argue that instructors need to teach multiple data representation methods so that all students can produce data products that are more accessible.
As data science educators who teach accessibility as part of our lower-division courses, we share specific examples that can be utilized by other data science instructors.
arXiv Detail & Related papers (2022-08-04T10:20:10Z)
- Understanding the World Through Action [91.3755431537592]
I will argue that a general, principled, and powerful framework for utilizing unlabeled data can be derived from reinforcement learning.
I will discuss how such a procedure is more closely aligned with potential downstream tasks.
arXiv Detail & Related papers (2021-10-24T22:33:52Z)
- Self-Supervised Representation Learning: Introduction, Advances and Challenges [125.38214493654534]
Self-supervised representation learning methods aim to provide powerful deep feature learning without the requirement of large annotated datasets.
This article introduces this vibrant area including key concepts, the four main families of approach and associated state of the art, and how self-supervised methods are applied to diverse modalities of data.
arXiv Detail & Related papers (2021-10-18T13:51:22Z)
- Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria that quantify the interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.